
Can we reverse engineer Google’s word correction algorithm given a corpus of misspelled words paired with their corrections?

Because my domain name is the single word mischievous, one of the 100 most misspelled English words, I can pull some interesting data out of Google’s Webmaster Tools. I extracted every query within a few edits (Levenshtein distance) of the correct spelling, along with its impressions. There is a nice academic paper, Learning a Spelling Error Model from Search Query Logs, that I plan to use to explore this data further.
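The filtering step can be sketched roughly as follows. This is a minimal illustration, not the script actually used for this post: the function names, the `max_distance` threshold, and the third sample row are assumptions made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def near_misspellings(target, queries, max_distance):
    """Keep (query, impressions) rows within max_distance edits of target.

    Exact matches (distance 0) are dropped because they are not misspellings.
    """
    kept = []
    for query, impressions in queries:
        distance = levenshtein(target, query)
        if 0 < distance <= max_distance:
            kept.append((query, impressions, distance))
    return kept


# Hypothetical export rows: (query, impressions).
# The last row is invented to show an unrelated query being filtered out.
rows = [("mischevious", 4500), ("mischevous", 500), ("social proof", 40)]
print(near_misspellings("mischievous", rows, max_distance=3))
```

The threshold is a judgment call: a larger value catches more creative misspellings but also starts to admit unrelated queries.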

A chart and regression on log-log axes show that the impressions for each misspelling of mischievous, plotted against its rank among all keywords that lead to this blog, follow Zipf’s law. Webmaster Tools only gives a sampled value when impressions are greater than 10, so for words with under 10 impressions (ranks >= 83) I refitted the impressions from their rank.
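A log-log regression of this kind can be sketched as below. This is only an illustration under assumptions: it uses ordinary least squares on the logarithms, the helper names are made up, and the choice of which sampled rows to fit on is a guess rather than the procedure actually used for the chart.

```python
import math


def fit_power_law(points):
    """Least-squares fit of log(impressions) = intercept + slope * log(rank)."""
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(impressions) for _, impressions in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(rank, slope, intercept):
    """Impressions estimated by the fitted power law at a given rank."""
    return math.exp(intercept + slope * math.log(rank))


# (rank, impressions) pairs copied from some of the sampled rows in the table.
sampled = [(1, 27000), (2, 4500), (3, 700), (6, 500), (7, 500),
           (13, 170), (18, 150), (24, 70), (33, 35), (47, 16), (58, 12)]

slope, intercept = fit_power_law(sampled)
print("slope:", round(slope, 2))

# Estimate impressions for low-volume ranks where only the rank is known.
for rank in (83, 101, 734):
    print(rank, round(predict(rank, slope, intercept), 2))
```

A straight line on log-log axes corresponds to a power law, so a negative slope here is the Zipf exponent and the fitted line can be extended to ranks with no reported value.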

mischievous-fvs-r.png

Raw Data

You can use this table to gauge your spelling. (I should add the cumulative distribution so you can see what percentile a misspelling places you in.)

rank query impressions levenshtein similarity
1 mischievous 27000.00 0 1.00
2 mischevious 4500.00 2 0.41
3 mischivious 700.00 2 0.50
6 michevious 500.00 3 0.21
7 mischevous 500.00 1 0.64
13 mischiveous 170.00 2 0.50
18 mischieveous 150.00 1 0.67
19 mischivous 150.00 1 0.64
20 michievous 110.00 1 0.64
21 mischeivious 90.00 3 0.39
23 mischeivous 90.00 2 0.50
24 michevous 70.00 2 0.38
25 mischievious 70.00 1 0.67
26 mischeveous 70.00 2 0.41
29 mischeavious 60.00 3 0.39
30 mischiefous 60.00 1 0.60
31 michivious 60.00 3 0.28
32 mischeavous 50.00 2 0.50
33 mishevious 35.00 3 0.28
35 miscevious 35.00 3 0.35
47 mishievous 16.00 1 0.64
48 michievious 16.00 2 0.41
53 misgevious 12.00 4 0.28
54 micheivious 12.00 4 0.20
55 mischvious 12.00 3 0.44
56 mischiveious 12.00 2 0.47
58 mischevios 12.00 3 0.28
83 mischevius 11.15 2 0.35
101 miscevous 8.30 2 0.57
113 micheavous 7.01 3 0.28
133 mischeives 5.48 4 0.28
140 mischeviuos 5.08 3 0.26
153 mischiefious 4.44 2 0.56
176 mischeous 3.60 2 0.47
196 mechivious 3.06 4 0.21
218 miscievious 2.61 2 0.41
223 mechevious 2.52 4 0.15
241 mischieved 2.24 3 0.53
262 myschevious 1.98 3 0.20
263 misjevious 1.96 4 0.28
273 mischeviouse 1.86 3 0.32
277 machivious 1.82 4 0.21
279 mischeiveous 1.80 3 0.39
282 mischives 1.77 3 0.38
321 mischievous? 1.45 1 1.00
324 miscchievous 1.43 1 0.79
333 mischeifous 1.38 3 0.41
334 mistchivious 1.37 3 0.32
351 miscievous 1.27 1 0.64
357 mischieveious 1.24 2 0.63
363 mishcevious 1.21 3 0.26
371 mischievous  1.17 2 1.00
378 mischievous. 1.14 1 1.00
408 micheveous 1.01 3 0.21
422 mischevoius 0.96 2 0.41
430 mistivious 0.94 4 0.28
438 mischievo 0.91 2 0.69
444 misgivious 0.89 4 0.28
483 michivous 0.79 2 0.38
510 mischievous, 0.72 1 1.00
525 mystivious 0.69 5 0.15
528 myschivious 0.69 3 0.26
543 mis chievous 0.66 1 0.67
603 meschivious 0.56 3 0.26
606 mischievoud 0.56 1 0.71
626 mischeviois 0.53 3 0.26
629 micheavious 0.53 4 0.20
635 mishievious 0.52 2 0.41
661 miscivous 0.49 2 0.47
671 meschevious 0.48 3 0.20
676 miss chivous 0.47 3 0.39
734 mischieves 0.42 2 0.53
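The cumulative distribution mentioned above the table could be computed along these lines. This is a sketch with an assumed helper name, and it uses only the first five rows, so the shares it prints are relative to that subset rather than to all queries.

```python
def cumulative_share(rows):
    """Return (query, cumulative fraction of impressions) for each row.

    rows must be (query, impressions) pairs sorted by descending impressions.
    """
    total = sum(impressions for _, impressions in rows)
    running = 0.0
    shares = []
    for query, impressions in rows:
        running += impressions
        shares.append((query, running / total))
    return shares


# First five rows of the table: (query, impressions).
top_rows = [
    ("mischievous", 27000.0),
    ("mischevious", 4500.0),
    ("mischivious", 700.0),
    ("michevious", 500.0),
    ("mischevous", 500.0),
]

for query, share in cumulative_share(top_rows):
    print(f"{query:12s} {share:.3f}")
```

Reading down the output, the share next to a spelling is the fraction of searchers who used that spelling or a more common one.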

I'm not a big Google+ user, but I may change my ways. I really like the "You shared this" feature and its integration with Google's Author Information in Search Results. When you set everything up properly it leads to "effortless sharing", at least given the latest change to Google's Social Posts in Search Results. If you want to be an influencer among the digerati, it might be time to reevaluate Google+. These results are also transitive: even if someone isn't directly in your circle, if they are in one of your friends' circles you can still influence their search results and possibly take one of the bottom results on the first search page.

A sample of what search results with social posts look like given my circle of friends:

This is a SERP for an article that was "shared by me" because I have a Google+ author profile link on my blog pages. I never had to share this article, but Google can still identify it as "shared by me".

jason_culverhouse_author_profile.png

Here are some friends of mine influencing my search results for very generic search terms, where they would not generally rank on the first page of results:

Wayne Yamamoto for the search terms "social proof". At the time I took the screenshot Wayne had not shared this via Google+, but he still picks up the last result in my SERP.

wayne_yamamoto_social_proof.png

Kevin Leu for the search terms "Silicon Valley". Kevin usually shares everything on Google+ and picks up 2 results on the first page for Silicon Valley when I am logged into search.

kevin_leu_silicon_valley.png

If I am in your circle and you repeat these searches, chances are my friends can influence your search results.

Invest in your Google+ profile. It's like a Facebook feed in every Google search.
