Google Search Algorithm Formula Hypothesis

Sunday, February 3rd, 2008
Posted in Marketing by Joel Gross

From lots of observation, research, reading and experimentation, I believe that I have distilled the very basic formula behind the way that Google displays its search results. Google has publicly stated that its search results depend on over one hundred different factors, and I’m sure that they do, but what they don’t talk about as much is that only a few factors are of primary importance. The consumer-facing internet is made up of two primary parts: infrastructure (links, navigation, site structures) and content (articles, images, video, games, etc.).

When you search for a keyphrase on Google, Google tries to determine what you are looking for and bring you back the most relevant and helpful results. Since Google gets billions of searches, they obviously cannot manually bring back what you are looking for. Instead, Google has developed software that indexes most of the Internet and automatically tries to find what you want. The algorithm cannot “see” what is on the page, but it is pretty good at “reading” the text associated with the page (title tags, meta tags, alt image tags, headers, actual paragraphs, link anchor text, etc.). Google search also uses many different filters, and those sit on top of this formula: Google collects the results and then applies the filters to weed out webpages that have something wrong with them (spam, malicious software, adult content, etc.).

Many search engine optimizers (SEOs) have extensively discussed the value of links and textual content on your website. I agree with them that these are the two most valuable pieces. My assertion is that Google multiplies links by text content (links * text content) to determine where in the search results your site will rank for a keyword. It is very important to keep in mind that links can be of vastly different value. One link from the nytimes.com home page is worth far more than 100 links from the home pages of sites similar to mine. I think that links and content are at least 70% of what determines your Google search engine rankings. There are other factors, represented by ?. These include potential clickthrough analysis, image scanning, text mentions of a webpage, etc.

My basic formula assessment is below:

(Links * Content) 70% * ? 30% = YOUR GOOGLE SEARCH RANKING

Content is made up of all the places you can place text on your web page. Keyword repetition is also important, as long as you don’t trip any spam filters. Definitions of each item are underneath the equation. The numbers are weights, so the higher the number, the more important that particular factor is. If you click View then Page Source in your browser, you can see the code behind the page, which is where you will see the title tags, text and everything else.

Content = Title tag * 10 + URL * 10 + H1 * 9 + H2 * 7 + P1 * 6 + P2 * 5 + P3 * 4 + sidebar title * 6 + sidebar content * 3 + Main Menu * 5 + alt image tag * 3 + meta description * 2 + meta keywords * 1 + link title * 1

Title Tag- Look at the very top left of your browser window. See JoelX’s Blog in the blue? That’s what a title tag is.
URL- the actual address of the page. In my case, http://www.blog.joelx.com/
H1- The title of the article. In my case, “Google Search Algorithm Formula Hypothesis”.
H2- Secondary title. For instance, if I had subsections of the primary article. I don’t have one in this article.
P1- The first paragraph that you see. Usually the most important content on a page to grab interest.
P2- Second paragraph. People will not read this if you can’t hold them with the first paragraph, so search engines give it less weight.
P3- Final paragraph. The least important of the three in the eyes of search engines.
Sidebar title- If I had a content section on the side, the title of it would be this.
Sidebar content- Text in the supplemental information box on the side.
Main Menu- The top menu bar navigation. In my case, JoelX Blog, JoelX sitemap, pseudonyms, etc.
Alt image tag- description of any images you have on your page. Used by blind people and search engines to find out what your picture is of.
Meta Description- Text search engines use when displaying your page in their search results. Never seen by the user on your page itself.
Meta Keywords- Keywords you tell search engines to show your site for when they look at your site. These have fallen out of use by many search engines.
Link title- When you mouse over a link, this gives some details about where the link points to.
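
To make the Content formula concrete, here is a minimal Python sketch of it. The weights are copied straight from the equation above. Everything else is my own assumption for illustration: the dictionary key names are made up, and each area counts only once if it mentions the keyphrase, so keyword repetition is ignored.

```python
# Weights copied from the Content formula above. The key names are
# hypothetical labels chosen for this sketch.
CONTENT_WEIGHTS = {
    "title_tag": 10,
    "url": 10,
    "h1": 9,
    "h2": 7,
    "p1": 6,
    "p2": 5,
    "p3": 4,
    "sidebar_title": 6,
    "sidebar_content": 3,
    "main_menu": 5,
    "alt_image_tag": 3,
    "meta_description": 2,
    "meta_keywords": 1,
    "link_title": 1,
}


def content_score(areas_with_keyword):
    """Sum the weights of every page area that mentions the keyphrase.

    Assumption: an area either counts once or not at all; repeated
    mentions inside one area add nothing extra.
    """
    return sum(CONTENT_WEIGHTS[area] for area in areas_with_keyword)


# Keyphrase present in every area of the page.
full_score = content_score(CONTENT_WEIGHTS)

# Keyphrase present only in the title tag and the URL.
title_and_url_only = content_score(["title_tag", "url"])

print(full_score, title_and_url_only)
```

A page that mentions the keyphrase in every area gets the maximum score, which is simply the sum of all the weights.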

So let’s say that you fully optimize a page for the keyphrase “search engines”. You mention the keyphrase at least once in each of the areas I discussed above and have not done so in a spammy way, so no filters have been triggered. Counting each area once, that gives you a content score of 72 according to my formula estimation, which is the sum of all the weights in the Content equation. If you have 0 links, your page will have 0*72=0 total score and probably will not rank. If you have 2 links (holding the value of links equal), you have a search engine score of 72*2=144. If you have no competition, you will rank first in the search engines in this scenario. However, for any valuable keywords, you will almost always face tons of competition and will usually need lots of high quality links in order to rank in Google’s search results.
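
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of my hypothesis, not anything Google has published. The function name, the page names and the competitor numbers are all made up, every link is given an equal value of 1, and the "?" factors from the main formula are left out.

```python
def search_score(link_values, content):
    """Multiply the total link value by the content score (Links * Content)."""
    return sum(link_values) * content


CONTENT = 72  # content score of the fully optimized page in the example

no_links = search_score([], CONTENT)       # 0 * 72
two_links = search_score([1, 1], CONTENT)  # 2 * 72

# Hypothetical competing pages, to show how multiplication affects ranking.
pages = {
    "my-page": search_score([1, 1], 72),
    "thin-page-many-links": search_score([1] * 10, 20),
    "great-page-no-links": search_score([], 72),
}

# Highest score ranks first.
ranking = sorted(pages, key=pages.get, reverse=True)

print(no_links, two_links)
print(ranking)
```

Because the two parts are multiplied, a page with excellent content and zero links still scores zero, while a thin page with enough links can outrank a better-written one.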

How can you determine the value of links in Google’s eyes? Unfortunately, I am under NDA and so cannot discuss that topic. If I told you, I would have to kill you. :)