Google Quality Rater Secrets



Ami Isseroff

Sept 27, 2008

Not long ago Brian Ussery discovered a confidential document describing how Google quality raters are supposed to rate search engine results. The document itself has since been removed from the Web. As there was probably a really good reason for that, I am not going to put it back on the Web, but here is what is apparently a full version of this document in PDF format:

http://www.mauriziopetrone.com/blog/wp-content/uploads/quality-rater-guidelines-2007.pdf 

From comments on this document it seems that many people may be misled into thinking that these criteria are actually used by the Google search engine to rank all Web sites. In particular, bloggers seem to have believed that they could figure out from the document precisely how Google recognizes SPAM Web sites.

There are two problems with the idea. Firstly, the document consists of instructions to humans as to how to rate the Google results. It is not an algorithm or set of algorithms used by the search engine. Google is evidently giving  "grades" to its search engine results for the most part, not to your Web sites. There is no statement in the document as to how the rating results will be used. 

From its inception as described in The Anatomy of a Large-Scale Hypertextual Web Search Engine and The PageRank Citation Ranking: Bringing Order to the Web, the Google search engine has been a "work in progress" that had the goal of improving the quality of search engine results for users. The instructions and the raters are a way to check the quality of the search engine results, but that doesn't mean that the criteria are necessarily incorporated in this or any other version of the search engine.

The second reason that this document won't tell you how to evade SPAM filters is that spam filters are automatic, whereas spam ratings by raters are done by hand using a set of criteria. It is likely that if a rater happens on a page that is spam or doesn't load or is flagged as pornography, then that particular page will be downgraded or penalized. However, it is not likely that raters will get to every single page on every Web site, or even a large sample of them. Nor is it likely that Google applies the manual ratings of quality, which are separate from SPAM ratings and flags, to the actual results.

There are separate categories of ratings: one for quality of result, which relates to what result was produced for a specific search; one for SPAM; one for pages flagged for pornography or malware; and one for pages that cannot be loaded. All but the first relate to the page itself, rather than to the quality of the result.

Remember that the search quality ratings are for the quality of the result relative to a specific query. A page about George Bush may be a perfectly good page, but if it is returned as the result of a query for "art lovers" or "restaurants" then the result is inappropriate, though the page in itself is perfectly good.

 However, in the best of all possible worlds, a page that is webspam or spam or malicious would probably get a rating of Not Relevant. Nobody is searching to get a Trojan horse installed in their computer and nobody is searching for spam. Of course, if you are searching for [Brittney Spears Nude] then all the results returned should be pornographic. That is a good search result for that query.  

It is nonetheless interesting to see how Google judges its own results, because it tells us what they are trying to do, or what they think they are trying to do. 

The raters look at actual user queries from all over the world and at how the search engine responded to those queries. A very interesting aspect of Google's approach is the attempt to understand exactly what the surfer really wanted to see, and to compare it to the results that the search generated. A problem that runs through Google's handbook is that criteria are often somewhat circular. Raters are urged to research a topic by searching for results about the topic on the Web, without necessarily consulting outside (non-Web) sources. A page may be judged to be a good page if it links to "authoritative" pages, but authoritative pages are judged to be authoritative based on search engine results and not based on any knowledge of the subject. The search engine results, in turn, are based on how many pages link to a particular page, making it authoritative (see Web Site Authority and Google PageRank). This is a mechanism for reinforcing and popularizing "conventional wisdom." There is no attempt to objectively measure the quality or reliability of information on a page beyond volume. If there are more words and they are words relevant to the query, then the page is good. Thus, a page that has 40,000 words about gravity based on Aristotelian theory, with links to the works of Aristotle at highly ranked pages, is "better" than a page that explains Newtonian and Einsteinian gravitational theories in 500 words with a few equations and no references.
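The circularity of link-based authority can be seen in a minimal sketch. The code below is the simplified textbook form of PageRank computed by power iteration, not Google's production system. The page names, the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration. Nothing in the calculation looks at whether the content of a page is correct; only the links count.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to the Aristotelian page,
# only one links to the Newtonian page.
LINKS = {
    "fan_page_1": ["aristotle_gravity"],
    "fan_page_2": ["aristotle_gravity"],
    "fan_page_3": ["aristotle_gravity"],
    "physics_blog": ["newton_gravity"],
    "aristotle_gravity": [],
    "newton_gravity": [],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the Aristotelian page outranks the Newtonian one simply because more pages point to it.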

Ranking Criteria - The raters are presented with material from a database that shows the query, the location from which the query originated and one URL that was retrieved by the search engine. We are not told in the document if it was the top URL retrieved or if Google evaluates more than one URL for a given query to see if its ranking system is valid.

The following are some of the basic criteria in the document (based in part on The Google Quality Raters Handbook):

 

Types of Queries:

Google classifies queries as:

Informational - Searching for information, such as "Magna Carta"

Transactional - User wants to buy something (e.g., at eBay or Amazon).

Navigational - User is looking for a specific Web site URL.

It can't possibly be clear, in every case, whether a query is "transactional" or "informational." A search for "War and Peace" could be looking for a summary of the novel or looking to buy the novel or to read it online. But Google raters are supposed to understand what the surfer wanted.

Broad vs. Specific - Google raters also classify searches as broad or specific, though no criteria are given for these categories.

The handbook gives these examples:

digital camera - Looking to purchase a digital camera.

Canon SD 550 - Looking to purchase this specific camera

Of course, the surfer may not be looking to purchase anything, but rather searching for information and specifications, sales figures or other data. But Google's assumptions are evidently based on the intentions of the majority of users, or what they think characterizes the majority of users.

Search Quality Intangibles

In general, results should match the expectations of the surfer and the type of query, as interpreted by the raters. A broad search should return a broad result and a narrow search should return a narrow result.

Timeliness - Search results are time dependent. A query about George Bush in 1991 should have returned information about George H.W. Bush (the father) while a query in 2008 should have returned more information about the son. 

Location - Results should be relevant to the location of the user. If a user searches for "football" in the USA, they are looking for information about the game played with the oblong ball, quarterbacks and so on. The same search in the United Kingdom should generate information about the game Americans call "soccer."

Amount of Information Available - If a lot of information is available about a topic, Google tells its raters, then a page that has just one link and little text should not rank highly. That tells us that in principle at least, larger pages are intended to get higher rankings, and there is no such thing as "optimum page size." (see Optimum Page Size Superstition)

Google does not try to rate pages as to correctness or reliability of information. That is peculiar, because it means that it never checks whether pages or sites that are supposed to have the highest Authority according to its Google PageRank algorithm are indeed providing correct and authoritative information. The highest ranking page about the Canon camera could be a totally incorrect advertising blob or a hatchet job done by competition. The top page retrieved for George Bush might be an encomium written by election propagandists or a hatchet job done by the opposition. The top positioned page for keyword Jew might be (and often is) a racist screed composed of paranoid anti-Semitic inventions. 

Google Quality Rating Scale

Google uses a five-level quality rating scale for ratable search results:

Vital: (1) A score that is reserved only for navigational queries where there is a clear dominant Web page. It is misleading to say that "vital is the highest score," because it doesn't apply to queries that are not navigational. A Vital rating is given if the user searched for the name of a firm or a person, and the query returned the official Web site of that firm or person. When searching for 'ibm', the vital result would be www.ibm.com. But Google makes unwarranted assumptions about what the user wants. Suppose the user is looking to buy storm windows, and types [windows] in Google? They will get the home page of Microsoft Windows as their top result. For [apple] they will not get a page about fruit, but rather the home page of Apple computers. Google queries are not case sensitive.

Useful: (2) A useful rating should be assigned to results that "answer the query just right; they are neither too broad nor too specific." For example: A search for meningitis that returns: http://www.webmd.com/hw/infection/aa34586.asp

For informational queries, this is the highest possible rating.

Relevant: (3) The results are often "less comprehensive, come from a less authoritative source, or cover only one important aspect of the query." For example, a review of laptop computers that only discusses five computers and not all computers within its class. Note that "less authoritative source" can be judged in many ways. If the criterion is Web page attributes such as number of links, Google is using its own algorithm to validate its algorithm.

Relevant pages, according to the handbook, include a page with a brief article on the topic of the query or a less important subpage on the correct site. If a query "asks" for a list, according to the guidelines, then a single item is Relevant. For example, if the query is [ fudge recipes ], a single fudge recipe is Relevant. So says Google, but that may be a matter of opinion, depending on what the user intended and the quality of the page.

A rating of Relevant is also used for a homepage that would have been Vital if there had not been a more dominant interpretation for the query.

Not Relevant: (4) Pages that are not helpful to the query but are still somewhat connected to the original query. A Not Relevant page might be classified as "outdated, too narrowly regional, too specific, too broad" etc. One of the examples given is a search for the 'BBC' that returns a specific article from the BBC; it is too specific and is not relevant to the query at hand.

A rating of Not Relevant is also assigned to a page if it has a link to good results on the same site or another site, but is not a good result itself. It may be an unimportant or useless sub-page on the correct site or it may be only a link page.

Off-Topic: (5) This is the lowest rating a page can receive for a query. If the returned page is completely not relevant to the query, it would be given a rating of "off topic." An example given is a query on 'hot dogs' that returns a page about doghouses.

According to the handbook, a rating of Off-Topic also applies when the result ignores an important modifier or element of the query. For example, for the query [ universities in India ], an article about universities in Europe is Off-Topic. But this is a frequent fault of results returned by Google for complex queries.
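The scale can be summarized in a short sketch. The class and function names here are hypothetical, and the integer values are only an assumption used to order the ratings from best to worst. The one rule encoded is the one stated above: Vital is available only for navigational queries, so Useful is the best rating that any other query can earn.

```python
from enum import Enum, IntEnum


class QueryType(Enum):
    INFORMATIONAL = "informational"
    TRANSACTIONAL = "transactional"
    NAVIGATIONAL = "navigational"


class Rating(IntEnum):
    # Lower value means a better rating; the numbers only express order.
    VITAL = 1
    USEFUL = 2
    RELEVANT = 3
    NOT_RELEVANT = 4
    OFF_TOPIC = 5


def best_possible_rating(query_type):
    """Return the best rating a result can earn for this kind of query."""
    if query_type is QueryType.NAVIGATIONAL:
        return Rating.VITAL
    return Rating.USEFUL


def is_allowed(query_type, rating):
    """Check that a rating does not exceed the ceiling for the query type."""
    return rating >= best_possible_rating(query_type)


print(best_possible_rating(QueryType.INFORMATIONAL).name)
print(is_allowed(QueryType.INFORMATIONAL, Rating.VITAL))
print(is_allowed(QueryType.NAVIGATIONAL, Rating.VITAL))
```

Deciding which query type applies is left to the rater, which is exactly the ambiguity discussed earlier for a query such as "War and Peace."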

Worse than bad ratings

Pages that cannot be rated or are spam or undesirable content do not fall in the above scale at all. They have separate scales.

Results That Can't Be Rated:

Didn’t Load: For pages that return a 404 error, page not found, product not found, server time out, 403 forbidden, login required, and so on.

Foreign Language: This is given to a page that is in a "foreign language" to the "target language" of the query. English is never a foreign language.

Unratable: When the rater cannot rate it for any other reason.
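As a rough sketch, the first two labels could be decided mechanically as follows. The function name, its parameters and the use of "en" as the code for English are assumptions made for illustration. Treating every HTTP status of 400 or above as a failed load is also an assumption, standing in for the handbook's "and so on." A login wall served with a normal status would not be caught this way, and the catch-all Unratable label is left to the rater's judgment.

```python
def cannot_rate_label(http_status, timed_out, page_language, target_language):
    """Return a "can't be rated" label, or None if the result can be rated."""
    # Timeouts and error statuses such as 403 and 404 count as a failed load.
    if timed_out or http_status >= 400:
        return "Didn't Load"
    # English is never treated as a foreign language.
    if page_language != target_language and page_language != "en":
        return "Foreign Language"
    return None


print(cannot_rate_label(404, False, "en", "en"))
print(cannot_rate_label(200, False, "de", "fr"))
print(cannot_rate_label(200, False, "en", "fr"))
```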

Flagged pages:

Flags are for pages that require immediate attention. Google lists only two flags:

Pornographic content

Malicious code on pages

Again, in theory, a page that is ranked "vital" might have pornographic content. There are no flags for racist content or for dangerous pages that contain instructions on how to commit the perfect crime, complete instructions on how to build an atomic bomb and where to get the materials, etc. It is much more probable that flags are actually applied, and that they result in general penalizing or removal of a page, because they relate to the entire page rather than to results for a specific query. Malicious code flags evidently result in a warning that "This page may be dangerous to your computer."

Actual Positioning and Search quality rating

As far as we can tell, raters are never told what the positioning of the query result was. They are shown the query and one result. They are not told if the user in question actually clicked on that result or not. In fact, we have no idea how Google's system decides to select a particular query result for rating, or how the results of the rating may be used to change the search algorithm.

Influence of Ratings on Search Results

If Google indeed uses these quality ratings to influence their results, then they are violating their own declared code. Google queries often list racist or obnoxious content, as for example, for keyword "Jew." Google puts a notice with these results that explains that they are also disturbed by the results, but they cannot suppress them because that would disturb their algorithm. Since Google can and does suppress pages flagged as spam or malware, their notice is evidently counterfactual.

 

Spam

There is a large section on SPAM that appears in some of the versions of the handbook on the Web, while only an abbreviated version seems to be shown in other versions. The labels for SPAM are:

Not Spam: The not spam rating is given to a page that "has not been designed using deceitful web design techniques."

Maybe Spam: This label is given when you feel the page is "spammy," but you are not 100% convinced of that.

Spam: Given to pages you feel are violating Google's webmaster guidelines.

Again, the SPAM ratings are evidently orthogonal to the quality ratings. One would think that any page that gets a high result quality rating had better not be spam, but evidently that is not the case. In theory, a page ranked "vital" could be labeled Spam!
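One way to picture this orthogonality is as a record with independent fields. The sketch below is hypothetical: the handbook does not describe how judgments are stored, and the class name, field names and flag names are invented for illustration. The point is only that nothing ties the spam label or the flags to the quality rating, so a "Vital" result labeled Spam is a perfectly valid record.

```python
from dataclasses import dataclass
from typing import Optional

QUALITY_RATINGS = ("Vital", "Useful", "Relevant", "Not Relevant", "Off-Topic")
SPAM_LABELS = ("Not Spam", "Maybe Spam", "Spam")
FLAGS = ("Pornography", "Malicious")


@dataclass
class RaterJudgment:
    query: str
    url: str
    quality: Optional[str] = None   # None when the result cannot be rated
    spam: str = "Not Spam"          # chosen independently of quality
    flags: frozenset = frozenset()  # page-level flags, also independent

    def __post_init__(self):
        # Each field is validated on its own; no rule links one to another.
        if self.quality is not None and self.quality not in QUALITY_RATINGS:
            raise ValueError(f"unknown quality rating: {self.quality}")
        if self.spam not in SPAM_LABELS:
            raise ValueError(f"unknown spam label: {self.spam}")
        unknown = set(self.flags) - set(FLAGS)
        if unknown:
            raise ValueError(f"unknown flags: {sorted(unknown)}")


judgment = RaterJudgment("ibm", "www.ibm.com", quality="Vital", spam="Spam")
print(judgment.quality, "/", judgment.spam)
```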

Google warns that it is better to err on the side of leniency and not to label a page as spam, and it also explains that there is a mechanism for adjudicating differences of opinion between raters. This suggests strongly that SPAM ratings and similar flags are really applied to results if they are found to apply to a page.

Google Criteria for SPAM

Google recognizes the following sorts of SPAM sites for manual rating purposes. Remember that these are not necessarily detected by automatic filters. In general, these types of pages implement various techniques of Black Hat SEO:

PPC - Pay per click - the page is set up so that it is all or mostly Pay per click advertisements, or consists of "scraped content" (content taken, usually automatically, from other Web sites) plus PPC advertisements. Google warns specifically, for example, about pages that simply copy Wikipedia and add advertisements. These are considered spam.  "The important thing to remember is that if the scraped (copied) content on the page is removed and all that remains is ads, it is Spam."

Parked Domains - An expired domain is purchased by a spammer and filled with irrelevant junk links. Since we can find many of these sites displayed prominently in search engine results, it is obvious that the criterion is only applied if raters find the site.

Thin Affiliates - A thin affiliate is a site that is set up only as a front to market products of another e-business. You cannot purchase the products there. Some bloggers have noted that the criteria used by Google are too strict and may screen out real merchants, because the criteria include points like "a way to track FedEx orders," "a “wish list” link, or a link to postpone purchase of an item until later." Someone got carried away here. Obviously, a real merchant just needs to provide a place to buy the product. A site that offers price comparisons or other useful information is not considered a "thin affiliate."

Hidden Text and Hidden Links - Text and links are the same color as background, allowing addition of keywords. 

JavaScript Redirects - A form of cloaking. The search engine spider sees one page that has information, but the JavaScript redirects the user to a different page that is spam.

Keyword Stuffing - This can be done manually or automatically. The idea is to load the page artificially with keywords to attract the search engine spider. Keywords may or may not be related to the content of the page. Some pages are generated "on the fly" in response to queries, so that in future, they will be there when that query is entered. Keywords can be stuffed in any part of the page including the URL.

100% Frame - A frame page takes up 100% of the browser view, so that users see only that page, but the spider sees the frame page and another page that is linked to it and blocked from view. The second page contains real information, but the user does not see that. 

Sneaky Redirect - It is not clear how this is really different from a JavaScript redirect in principle. In a sneaky redirect, we are told that the page redirects to one or more other domains on a rotational basis. Google also allows for legitimate redirection. According to Google's directions, a redirect is not "sneaky" if both domains have the same ownership. This cannot be correct if interpreted literally, since it would allow you to have, for example, a page that is supposed to be about political issues and that redirects to another domain that you own, which is a porno site.
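Three of the criteria above are concrete enough to sketch as simple checks. These are illustrations of the rules as worded for human raters, not Google's automatic filters. All function names are hypothetical, the labeling of page blocks as "scraped", "ad" or "original" is assumed to have been done already, colors are assumed to be given as CSS hex values, and the handbook gives no numeric threshold for keyword stuffing, so the density figure is left for the reader to interpret.

```python
import re


def is_ppc_spam(block_kinds):
    """Apply the PPC rule: remove scraped content and see if only ads remain."""
    remaining = [kind for kind in block_kinds if kind != "scraped"]
    return bool(remaining) and all(kind == "ad" for kind in remaining)


def parse_hex_color(value):
    """Turn '#fff' or '#FFFFFF' into an (r, g, b) tuple of integers."""
    value = value.lstrip("#")
    if len(value) == 3:
        value = "".join(ch * 2 for ch in value)
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def text_is_hidden(text_color, background_color):
    """Hidden text check: the text is the same color as its background."""
    return parse_hex_color(text_color) == parse_hex_color(background_color)


def keyword_density(text, keyword):
    """Fraction of the words on a page that are the given keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


print(is_ppc_spam(["scraped", "ad", "ad"]))
print(text_is_hidden("#fff", "#FFFFFF"))
print(keyword_density("cheap cameras cheap cheap deals", "cheap"))
```

Checks like these only cover the simplest cases. Text can also be hidden with tiny fonts or off-screen positioning, which is one reason the handbook asks raters to look at the page source themselves.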

Webspam that Google Misses

It seems from the above that there are a number of kinds of SPAM and Black Hat SEO that Google misses. For example, there are schemes that allow automatic redirection from a legitimate page to a page that is the Web site of an affiliated merchant. That sort of thing is not covered explicitly, but it is obviously SPAM. 

 

Who will rate the raters?

There is presumably no real mechanism for adjudicating differences of opinion about quality, and no real mechanism for checking objectively if a page is spam. There is a resolution mechanism for spam issues, but it is apparently based on mechanical criteria. For example, Google tells raters that they should check the page in different browsers and check the source code to find out why their rating differs from that of other raters.

A page about Islam, for example, can be rated as "not relevant" or SPAM by Islamophobic raters, and a page about Zionism that originates in Israel might be labeled as "not relevant" by raters in Saudi Arabia, or they might simply categorize it as "didn't load" or "foreign language" or "unratable." Into the trash bin it goes! It is not clear what effect, if any, these ratings will have on actual search results, or who checks these issues.

 

What it means for Web site owners

Remember what it all means. Insofar as the quality ratings are concerned, this is how Google checks its own results using human raters. It is not how the Google algorithm works. Regarding the various flags and SPAM ratings, it is probable that they are applied to pages that the raters find, but it is not likely that the raters will get to more than a few percent of the possible pages. Of course, they get to rate only a tiny percentage of possible queries.

Larger Issues

When a person searches for "apple" or "windows" they are liable to be directed to the Web site of Apple or Microsoft computer company. Consider the effect that this has, and will have on the language and culture of the world. Consider the power that we have signed away to large corporations to shape our culture. And, at the narrowest level, consider the dilemma of the search engine raters. Suppose someone has a brand of computer called Sex. Should the query for the popular keyword [Sex] show the home page of this company or should it show sites related to sex? Why is that different from "apple"?

 

Ami Isseroff

Notice: Copyright

All materials are copyright 2008 by Ami Isseroff. All rights reserved. These pages may not be reproduced in any form in electronic or printed media without express written permission from the author.
