Google Search is a powerful tool that can help you find useful information on the open web. However, not all web pages are created with good intent. Many of them are explicitly created to deceive people, and Google is fighting against this type of spam every day.
To ensure safety and protect the search experience against disruptive content and malicious behaviors, Google invested in many innovations in 2020. One of these innovations is a spam-fighting AI that is incredibly effective at catching both known and new spam trends. For example, the AI has helped reduce sites with auto-generated and scraped content by more than 80% compared to a couple of years ago.
Hacked spam was still rampant in 2020 because the number of vulnerable websites remained quite large. Even so, we improved our detection capability by more than 50% and removed most of the hacked spam from search results.
This is a problem that we cannot solve alone. Even if we could detect and protect against all spam, the hackers would not cease exploiting loopholes until they’re all closed. Website owners can protect their sites by practicing good security hygiene: it is easier to prevent a site from getting hacked than to recover from a hack. Google offers resources to help you understand the most common ways websites get hacked and how to use Search Console to check whether your site got hacked. Please do take a look and let's keep the web safer together!
With last year's major events, including a global pandemic, we devoted significant effort to extending protection to the billions of searches we received on such important topics. If you're looking for a COVID testing site, you shouldn't have to worry about landing on gibberish spam that may redirect you to phishing sites.
Besides eliminating spam content, we worked with several other Search teams to make sure you receive the most up-to-date and highest quality information when and where it matters the most. There's a lot that happens behind the scenes before we deliver a set of search results on Google. Every day, we're discovering, crawling, and indexing billions of web pages. Among those pages is a lot of spam—every day, we discover 40 billion spammy pages.
Here's how we work to keep that spam from getting in the way of your search for helpful, useful information:
This diagram conceptualizes how we defend against spam.
First, we have systems that can detect spam when we crawl pages or other content. Crawling is when our automatic systems visit content and consider it for inclusion in the index we use to provide search results. Some content detected as spam isn't added to the index.
These systems also work for content we discover through sitemaps and Search Console. For example, Search Console has a Request Indexing feature so creators can let us know about new pages that should be added quickly. We observed spammers hacking into vulnerable sites, pretending to be the owners of these sites, verifying themselves in Search Console, and using the tool to ask Google to crawl and index the many spammy pages they created. Using AI, we were able to pinpoint suspicious verifications and prevent spam URLs from getting into our index this way.
Next, we have systems that analyze the content that is included in our index. When you issue a search, they double-check whether the content that matches might be spam. If so, that content won’t appear in the top search results. We also use this information to improve our systems so that such spam is kept out of the index in the first place.
The result is that very little spam actually makes it into the top results anyone sees for a search, thanks to our automated systems that are aided by AI. We estimate that these automated systems help keep more than 99% of visits from Search completely spam-free. As for the tiny percentage left, our teams take manual action and use what they learn to further improve our automated systems.
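The two automated layers described above can be pictured as a pair of filters, one applied when content is crawled and one applied when a search is served. The following is a minimal sketch for illustration only, not a description of Google's actual systems: the `Page` type, the `spam_score` field, and both thresholds are hypothetical, and manual action is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    spam_score: float  # hypothetical score in [0, 1]; higher means more spam-like


# Hypothetical thresholds, chosen only to make the example readable.
CRAWL_THRESHOLD = 0.9  # content this spam-like is never added to the index
SERVE_THRESHOLD = 0.5  # stricter second check applied when answering a search


def crawl_filter(pages):
    """Layer 1: leave content detected as spam at crawl time out of the index."""
    return [p for p in pages if p.spam_score < CRAWL_THRESHOLD]


def serve_filter(index, limit=3):
    """Layer 2: re-check indexed content at query time and keep
    suspected spam out of the top results."""
    clean = [p for p in index if p.spam_score < SERVE_THRESHOLD]
    return clean[:limit]


pages = [
    Page("a.example", 0.05),
    Page("b.example", 0.95),  # caught at crawl time
    Page("c.example", 0.60),  # indexed, but held back at serve time
    Page("d.example", 0.10),
]

index = crawl_filter(pages)
top_results = serve_filter(index)
```

In this toy run, `b.example` never enters the index, while `c.example` is indexed but does not appear in the top results. The point of layering is that the second check can catch what the first one misses.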
Protecting you beyond spam
Beyond spam, we expanded our effort in 2020 to protect you against other types of abuse. Many of these can cause significant financial and personal harm.
In 2020, Google made significant progress in improving its coverage and protecting more users against online scams and fraud. Online scams take many forms, and they can harm you in more ways than traditional webspam. For example, many scammers pretend to offer customer support phone numbers for popular services and products, only to trick the users who call into paying them via bank transfers or gift cards. Commonly known as a "customer support scam" or "tech support scam", this type of scam has been reported by hundreds of thousands of users, who may lose hundreds of dollars to scammers in each case.
An example of a customer support scam on search results is shown below:
Since 2018, Google's systems have been able to detect and protect against potentially scammy websites. Scammers create low-quality websites with keyword stuffing, imitated logos, and a phone number in an attempt to appear in search results. However, Google's algorithmic solutions have made it very unlikely for these scams to show up. This is just one of the many protections that Google has launched in order to ensure the quality of search results and user safety. You can also protect yourself by staying informed and learning about scams.
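To make the idea of keyword stuffing concrete, here is a toy heuristic that measures how much of a page's text is taken up by its single most repeated word. It is only an illustrative sketch: the function names and the 0.3 threshold are hypothetical, and it is not how Google detects scams.

```python
import re
from collections import Counter


def top_term_density(text: str) -> float:
    """Return the share of all words taken up by the most frequent word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    _, count = Counter(words).most_common(1)[0]
    return count / len(words)


def looks_stuffed(text: str, threshold: float = 0.3) -> bool:
    """Flag text whose most frequent word exceeds a hypothetical density threshold."""
    return top_term_density(text) >= threshold


stuffed = "support support support phone support"
normal = "Contact our team for help with your account"
```

Here `looks_stuffed(stuffed)` is true because one word makes up four of the five words, while `looks_stuffed(normal)` is false because no word repeats. A single signal like this is easy to evade and prone to false positives on short text, which is one reason real systems combine many signals.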
In addition to protecting against scams, advances in AI have also helped Google understand the content of websites better. For example, Google has improved the way it ranks product review, informational, and shopping sites. This means that when you search for products on Google, you are more likely to get accurate and useful information that can help you make a purchase decision.
Though we have made significant progress in our fight against spam, spammers are still very motivated to develop new techniques to evade our detection. We are always working to improve and protect people from new types of abuse, and external reports can be very helpful. If you have any recent experiences with Search where you feel misled, scammed, or spammed, please share feedback using the spam report, along with the query and any other information that might be useful.