Do You Have Duplicate Content Issues Across Domain? Google Will Now Al

Today, Google Webmaster Tools launched a new message alert that lets site owners know when a particular URL doesn’t appear in search results because Google sees it as a duplicate of a URL on a different domain. In the blog post announcing the feature and in an in-depth help topic, Google provides details on how it identifies clusters of duplicate content and chooses a “canonical” version of each cluster to display in search results.

“When we discover a group of pages with duplicate content, Google uses algorithms to select one representative URL for that content. A group of pages may contain URLs from the same site or from different sites.”

They note that when they choose a representative URL from a different domain, they call this “cross-domain URL selection”.

In cases where multiple URLs contain the same content (for instance, due to infrastructure configuration, optional parameters, syndication, or internationalization), many options exist for site owners to indicate to Google which version is canonical.

However, in some cases, the site owner doesn’t use these options to specify a preferred version or Google may select a different version than the site owner specifies.

This new feature alerts site owners when Google’s “algorithms select an external URL instead of one from their website”. Google says common reasons for this include:

  • Site owner-specified – if you’ve moved your domain or have implemented the rel=canonical attribute to indicate that a page on another domain is canonical, then this alert is simply confirmation that Google is indexing as you’ve specified.
  • Regional sites – if you have the same content on multiple regional sites (for instance, the same English content on a .com (for US), a .co.uk, and a .com.au), Google may cluster pages with identical content across sites and use relevance signals to determine which to display per query.
  • Incorrect canonicalization – in this case, a page may inadvertently use the rel=canonical attribute to specify a page on a different domain as canonical.
  • Misconfigured server – a hosting misconfiguration (this happens in particular with shared hosting) may cause two different domains to display the same content.
  • Hacked site – sites are sometimes hacked to point to other domains.
  • Scraped content – the blog says that “in rare situations”, Google may select a URL from a site that has scraped your content.
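For the rel=canonical cases in the list above, one way to check your own pages is to look at where each page's canonical link points. The following is only a minimal sketch, not Google's method: the names CanonicalFinder and cross_domain_canonical are made up for illustration, the page HTML is assumed to be already downloaded, and hostnames are compared exactly (so www.example.com and example.com count as different hosts).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is recorded; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def cross_domain_canonical(page_url, html):
    """Return the canonical URL if it points to another host, else None.

    Hypothetical helper: exact hostname comparison is a simplifying
    assumption, not how Google decides which URL is canonical.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    # Resolve relative hrefs against the page's own URL before comparing.
    target = urljoin(page_url, finder.canonical)
    if urlparse(target).hostname != urlparse(page_url).hostname:
        return target
    return None
```

A non-None result means the page declares a URL on another host as canonical. That is expected if you moved domains on purpose, and worth investigating if you did not.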
This alert is available within the message center, so you’ll only see it if your site has this issue. Google is currently only reporting on URLs from the Top Pages report. This feature is a great insight for site owners who otherwise would have no idea why a particular page doesn’t appear in search results. I’ll be posting a follow-up shortly with more details on some of these scenarios and what you can do if you get an alert.
You know what is puzzling here: what is the purpose of streaming media if you are penalized for duplicate content? The news media (ABC, CBS, NBC, CNN, and others) all report the same news. Is that considered duplicate content? No. Why? Because someone who views the CBS evening news may not partake of cnn.com late at night. The same goes for information on the web: someone viewing data or information on one site may never visit the site with the original post, but the news is still distributed.
Why is Google now trying to limit what information is being provided, and across what platforms? Canonical content smells to me like a method of monetizing for Google. It favors the one who pays for distribution, no matter what the actual method of authorship is. They are looking to line their pockets.
When a GIG is not enough --> Terabyte Dolphin Technical Support - Server Management and Support
2 Dec 2011
 
 