Google's search spiders continuously crawl the web and update the search index. How often your website gets crawled and how fast a new page gets indexed by Google depends on various factors, including how frequently your site is updated. Generally, Google will treat the earliest copy of an article or webpage that it indexes as the original.
Let us assume that there are two websites: a smaller website A that gets crawled once every 24 hours and a bigger website B that gets crawled every hour. The smaller website A publishes an article at 11 am. Google's spiders paid a visit a couple of hours earlier and are not due to return for another 24 hours. Meanwhile, the bigger website B copies the article and publishes it on its own site, changing the time stamp (publishing time) to 10 am. When Google's spiders visit B within the next hour, the article gets indexed as an original article on the bigger site B. The next day, the spiders discover the same article on website A. Even though A's copy is the original, Google marks it as duplicate content. What’s the remedy? How can you avoid it?
Google’s Matt Cutts regularly answers the following question:
“Google crawls site A every hour and site B once in a day. Site B writes an article, site A copies it changing time stamp. Site A gets crawled first by Googlebot. Whose content is original in Google’s eyes and rank highly? If it’s A, then how does that do justice to site B?”
According to Matt Cutts, there are two simple ways of making sure that Google discovers your article first or, at least, knows that yours was the original one and appeared before the other.
1. Tweet: Google crawls tweets at blazing speed. Haven’t you seen those real-time tweets in search results? A tweet linking to your article will help it get discovered and indexed faster.
2. PubSubHubbub: Use the PubSubHubbub protocol to ping the search engines as soon as you publish, so that they discover your new content faster.
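To make the second option concrete, here is a minimal Python sketch of a publisher ping. It assumes your feed already advertises a hub, and it uses the public hub at pubsubhubbub.appspot.com purely as an example. The function names and the feed URL are hypothetical, so substitute your own. In the PubSubHubbub protocol, the publisher notifies the hub with a form-encoded POST carrying hub.mode=publish and hub.url set to the feed address.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: an example public hub. Use whichever hub your feed declares.
HUB_URL = "https://pubsubhubbub.appspot.com/"


def build_publish_ping(feed_url, hub_url=HUB_URL):
    """Build the POST request that tells the hub the feed has new content."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send_publish_ping(feed_url, hub_url=HUB_URL, timeout=10):
    """Send the ping and return the HTTP status code.

    Hubs usually answer 204 (No Content) when the ping is accepted;
    urlopen raises an HTTPError for 4xx/5xx responses.
    """
    with urlopen(build_publish_ping(feed_url, hub_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical feed URL; replace it with the address of your own feed.
    print(send_publish_ping("https://example.com/feed.xml"))
```

Calling send_publish_ping right after publishing, for instance from a post-publish hook in your blogging software, is one way to notify the hub without waiting for the next scheduled crawl.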
If another website is indexed as the owner of your article, you can send a request to Google under the Digital Millennium Copyright Act (DMCA) at http://www.google.com/dmca.html. Google will process the request and rectify the error. The DMCA allows you to claim copyright of your content and penalize the content scraper. If the culprit website is scraping content from several other websites, you can also report it as a spam website.