April 10, 2006
Google hijacking and a possible workaround
It looks like a couple of pages on my site got 302-hijacked when I wasn't paying attention... Once upon a time, my site was one of the top Google results for "sqlite and php". Now the pages in question are no longer in Google's index at all, at least not under the original URLs. That is to say, if you do a search for "sqlite and php" the pages are still there (8th result), but the URL is now some site in Taiwan. Follow the link, and you do get all of my content, but passed through some proxy that adds a header and some other crap.
So whether it was an intentional hijacking or not (I'm thinking it was, as the page had a fairly high page rank), I won't sit idly by. I'm going to try two counter-hijackings in an effort to restore my search engine listing.
The first thing I've done is an attempt to unhijack the original content. I've changed all links from http://code.jenseng.com/db/ to http://code.jenseng.com/302fix/db/. This is simply a script that issues a temporary redirect back to the original URL. In theory, Google will index the new URL, see that it redirects to the old one, and add it to the index again under this new URL.
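To make the idea concrete, here is a minimal sketch of what such a redirect script could look like, written in Python as a small WSGI app. The actual script behind /302fix/ isn't shown in this post, so the names `redirect_target` and `app` and the overall structure are assumptions for illustration only; the two URLs are the ones mentioned above.

```python
# Hypothetical sketch of a "/302fix/" redirect script (not the actual one).
ORIGIN = "http://code.jenseng.com"  # site root, taken from the URLs in the post
FIX_PREFIX = "/302fix"              # prefix added to the rewritten links


def redirect_target(path):
    """Map /302fix/<rest> back to the original URL, or None if it doesn't match."""
    prefix = FIX_PREFIX + "/"
    if not path.startswith(prefix):
        return None
    return ORIGIN + "/" + path[len(prefix):]


def app(environ, start_response):
    """WSGI entry point: answer with a temporary (302) redirect to the original page."""
    target = redirect_target(environ.get("PATH_INFO", ""))
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found\n"]
    # 302 tells the client the move is temporary.
    start_response("302 Found", [("Location", target),
                                 ("Content-Type", "text/plain")])
    return [b"Moved temporarily\n"]
```

With this sketch, a request for /302fix/db/ gets a 302 response whose Location header is http://code.jenseng.com/db/.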
That might have been sufficient if I'd caught this early on. But seeing as the hijacking site simply serves up duplicate content without redirecting to my site, stronger measures might be necessary. At this point Google might ignore my hijacked pages, because it thinks they have moved permanently to the hijacking site. So the second thing I've done is create a script that issues a temporary redirect to their site. In theory, this will allow me to effectively unhijack my pages, as Google will see my newer link, index it, and soon the URL in the search results will be mine. Then all that is needed is to change the script to do a permanent redirect back to the original pages.
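The two phases of that second script could be sketched roughly like this, again in Python. This is only an illustration under assumptions: the hijacking site isn't named in this post, so `HIJACKED_COPY` is a made-up placeholder URL, and `counter_redirect` and its `recovered` flag are hypothetical names for the manual switch described above.

```python
# Hypothetical sketch of the two-phase counter-hijack redirect.
HIJACKED_COPY = "http://hijacker.example/proxied/db/"  # placeholder, not the real URL
ORIGINAL = "http://code.jenseng.com/db/"               # original page from the post


def counter_redirect(recovered):
    """Return (status line, Location) for the counter-hijack script.

    recovered=False: phase one, a temporary redirect to the hijacker's copy.
    recovered=True:  phase two, a permanent redirect back to the original
                     page, used once the search listing shows my URL again.
    """
    if recovered:
        return "301 Moved Permanently", ORIGINAL
    return "302 Found", HIJACKED_COPY
```

Flipping `recovered` from False to True corresponds to the last step: changing the script from a temporary redirect into a permanent one.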
So we'll see how well it works out. If it does, it would be one more way to combat page-hijacking until Google fixes this problem.