Jun 19
Handling Genuine 404 Errors
Tags : SEO, To Do Posted in: Search Engine Optimisation
Since I started tracking missing pages (only recently), I have found that the site upgrades, which removed a lot of duplicate data, have left me with lots of genuine 404 errors.
Sure, I may have page errors which I haven’t spotted yet, but I certainly have pages which don’t correspond to records in the database.
At the moment I log this error and re-direct to the home page.
Obviously, if you’re a user and you click on a link from Google, you’re not going to find what you’re looking for. Whether you then use the site search to re-find it? Possibly not.
In my SEO book it says “Page Cannot be found” is suicide.
Although it just talks about fixing your site and tools to help find broken URLs.
What can I do to help the user…
1. Try to get the search phrase used, if they used a search engine to find the page, then do a site search on that phrase.
2. If there’s no search data, try to convert the page name into a search phrase. At the moment all page names have hyphens in them, so I guess I could replace those with spaces. But doing a simple WHERE clause on that wouldn’t bring back any (or many) results, unless it was only one word. I’m guessing there will be a MySQL function or feature which will allow me to break up the words and do a search. But do I search on keywords or product name? Hmmm
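A rough sketch of the two ideas above, written in Python purely for illustration (the site itself isn't Python). The function names and the `q` parameter name are assumptions; search engines differ in how they pass the query, and a referrer may not carry one at all, in which case the page name is used instead.

```python
import re
from urllib.parse import urlparse, parse_qs


def phrase_from_referrer(referrer):
    """Return the search phrase from a search-engine referrer URL, or None."""
    if not referrer:
        return None
    query = parse_qs(urlparse(referrer).query)
    # Assumption: the engine puts the phrase in a "q" parameter.
    values = query.get("q")
    if values and values[0].strip():
        return values[0].strip()
    return None


def phrase_from_page_name(path):
    """Turn a hyphenated page name into a space-separated phrase."""
    name = path.rsplit("/", 1)[-1]
    name = re.sub(r"\.\w+$", "", name)  # drop the file extension
    # Split on hyphens and skip empty or purely numeric parts (e.g. an ID).
    words = [w for w in name.split("-") if w and not w.isdigit()]
    return " ".join(words)


def search_phrase(referrer, path):
    """Prefer the visitor's own search phrase, else fall back to the page name."""
    return phrase_from_referrer(referrer) or phrase_from_page_name(path)
```

The resulting phrase still has to be split into words before querying, since a single WHERE on the whole phrase would rarely match.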
Thoughts?
Ideaspad - An Award winning information manager for home and professional use
by JM


Did a quick look on Google and found this: http://www.easyclick404.biz/.
Not sure if it’s the kind of thing you’re looking for though.
No, I’m trying to turn unhandled requests (requests for pages I can’t match in the database) into database searches.
I’ve posted a message on ee to see if someone can suggest something.
I need to query a few different fields for each word of the file name, of which there might be maybe 4. Then I want a score to order the results by.
Like those 10 most popular programs on the program pages.
But it’s not the same query.
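One way the scored query could look, as a sketch only. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table name, column names, and weights are all made up for the example. Each word is matched against each field with LIKE, and every match adds that field's weight to the score.

```python
import sqlite3

# Hypothetical columns and weights: a hit in the name counts most.
FIELD_WEIGHTS = {"name": 3, "keywords": 2, "description": 1}


def search(conn, words, limit=10):
    """Return (name, score) rows, best match first, for the given words."""
    if not words:
        return []
    terms, params = [], []
    for word in words:
        for field, weight in FIELD_WEIGHTS.items():
            # One CASE per word/field pair; a match adds the field's weight.
            terms.append(f"(CASE WHEN {field} LIKE ? THEN {weight} ELSE 0 END)")
            params.append(f"%{word}%")
    score = " + ".join(terms)
    sql = (
        f"SELECT name, score FROM "
        f"(SELECT name, {score} AS score FROM programs) AS scored "
        f"WHERE score > 0 ORDER BY score DESC, name LIMIT ?"
    )
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

With only around 4 words and 3 fields this stays a small query. On a large table a full-text index would probably be a better fit than LIKE with leading wildcards.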
I was getting a lot of missing page errors.
With the last upgrades I introduced the PadID into the pagename e.g.
ideaspad-234234.shtml
Somehow I’ve screwed up, and some old pages have this format.
It could be that I’ve deleted old records and new ones have filled the gap.
The program page now does 3 queries.
ideaspad-234234.shtml
ideaspad-.shtml
And the drum roll, this may not seem so amazing but it has stopped a lot of missing pages…
ideaspad*
So it should be interesting to see tomorrow’s log.
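The three lookups could be sketched like this, in Python for illustration. This is one reading of the steps listed above (exact page name, then the name with an empty ID, then anything starting with the program name), and the `pages` dictionary is a hypothetical stand-in for the database table.

```python
import re


def find_program(page_name, pages):
    """Look up a page in three steps; return the record or None.

    pages maps stored page names to records (a stand-in for the database).
    """
    # 1. Exact match, e.g. "ideaspad-234234.shtml".
    if page_name in pages:
        return pages[page_name]

    # Strip the extension and any trailing "-<digits>" to get the program name.
    stem = page_name.rsplit(".", 1)[0]
    slug = re.sub(r"-\d*$", "", stem)

    # 2. Same name with an empty ID, e.g. "ideaspad-.shtml".
    candidate = f"{slug}-.shtml"
    if candidate in pages:
        return pages[candidate]

    # 3. Prefix match, e.g. "ideaspad*"; sorted so the result is repeatable.
    for name in sorted(pages):
        if name.startswith(slug):
            return pages[name]
    return None
```

The prefix step is deliberately loose, so it can return a different program whose name merely starts the same way.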
On a similar note, I’m also missing some program screen dumps. I’ve done something similar here, however I haven’t quite cracked it yet. So I’m now showing a “no image found” picture with green text to make it easier to spot.
Anyways, I’m not actually sure if I need the page anymore. I guess it depends on how many log entries I have tomorrow.
My new enhancements seem to be working. I’m now only getting logged errors from web crawlers who are picking up records which I deleted.
So this is as expected.
I’m not sure whether this accounts for the extra hits that we are getting, but there are quite a few crawler log entries for these pages.
Also, I’m not sure whether I mentioned it anywhere else, but I’ve added a new ‘no image available’ image. It’s green instead of blue text. This means we should have an image but we don’t.
I haven’t yet discovered why.
But I will be working on this soon.
Also, I’ve got someone who’s doing some other housekeeping for me (on another one of my sites) who I intend to have a look round softtester for any oddness or problems.
e.g. I know the release date on the search page is wrong.
Maybe if he finishes soon, he’ll spot what’s wrong with the images.
I haven’t got a quick way to find whether it’s a file name problem, as I have to get a list of all the files in the images folder and compare that with a missing file.
I guess I could write some sort of basic search script in PHP like the one I have for missing pages.
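A minimal sketch of that comparison, in Python rather than PHP and only as an illustration. It assumes the expected file names can be pulled from the database as a list; the function name is made up. For each expected image that isn't in the folder, it reports a file whose name differs only by upper/lower case, if there is one.

```python
import os


def check_images(expected_names, image_dir):
    """Map each missing image name to a case-only near match, or None."""
    actual = os.listdir(image_dir)
    actual_set = set(actual)
    # Index the real files by lowercase name to catch case mismatches.
    by_lower = {name.lower(): name for name in actual}
    report = {}
    for name in expected_names:
        if name in actual_set:
            continue  # file is present under exactly this name
        report[name] = by_lower.get(name.lower())
    return report
```

A near match in the report points to a file name problem, while `None` means the image really isn't in the folder.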
by JM