More on Technorati Problems

I think I may finally have found out why Technorati is having trouble spidering my site. I was planning to send them another message asking why they haven’t yet answered my first post when I noticed the following bit of information from http://technorati.com/help/publishers.html:

How can I produce better content for indexing?

Help our spiders find your content in its entirety by outputting valid markup for your web pages and feeds.

The W3C Markup Validation Service will help you identify and correct markup errors on your site. Modern spiders can work around some errors but it is best to provide as little work as possible to index your site and display all of your content the way it is meant to be displayed.

Technorati also indexes your feed to retrieve additional information and discern document structure. Feed Validator will help you verify the markup of your site’s syndication feeds and identify possible points of improvement.

I decided to check out both the Feed Validator and the W3C Markup Validation Service pages to see whether the problem might be coming from my end. As it turns out, it just might be.

From the W3C Markup Validation Service I got this result for Gnorb.NET’s home page. Nasty, nasty stuff.

Needless to say, “ouch!” Talk about screwing up a perfectly good Web page. The results from Feed Validator weren’t any better. Heck, the tool couldn’t even recognize my RSS feed as a feed! Talk about something that’ll ruin your chances of getting indexed!
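For the curious, here is a rough idea of what "not recognized as a feed" can mean. This is only a minimal Python sketch of my own, not how Feed Validator actually works: it assumes that a feed must at least be well-formed XML whose root element is `rss` (RSS 2.0) or an Atom 1.0 `feed`. The function name `sniff_feed` and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 feeds; ElementTree reports it as part of the tag.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def sniff_feed(xml_text):
    """Return 'rss', 'atom', or None if the text doesn't look like a feed.

    Hypothetical helper: a crude first check, far weaker than real validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # not even well-formed XML
    if root.tag == "rss":
        return "rss"
    if root.tag == ATOM_NS + "feed":
        return "atom"
    return None  # well-formed XML, but not a feed root we recognize


if __name__ == "__main__":
    good = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
    html = "<html><body>oops, an HTML page</body></html>"
    broken = "<rss><channel>"  # unclosed tags
    for doc in (good, html, broken):
        print(sniff_feed(doc))
```

If something this simple returns `None` for a feed URL's content, a real validator (and probably a spider) will have an even harder time with it.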

Although that first issue is mostly my fault (though it looks like there are a lot of Amazon partner issues I can’t really do much about), the second looks to be a problem with WordPress itself. Maybe it’s time I find an Atom feed plugin that’ll produce nice, clean code.

Looks like I have a lot of work this weekend. Between that, SEF’ing the URLs, creating sub-domains, moving all content from my other system blogs to this one and only blog (big task, but worth it, I think, especially since my other stuff isn’t really being spidered), and fixing phpBB, yeah, that’s a lot of stuff to get done.
