Helping spiders to crawl around my bit of the web

A few months back, I blogged that my Google PageRank had fallen through the floor on certain pages. I was also concerned that the Google index only contained about half the content on my website.

I don’t engage in search engine optimisation, but I have found out a few things which have made a huge difference to both the quality of the site and Google’s ability to find my content. Most of them came from Gina Trapani’s excellent article, “Help the GoogleBot understand your website” (it’s also worth checking out “9 things you can do to make your web site better”).

So, what did I do? Well, I read that the GoogleBot can’t read JavaScript, which would account for archive pages that were not being picked up, so I removed the drop-down archive list. It didn’t seem to make much difference, though, so I’ll be reinstating it soon.

What does seem likely is that my archive page links were appearing too far down the page code. Another theory is that the GoogleBot only follows the first 100 links on a page (actually, that’s one of the Google webmaster guidelines), and a quick look with the Smart IT Consulting GoogleBot spoofer confirmed that the archive links do indeed appear way down the code. I need to rework the site sometime (better CSS and an improved site layout… though goodness knows when I’ll get the time) and when I do, I’ll set the archive links higher up the code (I’ll need to if I want the site to degrade nicely).

By far and away the most significant change I made to the site was joining the Google Sitemaps program. I now use the unlimited version of the XML Sitemap Generator to produce a new sitemap each time I write a new post, and Google is finding every one of my pages (there is also a free online sitemap generator available).
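For anyone curious what a sitemap actually contains, here is a minimal sketch in Python. It is not how the XML Sitemap Generator works internally (I don’t know that); it just assumes the sitemaps.org 0.9 format, and the `build_sitemap` function and the example page addresses are made up for illustration.

```python
from xml.etree import ElementTree as ET

# Namespace assumed from the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    last_modified is a date string such as "2006-06-01", or None to
    leave the optional <lastmod> element out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if last_modified:
            ET.SubElement(entry, "lastmod").text = last_modified
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages, purely for illustration.
    example_pages = [
        ("http://www.markwilson.co.uk/", "2006-06-01"),
        ("http://www.markwilson.co.uk/blog/archive/", None),
    ]
    print(build_sitemap(example_pages))
```

The point is simply that every page gets its own `<url>` entry, so a spider no longer depends on finding a link to it somewhere in the page code.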

Today, entering site:markwilson.co.uk as a Google search brings back 626 results (up from around 250 in March) and the webstats also show an increase in the number of site visitors, so forget search engine optimisation – just give the spiders a little bit of help to crawl your site.

markwilson.it

get-info -class technology | write-output > /dev/web
