Improving search engine placement (without breaking the rules)

Search engine optimisation (SEO) has a bad reputation. That’s tough for SEOs but unfortunately it’s a side-effect of black hat SEO techniques.

I haven’t knowingly used any SEO techniques, as this blog is really just a hobby of mine. I enjoy writing for it, find it a good place to store my notes for future reference (which is why there is sometimes detail here that would not be useful to anyone else!) and like the feedback I get when someone else finds my content useful. I’m pleased with my site’s ranking (considering I’ve done very little to boost it, other than to write lots of posts) and, although the advertising revenue will not let me give up my day job yet, it does at least cover the hosting costs. Even so, I’ve been intrigued when reading SEO articles in .net magazine (it seems that SEO is not a black art, just common sense really) and recently I’ve been checking out a few tools and methods which should help anyone to improve the placement of their site (it seems that I’ve been following much of this advice purely by chance):

There are also a few more links that might be useful in some of my previous posts:

Does the world really need another search engine?

Two of London’s free newspapers for commuters (Metro and The London Paper) are featuring wrap-around ads for Microsoft’s Windows Live Search today. The front page is almost entirely blank, save for a search box which asks “Does the world really need another search engine?”

As Google and Yahoo! have once again extended their lead on Microsoft in the search engine rankings and Google has become the most visited website in the UK, I have to wonder if Microsoft should be asking itself the same question. It’s all very well emphasising the extra features that Live Search offers – like controlling the size of the results on a single page, hovering over images for more detail, providing bird’s eye views to accompany maps and directions (all very well for pilots and birds, but not so useful on the ground) and personalising results; however, of all organisations, Microsoft should be well aware that it’s not necessarily the product with the best feature set that gains the most market share. Having said that, Google came from nowhere a few years back – and who uses the pioneering Lycos, Excite and AltaVista search engines today?

Live Search is certainly impressive and Microsoft’s ads state that:

“To us, search is in its infancy. This is just the start.”

Maybe Live Search will push Google into doing some work to integrate its disparate Web 2.0 applications (many of which seem to be in a perpetual beta state); in the meantime, the message seemed to be lost as I observed commuters at Canary Wharf – one of London’s major commercial centres – simply flicking past the four full-page ads to get to the news.

Give Live Search a try at live.com.

Helping spiders to crawl around my bit of the web

A few months back, I blogged that my Google PageRank had fallen through the floor on certain pages. I was also concerned that the Google index only contained about half the content on my website.

I don’t engage in search engine optimisation, but I have found out a few things which have made a huge difference to both the quality of the site and Google’s ability to find my content – mostly from Gina Trapani’s excellent “help the GoogleBot understand your website” article (it’s also worth checking out “9 things you can do to make your web site better”).

So, what did I do? Well, I read that the GoogleBot can’t read JavaScript, which would account for archive pages not being picked up, so I removed the drop-down archive list. It didn’t seem to make much difference, so I’ll be reinstating it soon. What does seem likely is that my archive page links were appearing too far down the page code: another theory is that the GoogleBot only follows the first 100 links on a page (actually, that’s one of the Google webmaster guidelines), and a quick look with the Smart IT Consulting GoogleBot spoofer confirmed that the archive links do indeed appear way down the code. I need to rework the site sometime (better CSS and an improved site layout… though goodness knows when I’ll get the time) and when I do, I’ll move the archive links higher up the code (I’ll need to if I want the site to degrade nicely).

By far the most significant change I made to the site was joining the Google Sitemaps program. I now use the unlimited version of the XML Sitemap Generator to produce a new sitemap each time I write a new post, and Google is finding every one of my pages (there is also a free online sitemap generator available).
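For anyone curious what a sitemap actually contains, here is a minimal Python sketch that builds one by hand. It is not how the XML Sitemap Generator works internally. The sitemaps.org 0.9 namespace, the build_sitemap name and the example.com URLs are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 schema; the post does not say which format was used.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # full address of the page
        ET.SubElement(url, "lastmod").text = lastmod  # date in YYYY-MM-DD form
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages, purely for illustration.
    pages = [
        ("https://www.example.com/", "2006-06-01"),
        ("https://www.example.com/blog/first-post.html", "2006-05-28"),
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

Regenerating the file whenever a post is published, as described above, would simply mean calling the function again with the updated list of pages and saving the result.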

Today, entering site:markwilson.co.uk as a Google search brings back 626 results (up from around 250 in March) and the webstats also show an increase in the number of site visitors, so forget search engine optimisation – just give the spiders a little bit of help to crawl your site.
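The theory about links appearing too far down the page code can also be checked without a spoofing tool. The following is a rough Python sketch, not the method the Smart IT Consulting tool uses. It lists the position of each matching link in source order, so anything beyond the 100-link figure mentioned earlier stands out. The link_positions name, the "/archive/" marker and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in the order it appears in the source."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that have no href attribute
                self.links.append(href)


def link_positions(html, marker):
    """Return the 1-based positions of links whose href contains `marker`."""
    collector = LinkCollector()
    collector.feed(html)
    return [pos for pos, href in enumerate(collector.links, start=1) if marker in href]


if __name__ == "__main__":
    # Hypothetical page: 120 post links followed by a single archive link.
    sample = "".join(f'<a href="/post-{n}.html">post</a>' for n in range(120))
    sample += '<a href="/archive/2006-03.html">March 2006</a>'
    too_late = [pos for pos in link_positions(sample, "/archive/") if pos > 100]
    print("Archive links beyond the first 100:", too_late)
```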

Why have some of my PageRanks dropped?

It’s well known that the Google index is based on the PageRank system, which can be viewed using the Google Toolbar.

But something strange has happened on this blog – the main blog entry page has a PageRank of 5, the parent website has a PageRank of 4, but the PageRanks for most of the child pages have dropped to zero.

Now I know that posts have been a bit thin on the ground this month (I’ve been busy at work, as well as working on another website), but I can’t understand why the rankings have dropped. I found this when I was using the site search feature to find something that I knew I’d written, but it didn’t come up. Entering site:markwilson.co.uk as a Google search brings back 258 results, but this blog has nearly 500 entries, plus archive pages and the parent website – where have all the others gone? Some recent entries, like the one on Tesco’s VoIP Service, have a PageRank of zero but still come back on a search (at the time of writing, searching for Tesco VOIP brings back my blog as the third listed entry). Others just don’t appear in search results at all. Meanwhile some old posts have PageRanks of 2 or 3.

I know (from my website statistics) that Googlebot is still dropping by every now and again. So far this month it accounts for 3319 hits from at least 207 visits – I just can’t figure out why so many pages have a PageRank of zero (which seems to be a penalty rank, rather than a “not ranked yet” marking).

I don’t deliberately try to manipulate my search rankings, but steady posting of content has seen my PageRank rise to a reasonable level. I just don’t understand why my second-level pages are not appearing in the index. The only thing I can think of is that it’s something to do with my new markwilson.it domain, which is linked from this blog, and which redirects back to a page at markwilson.co.uk (but that page has no link to the blog at the time of writing).

I’ve just checked the syntax of my robots.txt file (and corrected some errors, but they’ve been there for months if not years). I’ve also added rel="nofollow" to any links to the markwilson.it domain. Now, I guess I’ll just have to resubmit my URL to Google and see what happens…
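One way to sanity-check a robots.txt file is to ask a parser which URLs a given crawler would be allowed to fetch. The sketch below uses Python’s standard urllib.robotparser module. It is only an illustration: the rules, the URLs and the check_urls helper are hypothetical, and the parser quietly ignores lines it does not understand rather than reporting syntax errors.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, user_agent, urls):
    """Map each URL to True or False: may `user_agent` fetch it under these rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


if __name__ == "__main__":
    # Hypothetical rules and URLs, purely for illustration.
    rules = "\n".join([
        "User-agent: *",
        "Disallow: /private/",
    ])
    urls = [
        "http://www.example.com/blog/",
        "http://www.example.com/private/notes.html",
    ]
    for url, ok in check_urls(rules, "Googlebot", urls).items():
        print(url, "allowed" if ok else "blocked")
```

If a page that should be indexed shows up as blocked, the Disallow rules are the first place to look.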