Love the PC – hate the technical support

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I love my IBM ThinkPad T40 – it’s easily the most solidly built of my three notebook PCs and, whilst my everyday PC is a much more highly specified Fujitsu-Siemens Lifebook S7010D, the ThinkPad is my machine of choice.

Unfortunately, a few weeks back, I accidentally deleted the hidden protected area (HPA) on my ThinkPad (also known as the Access IBM pre-desktop area).

My first experience of IBM’s technical support was great – once they had confirmed that the machine was in warranty, they were happy to send me recovery CDs free of charge – but since then things have not been good. Even my less-than-satisfactory experiences of Dell and CA support via e-mail from India were better than my current experience of IBM. All I could get from IBM hardware support was a statement that the restore CD should bring back the pre-desktop area (it doesn’t) and a referral to the software support line. Therein lies the problem (explained via an e-mail from an obscure e-mail address that fell foul of Outlook’s junk e-mail filters) – IBM provides free hardware support during the computer’s warranty period and free software support for the first 30 days after the purchase of the computer, after which software support becomes chargeable. Fair enough for operating system support, but for an IBM technology accessed via a hardware function key? My last e-mail asked them to clarify whether they consider a partition provided on the hard disk to be hardware or software. No response (although I suspect I know the answer to that one).

Surely it’s not unusual for a hard disk to be replaced in an IBM PC and for the Access IBM pre-desktop area to be restored? Grrr.

Turn off your PC at night and save the planet (well, at least the English countryside and some cash)

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I was interested to hear the following information in a presentation by Microsoft UK’s James O’Neill this afternoon:

  • A single PC draws around 125W of power when running (but only 5W in sleep mode).
  • Running that PC for 50 hours a week (instead of 24×7) saves 120W (0.12kW) × 6,160 hours ≈ 740kWh per year (see the worked example below).
  • Generating 740kWh of electricity represents about 1/3 tonne of carbon dioxide (CO2) per PC per year.
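
If you want to check the arithmetic, here’s a quick back-of-an-envelope calculation in Python (the CO2 and cost factors are simply derived from the figures quoted in this post – roughly 0.45kg of CO2 and 6p per kWh – rather than being official conversion rates):

# Rough check of the PC power-saving figures quoted above.
RUNNING_W = 125                      # power drawn when the PC is on
SLEEP_W = 5                          # power drawn in sleep mode
HOURS_PER_YEAR = 365 * 24            # 8,760 hours
HOURS_IN_USE = 50 * 52               # 2,600 hours (50 hours a week)

hours_asleep = HOURS_PER_YEAR - HOURS_IN_USE                # 6,160 hours
saving_kwh = (RUNNING_W - SLEEP_W) / 1000 * hours_asleep    # ~739 kWh
co2_tonnes = saving_kwh * 0.45 / 1000                       # ~0.33 tonnes
cost_gbp = saving_kwh * 0.06                                # ~£44

print(f"Energy saved: {saving_kwh:.0f} kWh/year")
print(f"CO2 avoided: {co2_tonnes:.2f} tonnes/year")
print(f"Money saved: £{cost_gbp:.0f}/year")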

Maybe if we all turned off our PCs at night we wouldn’t need to fill the English countryside with wind turbines.

Oh yes – in case you don’t care about global warming, 740kWh of electricity costs around £45 a year [source: my domestic electricity bill from Powergen].

Running another operating system on a Mac

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Since Apple switched to using Intel processors for certain Macintosh models, I’ve been excited by the possibility of running Windows on a Mac. Some say it’s sacrilege. I say it’s sensible. I love the Apple hardware, but am not a fan of the software, which (in my opinion) is proprietary and expensive. I also know Windows very well (including how to keep it secure). Ideally, I’d have a Mac Mini, dual-booting a major Linux distribution and Windows XP.

There have been various reports of people who have managed to write an EFI boot loader for Windows on a “MacIntel”, as well as reports of those who have turned their systems into an unbootable and unsupported heap of PC components in the process; but Apple provided me with a nice birthday present earlier this month by announcing Boot Camp – software to allow dual-booting of OS X and Windows XP, including driver support.

I’m not quite ready to switch yet – Boot Camp is still a beta and the final release will be included in the next version of OS X (meaning I’ll have to shell out another wad of cash to upgrade to OS X Leopard before I can use a release version of the Boot Camp technology). I’m also wary of first-generation MacIntel hardware and would like to see support for Windows XP Media Center Edition, so I guess I’ll be watching this space for a little longer.

In the meantime, these links provide really useful information on the progress of Windows on a Mac:

For Mac users who fancy using Linux, there are some PowerPC Linux distros (like Yellow Dog Linux) and if you’re not convinced as to why you might want to use them (after all, isn’t OS X just another Unix operating system anyway?) I recommend Giles Turnbull’s article entitled why install Linux on your Mac? Then there’s the Mactel-Linux project to adapt Linux to MacIntel hardware as well as reports that Red Hat plan to include Intel-based Mac support in Fedora and a variety of sites claiming to have other distros working too. Whilst it sounds a bit of a mess (chain-loading LILO via NTLDR), there’s also a triple-boot solution (OS X/XP/Linux) using Boot Camp (from the OnMac guys).

Finally, for those who want to play this the other way around and run OS X on a PC, there’s the OSx86 project.

Why webstats are so interesting

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I’ve been writing this blog for a couple of years now. With over 500 posts, it’s consumed a scary amount of my time, but at least it’s finally something useful to do with the markwilson.co.uk domain that I first registered back in the late ’90s when I was thinking of leaving my job and working as a freelance IT contractor!

Over time I’ve tried to move towards a standards-compliant website, with lots of information that people find useful. I’ve still got some way to go – not being a developer, my code is not as standards-compliant as I’d like it to be (although the website that I have been working on recently with my buddy Alex should soon be pretty much there from a CSS and XHTML standpoint) and the usefulness of the content is totally subjective (but the blog started out as a dumping ground for my notes and that’s still its primary purpose – if others find it useful then that’s great and the trickle of Google AdSense/PayPal revenue is always welcome).

From time to time I look at the website statistics (webstats) for the site and always find them an interesting read. I can’t claim to be an expert in search engine optimisation (nor do I want to be) but the Webalizer webstats that my ISP provides are great because they let me see:

  • How many hits I’m getting (not surprisingly I get more hits after I post new articles and fewer when I’m busy with work or other projects) on a monthly, daily or hourly basis.
  • The HTTP response codes that Apache dishes out (200s are good, 404s are bad).
  • The top 30 URLs that are being hit (not surprisingly /blog is number 1, but it also helps to see pages that account for lots of bandwidth but not much traffic – the ones where maybe I should be looking at optimising the code).
  • Entry and exit pages (there’s a big correlation between these two, so obviously I’m not encouraging enough browsing of the site).
  • Where people visit from (mostly crawlers, although unfortunately I can see how the stats are skewed by my own broadband connection at number 18 because I use the site so much to look things up for myself).
  • Who is referring visitors to me.
  • What people are looking for when they get referred here.
  • What browser people are using.
  • Which countries people are visiting from.

This information lets me understand which pages are most popular as well as highlighting technical issues with the site but it doesn’t always go far enough.
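
As an aside, most of what Webalizer reports is simply an aggregation of the Apache access log. Here’s a minimal sketch of the idea in Python (it assumes the common/combined log format and a file called access.log, so adjust to suit):

# Count HTTP status codes and the most-requested URLs in an Apache access log.
# Assumes the common/combined log format, e.g.:
# 1.2.3.4 - - [01/Apr/2006:10:00:00 +0100] "GET /blog/ HTTP/1.1" 200 12345 ...
import re
from collections import Counter

REQUEST = re.compile(r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

statuses, urls = Counter(), Counter()
with open("access.log") as log:
    for line in log:
        match = REQUEST.search(line)
        if match:
            statuses[match.group("status")] += 1
            urls[match.group("url")] += 1

print("Status codes:", statuses.most_common())   # 200s are good, 404s are bad
print("Top 10 URLs:", urls.most_common(10))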

Some time ago, I applied for a Google Analytics (formerly Urchin) account and last week I finally set it up. Whilst the Webalizer stats are still useful in many ways for me as a website administrator, the Google Analytics information is much richer. For example, I no longer need my ClustrMaps map because I can see a geomap along with my pages per visit ratio, how many visitors return and who sends them here. For marketeers there are tools to track campaigns and see how they are progressing, and I can also find a whole load of technical information about my visitors (e.g. connection speed used, browser, platform, Java and Flash capabilities, language, screen colours and resolution – all of which can help in decisions as to what features should be incorporated in future). There’s also information about how long visitors spent viewing a particular page (in fact there are so many reports that I can’t list them all here).

So, what have I learned from all of this – well, from Google Analytics I can see that most of you have a broadband connection, are using Windows (94%), IE (65%, vs. 29% for Firefox), view the site in 32-bit colour and have a screen resolution of 1024×768. That means that most of you should be able to see the content as I intended. I also know that people tend to visit a single page and then leave the site and that Google is my main referrer. Webalizer tells me that Apache gave a strange error 405 to someone this month (somebody obviously tried to do something they shouldn’t be trying to do) but also some 404s (so maybe I have some broken links to investigate). I can also tell that (for the IP addresses that could be resolved) most of my visitors were from Western Europe or the United States but hello to everyone who has visited this month from Australia, China, India, Japan, Malaysia, New Zealand, Pakistan, Saudi Arabia, Singapore, South Africa, South Korea, Thailand, and the United Arab Emirates.

I hope this has illustrated how website statistics can be useful, even for small-time website operators like me and I encourage you to check out Webalizer (which reads Apache web server log files) and Google Analytics (which needs some JavaScript to be added to the website code). Alternatives (e.g. for IIS users) include AWstats and Christopher Heng also has a list of free web statistics and web log analysers on his site.

Introduction to blogging

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

The chances are that if you’re reading this, you already know what a blog is. You may even know about RSS (or Atom). But for anyone who’s just stumbled across this site, Microsoft MVP Sandi Hardmeier has published a Blogging 101 that’s a really good introduction to what it’s all about.

Deleting files with CRC errors in Windows XP

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I just fixed a little problem on my Windows XP laptop… I had a file which I could not delete (even after a reboot) and each time I tried, the error returned was:

Cannot delete filename: Data Error (Cyclic Redundancy Check)

Various Internet sites suggested rebooting in safe mode and removing the file – that didn’t work but chkdsk /r located the bad disk sectors and recovered the data. Once this was complete, I successfully removed the file.

If you have to do this, be ready for the chkdsk process to take a while.

Microsoft sets virtualisation free

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Occasionally I blog about IT news items that interest me but I can’t cover everything (or even everything in my field of interest) due to time constraints. One thing I didn’t mention when the news broke a few weeks back was Microsoft’s release of Virtual Server 2005 R2 as a free download. This follows on from Microsoft’s licensing changes for Windows Server 2003 R2 Enterprise Edition and VMware’s move to make VMware Server (formerly VMware GSX) a free of charge product.

Interestingly, Microsoft has also released virtual machine additions for certain Linux distributions, which I feel is a real sign that Virtual Server is ready to take on VMware Server (don’t compare Virtual Server with Virtual PC – despite their virtual machine compatibility the two products are worlds apart). I’m not saying that Virtual Server is best for every situation – in many ways the VMware products are more mature – but Virtual Server is a serious option for those organisations running predominantly Microsoft environments.

We can also expect to see Virtual Server 2005 R2 service pack 1 released in early 2007 (a beta is due later this year), providing support for virtualisation in hardware. Further out, virtualisation software will move into the operating system within the Longhorn Server timeframe (along with Microsoft finally releasing a competitor to VMware ESX Server – codenamed Viridian).

Restoring the Windows XP master boot record after removing Linux

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

A few weeks back, I blogged about my problems installing Linux on an IBM ThinkPad. Because I’d like to get the Access IBM predesktop area back (and then install Linux so the system will dual-boot with Windows XP), I used the recovery CDs that IBM sent me (free of charge as the system is under warranty).

Initially, recovery failed due to a lack of free space, so I deleted the existing partition (using an MS-DOS boot disk and fdisk) before attempting recovery once more. This time the files were copied to the hard disk but after rebooting, I was greeted with a GRUB error:

GRUB Loading stage1.5…

GRUB loading, please wait…
Error 22

GRUB error 22 means “no such partition” – basically I needed to restore the Windows XP master boot record.

To do this, I booted the system from a Windows XP CD, waited for the files to be loaded into memory, then selected R for the recovery console, selected my Windows XP installation and entered the administrator password.

Once inside the Windows XP recovery console, I tried the fixboot command. This didn’t seem to make any difference on reboot, so I tried again with fixmbr. After another reboot, Windows XP was up and running (some Internet sites suggest fdisk /mbr but that’s not a recovery console command under Windows XP).

Unfortunately I still haven’t managed to restore the Access IBM predesktop area (all IBM say is “it should have been restored by the restore CDs”) – if I ever manage to resolve that one, I’ll post the results here.

Maximising Active Directory performance and replication troubleshooting

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Whenever I see that John Craddock and Sally Storey (from Kimberry Associates) are presenting a new seminar on behalf of Microsoft I try to attend – not just because the content is well presented, but because of the relevance of the material (some of which would otherwise involve wading through reams of white papers to find) and, most importantly, because, unlike most Microsoft presentations, there’s hardly any marketing material in there.

Last week, I saw John and Sally present on maximising Active Directory (AD) performance and in-depth replication troubleshooting. With a packed day of presentations there was too much good information (and detail) to capture in a single blog post, but what follows should be of interest to anyone looking to improve the performance of their AD infrastructure.

At the heart of Active Directory is the local security authority (LSA – lsass.exe), responsible for running AD as well as:

  • Netlogon service.
  • Security Accounts Manager service.
  • LSA Server service.
  • Secure sockets layer (SSL).
  • Kerberos v5 authentication.
  • NTLM authentication.

From examining this list of vital services, it becomes clear that tuning the LSA is key to maximising AD performance.

Sizing domain controllers (DCs) can be tricky. On the one hand, a DC can be very lightly loaded but at peak periods, or on a busy infrastructure, DC responsiveness will be key to the overall perception of system performance. The Windows Server 2003 deployment guide provides guidance on sizing DCs; however it will also be necessary to monitor the system to evaluate performance, predict future requirements and plan for upgrades.

Performance Monitor is a perfectly adequate tool, but running it can affect performance significantly, so event tracing for Windows (ETW – as described by Matt Pietrek) was developed for use on production systems. ETW uses a system of providers to pass events to event tracing sessions in memory. These event tracing sessions are controlled by one or more controllers, logging events to files which can be played back to consumers (alternatively, the consumers can operate real-time traces). Microsoft Server Performance Advisor is a free download which makes use of ETW, providing a set of predefined collectors.

Some of the basic items to look at when considering DC performance are:

  • Memory – cache the database for optimised performance.
  • Disk I/O – the database and log files should reside on separate physical hard disks (with the logs on the fastest disks).
  • Data storage – do applications really need to store data in Active Directory or would ADAM represent a better solution?
  • Code optimisation – how do the directory-enabled applications use and store their data; and how do they search?

In order to understand search optimisation, there are some technologies that need to be examined further:

  • Ambiguous name resolution (ANR) is a search algorithm used to match an input string to any of the attributes defined in the ANR set, which by default includes givenName, sn, displayName, sAMAccountName and other attributes.
  • Medial searches (*string*) and final string searches (*string) are slow. Windows Server 2003 (but not Windows 2000 Server) supports tuple indexing, which improves performance when searching for final character strings of at least three characters in length. Unfortunately, tuple indexing should be used sparingly because it degrades performance when the indexes are updated.
  • Optimising searches involves defining a correct scope for the search, ensuring that regularly-searched attributes are indexed, basing searches on object category rather than class, adjusting the ANR set as required and indexing for container and medial searches where required (see the example search after this list).
  • Logging can be used to identify expensive (more than a defined number of entries visited) and inefficient (searching a defined number of objects returns less than 10% of the entries visited) searches using the 15 Field Engineering value in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics (as described in Microsoft knowledge base article 314980), combined with HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Expensive Search Results Threshold (default is 10000) and HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Inefficient Search Results Threshold (default is 1000).
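
To make that search-optimisation advice a little more concrete, here’s a minimal sketch using the third-party Python ldap3 library (the server name, credentials and search base are made up for illustration) – it scopes the search to a single OU, filters on objectCategory rather than objectClass, relies on the indexed ANR set for the name match and only asks for the attributes it needs:

# Hypothetical "efficient" AD search: narrow scope, objectCategory rather than
# objectClass, ANR for the name match and a minimal attribute list.
# Requires the third-party ldap3 package; all names below are placeholders.
from ldap3 import Server, Connection, SUBTREE

server = Server("dc01.example.com")
conn = Connection(server, user="EXAMPLE\\svc-query", password="********", auto_bind=True)

conn.search(
    search_base="OU=Staff,DC=example,DC=com",               # correct scope, not the whole forest
    search_filter="(&(objectCategory=person)(anr=mark))",   # indexed attributes only
    search_scope=SUBTREE,
    attributes=["displayName", "mail"],                     # just the attributes required
)

for entry in conn.entries:
    print(entry.displayName, entry.mail)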

However performant the directory is at returning results, the accuracy of those results is dependent upon replication (the process of making sure that the directory data is available throughout the enterprise).

The AD replication model is described as multimaster (i.e. changes can be made on any DC), loosely consistent (i.e. there is latency between changes being made and their availability throughout the enterprise, so it is impossible to tell if the directory is completely up-to-date at any one time – although urgent changes will be replicated immediately) and convergent (i.e. eventually all changes will propagate to all DCs, using a conflict resolution mechanism if required).

When considering replication, a topology and architecture need to be chosen that match performance with available bandwidth, minimise latency, minimise replication traffic and respond appropriately to loosely connected systems. In order to match performance with available bandwidth and provide efficient replication, it is important to describe the network topology to AD:

  • Sites are islands of good connectivity (once described as LAN-speed connections, but more realistically areas of the network where replication traffic has no negative impact).
  • Subnet objects are created to associate network segments with a particular site, so that a Windows client can locate the nearest DC for logons, directory searching and DFS paths (known as site or client affinity).
  • Site links characterise available bandwidth and cost (in whatever unit is chosen – money, latency, or something else).
  • Bridgehead servers are nominated (for each NC) to select the DC used for replication in and out of a site when replicating all NCs between DCs and replicating the system volume between DCs.

A communications transport is also required. Whilst intrasite communications use RPC, intersite communications (i.e. over site links) can use IP (RPC) for synchronous inbound messaging or SMTP for asynchronous messaging; however, SMTP cannot be used for the domain NC, making SMTP useful when there is no RPC connection available (e.g. firewall restrictions) but also meaning that RPC is required to build a forest. Intersite communications are compressed by default but Windows 2000 Server and Windows Server 2003 use different compression methods – the Windows Server 2003 version is much faster, but does not compress as effectively. In reality this is not a problem as long as the link speed is 64Kbps or greater, but there are also options to revert to Windows 2000 Server compression or to disable compression altogether.

Site link bridges can be used to allow transitive replication (where a DC needs to replicate with its partner via another site). The default is for all site links to be bridged; however there must be a valid schedule on both portions of the link in order for replication to take place.

The knowledge consistency checker (KCC), which runs every 15 minutes by default, is responsible for building the replication topology, based on the information provided in the configuration container. The KCC needs to know about:

  • Sites.
  • Servers and site affinity.
  • Global catalog (GC) servers.
  • Which directory partitions are hosted on each server.
  • Site links and bridges.

For intrasite replication, the KCC runs on each DC within a site and each DC calculates its own inbound replication partners, constructing a ring topology with dual replication paths (for fault tolerance). The order of the ring is based on the numerical value of the DSA GUID and the maximum hop count between servers is three, so additional optimising connectors are created where required. Replication of a naming context (NC) can only take place via servers that hold a copy of that NC. In addition, one or more partial NCs will need to be replicated to a GC.

Because each connection object created by the KCC defines an inbound connection from a specific source DC, the destination server needs to create a partnership with the source in order to replicate changes. This process works as follows:

  1. The KCC creates the required connections.
  2. The repsFrom attribute is populated by the KCC for all common NCs (i.e. the DC learns about inbound connections).
  3. The destination server requests updates, allowing the repsTo attribute to be populated at the source.

The repsTo attribute is used to send notifications for intrasite replication; however all DCs periodically poll their partners in case any changes are missed, based on a schedule defined in the NTDS Settings object for the site and a system of notification delays to avoid all servers communicating changes at the same time.

Windows 2000 Server stores the initial notification delay (default 5 minutes) and subsequent notification delay (default 30 seconds) in the registry, whereas Windows Server 2003 stores the initial notification delay (default 15 seconds) and subsequent notification delay (default 3 seconds) within AD (although individual DCs can be controlled via registry settings). This is further complicated by the fact that Windows 2000 Server DCs upgraded to Windows Server 2003 and still running at the Windows 2000 forest functional level will use the Windows 2000 timings until the forest functional level is raised to Windows Server 2003. This means that, with a three-hop maximum between DCs, the maximum time taken to replicate changes from one DC within a site to another is 45 seconds for Windows Server 2003 forests and 15 minutes for Windows 2000 Server forests (based on three times the initial notification delay).
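
As a quick sanity check of those worst-case figures (taking the three-hop maximum and the default initial notification delays quoted above, and ignoring the time taken to actually transfer the changes):

# Worst-case intrasite notification latency = maximum hops x initial notification delay.
MAX_HOPS = 3

for platform, initial_delay_seconds in (("Windows Server 2003", 15), ("Windows 2000 Server", 5 * 60)):
    worst_case = MAX_HOPS * initial_delay_seconds
    print(f"{platform}: {worst_case} seconds ({worst_case / 60:g} minutes)")

# Windows Server 2003: 45 seconds (0.75 minutes)
# Windows 2000 Server: 900 seconds (15 minutes)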

For certain events, a process known as urgent replication (with no initial notification delay) is invoked. Events that trigger urgent replication are:

  • Account lockout.
  • Changes to the account lockout policy or domain password policy.
  • Changes to the computer account.
  • Changes to domain trust passwords.

Some changes are immediately sent to the PDC emulator via RPC (a process known as immediate replication) and most of these changes also trigger urgent replication:

  • User password changes.
  • Account lockout.
  • Changes to the RID master role.
  • Changes to LSA secrets.

For intersite replication, one DC in each site is designated as the inter-site topology generator (ISTG). The KCC runs on the ISTG to calculate the inbound site connections and the ISTG automatically selects bridgehead servers. By default, notifications are not used for intersite replication (which relies on a schedule instead); however it is also possible to create affinity between sites with a high bandwidth backbone connection by setting the least significant bit of the site link’s option attribute to 1.

Be aware that schedules are displayed in local time, so if configuring schedules across time zones, the time in one site will not match the time in the other. Also be aware that deleting a connector may orphan a DC (e.g. if replication has not completed fully and the DC has insufficient knowledge of the topology to establish a new connection).

Once the replication topology is established, a server needs to know what information to replicate to its partners. This needs to cover:

  • Just the data that has been changed.
  • All outstanding changes (even if a partner has been offline for an extended period).
  • Alternate replication paths (without data duplication).
  • Conflict resolution.

Each change to directory data is recorded as an update sequence number (USN), written to the metadata for each individual attribute or link value. The USN is used as a high watermark vector for each inbound replication partner (and each NC), identified by their DSA GUID. The source server will send all changes that have a higher USN. Because replication works on a ring topology, a process is required to stop unnecessary replication. This is known as propagation dampening and relies on another value called the up-to-dateness vector (one for each DC where the information originated). This is used to ensure that the source server does not send changes that have already been received. The highest committed USN attribute holds the highest USN used on a particular server.
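
To make the propagation dampening idea a little more concrete, here’s a deliberately simplified toy model in Python (it conflates the high watermark and up-to-dateness vectors, ignores conflict resolution and uses DC names in place of invocation IDs) showing how the up-to-dateness vector stops a change that has travelled around the ring from being sent back to the DC where it originated:

# Toy model of propagation dampening: each DC tracks the highest originating USN
# it has seen per originating DC, and a source only sends changes the destination
# has not yet seen. This is a sketch, not how the AD code actually works.
from dataclasses import dataclass, field

@dataclass
class Change:
    originating_dc: str
    originating_usn: int
    attribute: str
    value: str

@dataclass
class DomainController:
    name: str
    local_usn: int = 0
    changes: list = field(default_factory=list)
    utd_vector: dict = field(default_factory=dict)   # originating DC -> highest USN seen

    def originate_change(self, attribute, value):
        self.local_usn += 1
        self.changes.append(Change(self.name, self.local_usn, attribute, value))
        self.utd_vector[self.name] = self.local_usn

    def pull_from(self, source):
        """Destination requests updates; source sends only what the destination lacks."""
        sent = 0
        for change in source.changes:
            if change.originating_usn > self.utd_vector.get(change.originating_dc, 0):
                self.changes.append(change)
                self.utd_vector[change.originating_dc] = change.originating_usn
                sent += 1
        print(f"{source.name} -> {self.name}: {sent} change(s) sent")

# Three DCs replicating in a ring: DC1 -> DC2 -> DC3 -> DC1
dc1, dc2, dc3 = DomainController("DC1"), DomainController("DC2"), DomainController("DC3")
dc1.originate_change("description", "hello")

dc2.pull_from(dc1)   # 1 change sent
dc3.pull_from(dc2)   # 1 change sent
dc1.pull_from(dc3)   # 0 changes sent - dampened, because DC1 originated the change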

It is possible for the same attribute to be simultaneously updated at multiple locations, so each DC checks that the replicated change is “newer” than the information it holds before accepting a change. It determines which change is more up-to-date based on the replica version number, then the originating time stamp, and finally the originating invocation ID (as a tie-break).
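
As a rough illustration of that ordering, a Python tuple comparison does the same job (the stamps below are invented, and a real originating invocation ID is a GUID compared as a binary value rather than a string):

# Conflict resolution sketch: highest version wins, then the most recent
# originating timestamp, then the originating invocation ID as a tie-break.
# Python compares tuples element by element, which matches that ordering.
from datetime import datetime, timezone

# (version, originating timestamp, originating invocation ID)
local_stamp      = (3, datetime(2006, 4, 1, 10, 0, tzinfo=timezone.utc), "aaaa-dc01")
replicated_stamp = (3, datetime(2006, 4, 1, 10, 5, tzinfo=timezone.utc), "bbbb-dc02")

if replicated_stamp > local_stamp:
    print("Accept the replicated change")   # its stamp is "newer"
else:
    print("Keep the local value")           # the replicated change is discarded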

Other replication issues include:

  • If an object is added to or moved to a container on one DC while that container is deleted on another DC, then the object will be placed in the LostAndFound container.
  • Adding or moving objects on different DCs can result in two objects with the same distinguished name (DN). In this case the newer object is retained and the other object name is appended with the object GUID.

It’s worth noting that significant reductions in replication traffic can be achieved when running in Windows Server 2003 forest functional mode, because changes to multi-valued attributes (e.g. group membership) are replicated at the value level rather than the attribute level. Not only does this reduce replication traffic, but it allows groups to be created with more than 5000 members and avoids data loss when a group’s membership is edited on multiple DCs within the replication latency period.

If this has whetted your appetite for tuning AD (or if you’re having problems troubleshooting AD) then I recommend that you check out John and Sally’s Active Directory Forestry book (but beware – the book scores “extreme” on the authors’ own “geekometer scale”).

The rise of 64-bit Windows server computing

This content is 20 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

A few months back, when Microsoft announced that many of its forthcoming server products would be 64-bit only, I confessed that I don’t know a lot about 64-bit computing. That all changed last week when I attended two events (the first sponsored by Intel and the second by AMD) where the two main 64-bit platforms were discussed and Microsoft’s decision to follow the 64-bit path suddenly makes a lot of sense.

For some time now, memory has been a constraint on scaling-up server capacity. The 32-bit x86 architecture that we’ve been using for so long now is limited to a maximum of 4GB of addressable memory (2³² = 4,294,967,296) – once a huge amount of memory but not so any more. Once you consider that on a Windows system, half of this is reserved for the system address space (1GB if the /3GB switch is used in boot.ini) the remaining private address space for each process starts to look a little tight (even when virtual memory is considered). Even though Windows Server 2003 brought improvements over Windows 2000 in its support for the /3GB switch, not all applications recognise the extra private address space that it provides and the corresponding reduction in the system address space can be problematic for the operating system. Some technologies, like physical address extension (PAE) and address windowing extensions (AWE) can increase the addressable space to 64GB, but for those of us who remember trying to squeeze applications into high memory back in the days of MS-DOS, that all sounds a little bit like emm386.exe!
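
As a rough guide to the scale of the change, the raw address space for a given number of address bits is just a power of two (these are theoretical maxima – the limits imposed by specific processors and Windows editions are considerably lower in practice):

# Theoretical address space for a given number of address bits (a power of two).
# Real processor and operating system limits are lower than the 64-bit maximum.
for bits in (32, 36, 40, 48, 64):
    gigabytes = 2 ** bits / 2 ** 30
    print(f"{bits}-bit: {gigabytes:,.0f} GB")

# 32-bit: 4 GB, 36-bit (PAE): 64 GB, 40-bit: 1,024 GB (1TB),
# 48-bit: 262,144 GB (256TB), 64-bit: 17,179,869,184 GB (16EB)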

The answer to this problem of memory addressing is 64-bit computing, which allows much more memory to be addressed (as shown in the table below) but unfortunately means that new servers and a switch to a 64-bit operating system are required. For Windows users that means Windows XP Professional x64 Edition on the desktop and either Windows Server 2003 Standard, Enterprise or Datacenter x64 or Itanium Edition at the back end (Windows Server 2003 Standard Edition is not compatible with Intel’s Itanium processor family and Windows Server 2003 Web Edition is 32-bit only).

Memory limits with 32- and 64-bit architectures

It’s not all doom and gloom – for those who have recently invested in new servers, it’s worth noting that certain recent models of Intel’s Xeon and Pentium processors have included a technology called Intel Extended Memory 64 Technology (EM64T), so pretty much any server purchased from a tier 1 vendor (basically HP, IBM or Dell) over the last 12 months will have included 64-bit capabilities (with the exception of blade servers, which are generally still 32-bit for power consumption reasons – at least in the Intel space).

EM64T is effectively Intel’s port of the AMD x64 technology (AMD64, as used in the Opteron CPU line) and probably represents the simplest step towards 64-bit computing as Intel’s Itanium processor family (IPF) line uses a totally separate architecture called explicitly parallel instruction computing (EPIC).

In terms of application compatibility, 32-bit applications can run on a 64-bit Windows platform using a system called Windows on Windows 64 (WOW64). Although this is effectively emulating a 32-bit environment, the increased performance allowed with a 64-bit architecture means that performance is maintained and meanwhile, 64-bit applications on the same server can run natively. Itanium-based servers are an exception – they require an execution layer to convert the x64 emulation to EPIC, which can result in some degradation in 32-bit performance (which needs to be offset against Itanium 2’s high performance capabilities, e.g. when used natively as a database server). There are some caveats – Windows requires 64-bit device drivers (for many, hardware support is the main problem for 64-bit adoption) and there is no support for 16-bit applications (that means no MS-DOS support).

There is much discussion about which 64-bit model is “best”. There are many who think AMD are ahead of Intel (and AMD are happy to cite an Information Week article which suggests they are gaining market share), but then there are situations where Itanium 2’s processor architecture allows for high performance computing (cf. the main x64/EM64T advantage of increased memory addressability). Comparing Intel’s Xeon processors with EM64T and Itanium 2 models reveals that the Itanium supports 1024TB of external memory (cf. 1TB for the EM64T x64 implementation) as well as additional cache and general purpose registers, but the main advantages with Itanium relate to its support for parallel processing. With a traditional architecture, source code is compiled to create sequential machine code and any parallel processing is reliant upon the best efforts of the operating system. EPIC compilers for the Itanium architecture produce machine code that is already optimised for parallel processing.

Whatever the processor model, both Intel and AMD’s marketing slides agree that CPU speed is no longer the bottleneck in improving server performance. A server is a complex system and performance is constrained by the component with the lowest performance. Identifying the bottleneck is a matter of memory bandwidth (how fast data can be moved to and from memory), memory latency (how fast data can be accessed), memory addressability (how much data can be accessed) and I/O performance (how much data must be moved). That’s why processor and server manufacturers also provide supporting technologies designed to increase performance – areas such as demand-based power management, systems management, network throughput, memory management (the more memory that is added, the less reliable it is – hence the availability of RAID and hot-swap memory technologies), expansion (e.g. PCI Express) and virtualisation.

Where AMD’s x64 processors shine is that their architecture is more efficient. Traditional front-side bus (FSB) computers rely on a component known as the northbridge: I/O and memory access share the same front-side bus, limiting memory bandwidth; memory access is delayed because it must pass through the northbridge (memory latency); and I/O performance is restricted by bandwidth bottlenecks when accessing attached devices. Additionally, in a multiprocessor system, each core competes for access to the same FSB, making the memory bandwidth issue worse.

Non-uniform memory access (NUMA) computers place the memory with the processor cores, avoiding the need to access memory via a northbridge. AMD’s Opteron processor allows a total address space of up to 256TB, giving each CPU up to 6.4GB/s of dedicated memory bandwidth with a local memory latency of ~65ns (similar to L3 cache performance) and three HyperTransport connections, each allowing 8GB/s of I/O bandwidth.

Scaling out to multiple processors is straightforward, with direct connections between processors using the 8GB/s HyperTransport links (some OEMs, e.g. Fujitsu-Siemens, use this to allow two dual-CPU blade servers to be connected, making a four-CPU blade system).

Adding a second processor core results in higher performance and better throughput (higher density, lower latency and higher overall performance) but with equal power consumption. The power consumption issue is an important one, with data centre placement increasingly being related to sufficient power capacity and high density rack-based and blade servers leading to cooling issues.

AMD’s figures suggest that switching from a dual-core 4-way Intel Xeon processor system (8 cores in total) to a similar AMD Opteron system results in a drop from 740W to 380W of power [source: AMD]. With 116,439 4-way servers shipped in Western Europe alone during 2005 [source: IDC], that represents a saving of around 41.9MW.
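
For what it’s worth, the arithmetic behind that headline figure checks out – a quick sketch using the numbers quoted above:

# Checking the quoted power saving: 116,439 4-way servers, each dropping from 740W to 380W.
SERVERS_SHIPPED = 116_439
XEON_SYSTEM_W = 740
OPTERON_SYSTEM_W = 380

saving_per_server_w = XEON_SYSTEM_W - OPTERON_SYSTEM_W              # 360W
total_saving_mw = SERVERS_SHIPPED * saving_per_server_w / 1_000_000
print(f"Total saving: {total_saving_mw:.1f} MW")                    # ~41.9 MW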

In summary, 64-bit hardware has been shipping for a while now. Both Intel and AMD admit that there is more to processor performance than clock speed, and AMD’s direct connect architecture removes some of the bottlenecks associated with the traditional x86 architecture as well as reducing power consumption (and hence heat production). The transition to 64-bit software has also begun, with x64 servers (Opteron and EM64T) providing flexibility in the migration from a 32-bit operating system and associated applications.