A quick introduction to HP ProLiant servers

Every now and again, I seem to find myself looking at HP’s ProLiant range of industry standard servers. The technology moves ahead but it’s pretty easy to understand where the various models sit in the range because of HP’s product naming system.

The basic principles have been the same for years – the “BMW” numbering scheme: 1 series, 3 series, 5 series and 7 series:

  • 1-series servers are entry-level servers, targeted at the SMB and High Performance Computing markets, typically with fewer enterprise features (e.g. hot plug components) on board.
  • 3-series servers include HP’s 1U DL360 “pizza box” server and the ever-popular DL380 with 2 sockets and a range of storage and connectivity options.
  • 5-series servers are the 4-way machines for high-end application workloads, with plenty of internal storage and connectivity capacity.
  • The 7-series was discontinued for a while (as HP didn’t have an 8-way server) but, with increasing demands for powerful servers for consolidation/virtualisation, it was re-introduced with a DL785 that competes with other manufacturers’ servers such as the SunFire X4600.

The final digit is either a 0 (for an Intel server) or a 5 (for an AMD server). DL servers are rack-mountable (D for density), with ML for tower/freestanding servers, although some of these can also be converted to rack-mount. Each ML server is numbered 10 lower than its DL equivalent – so an ML370 is equivalent to a DL380.

A couple of years ago, HP launched its c-class blades and each blade server (prefixed with BL) was numbered as for the corresponding DL or ML server, but with 100 added to the model number – so a DL380 equivalent blade is a BL480c (c for c-class).

Finally, there’s a generation identifier (e.g. G5, G6). Each generation represents a step forward architecturally (e.g. a move from Ultra 320 to serial-attached SCSI disks, or the adoption of Intel’s latest “Nehalem” processors).
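The naming rules above are regular enough to express in a few lines of Python. This is only a sketch of the scheme as I've described it (prefix, three-digit number, optional "c" suffix and generation); the function and class names are my own invention and HP's real catalogue will have exceptions that it doesn't handle.

```python
import re
from typing import NamedTuple, Optional

# Offsets (taken from the rules above) that map a model number back to its
# DL equivalent: ML numbers are 10 lower than DL, BL numbers are 100 higher.
DL_OFFSET = {"DL": 0, "ML": 10, "BL": -100}
FORM_FACTOR = {"DL": "rack-mount", "ML": "tower", "BL": "blade"}

# Accepts names such as "DL380", "DL580G3", "DL585 G2" or "BL480c".
MODEL_RE = re.compile(r"^(DL|ML|BL)(\d{3})c?\s*(?:G(\d+))?$", re.IGNORECASE)


class ProLiantModel(NamedTuple):
    form_factor: str
    series: int                 # 1, 3, 5 or 7 (the "BMW" series)
    cpu_vendor: str             # from the final digit: 0 = Intel, 5 = AMD
    generation: Optional[int]   # None when no Gn suffix is given
    dl_equivalent: str


def decode_model(name: str) -> ProLiantModel:
    """Decode a ProLiant model name using the rules described in this post."""
    match = MODEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised model name: {name!r}")
    prefix = match.group(1).upper()
    number = int(match.group(2))
    generation = int(match.group(3)) if match.group(3) else None
    dl_number = number + DL_OFFSET[prefix]
    cpu_vendor = {0: "Intel", 5: "AMD"}.get(number % 10, "unknown")
    return ProLiantModel(
        form_factor=FORM_FACTOR[prefix],
        series=dl_number // 100,
        cpu_vendor=cpu_vendor,
        generation=generation,
        dl_equivalent=f"DL{dl_number}",
    )
```

For example, `decode_model("BL480c")` and `decode_model("ML370")` both report a DL equivalent of DL380, in line with the examples given earlier.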

Once you know the system, it’s all pretty straightforward – and, as HP controls half the market for industry standard x64 servers, hopefully this blog post will be useful to someone who’s trying to get their head around it.

Can I fit a PCI expansion card into a different type of slot?

The new server that I bought recently has a huge case with loads of room for expansion and that got me thinking about which components I already had that I could reuse.  DVD±RW dual layer recorder from the old PC, 500GB SATA hard disk from my external drive (swapped for the 250GB disk supplied with the server), couple of extra NICs… "oh, hang on.  Those NICs look like PCI cards and I can only see a single PCI slot.  Ah!".

I decided to RTFM and, according to the technical specifications, my server has 5 IO slots (all full-height, full length) as follows:

  • 2 x 64-bit 133MHz PCI-X
  • 1 x 1-lane (1x) PCIe
  • 1 x 8-lane (8x) PCIe
  • 1 x 32-bit 33MHz legacy slot

I knew that the spare NICs I had were PCI cards but would they fit in a PCI-X slot?  Yes, as it happens.

I found a really useful article about avoiding PCI problems that explained to me the differences between various peripheral component interconnect (PCI) card formats and it turned out not to be as big an issue as I first thought.  It seems that the PCI specification allows for two signalling voltages (3.3V and 5V) as well as different bus widths (32- or 64-bit).  32-bit cards have 124 pins whilst 64-bit cards have 184 pins; however, to indicate which signalling voltage is supported, notches are used at different points on the card – 5V cards have the notch at pin positions 50 and 51 whilst 3.3V cards have the notch closer to the backplate at pin positions 12 and 13.  Furthermore, some cards (like the NICs I wanted to use) have notches in both positions (known as universal cards), indicating that they will work at either signalling voltage.

Meanwhile, PCI-X (PCI eXtended) is a development of PCI and, whilst offering higher speeds and a longer, 64-bit, connection slot, is also backwards-compatible with PCI cards, allowing me to use my universal PCI card in a PCI-X slot (albeit slowing the whole bus down to 32-bit 33MHz).  PCIe (PCI Express) is a different standard, with a radically different connector and a serial (switched) architecture (HowStuffWorks has a great explanation of this).  My system has a single lane (1x) and an 8-lane (8x) connector, but 1x and 4x PCIe cards will work in the 8x slot.

This illustration shows the various slot types on my motherboard: an 8x PCIe at the top, then a 1x PCIe, two 64-bit PCI-X slots and, finally, one legacy 32-bit 5V PCI slot.
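To check that I'd understood the rules, here is a rough Python sketch of the "will it fit?" logic. The function names are made up for illustration. One assumption goes beyond what I've described above: I've treated the PCI-X slots as 3.3V-keyed, which is my understanding of PCI-X but is not something I verified on this board. The PCIe rule (a card needs no more lanes than the slot provides) is likewise a simplification.

```python
# Signalling voltages are kept as strings to avoid any float comparisons.
UNIVERSAL = frozenset({"3.3V", "5V"})   # card notched in both positions
FIVE_VOLT_ONLY = frozenset({"5V"})      # card notched at pins 50 and 51 only


def pci_card_fits(card_voltages: frozenset, slot_voltage: str) -> bool:
    """A PCI/PCI-X slot is keyed for one voltage; the card must support it."""
    return slot_voltage in card_voltages


def pcie_card_fits(card_lanes: int, slot_lanes: int) -> bool:
    """Simplified rule: a PCIe card works in a slot with at least as many lanes."""
    return card_lanes <= slot_lanes


# Slots on my motherboard (the PCI-X voltage is an assumption, see above).
LEGACY_PCI_SLOT = "5V"
PCI_X_SLOT = "3.3V"

if __name__ == "__main__":
    print(pci_card_fits(UNIVERSAL, LEGACY_PCI_SLOT))   # universal NIC, legacy slot
    print(pci_card_fits(UNIVERSAL, PCI_X_SLOT))        # universal NIC, PCI-X slot
    print(pcie_card_fits(4, 8))                        # 4x card in the 8x slot
```

Under those assumptions a universal card passes both PCI checks, whereas a 5V-only card would be rejected by the PCI-X slot.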

After adding the extra NICs (one in the 32-bit legacy 33MHz slot and the other in one of the PCI-X slots) everything seemed to fit without resorting to the use of heavy tools and when I switched on the computer it seemed to boot up normally, without any pops, bangs or puffs of smoke.  All that was needed was to get some drivers for Windows Server 2008 (these are old 100Mbps cards that have been sitting in my "box of PC bits" for a long time now).  Windows Device Manager reported the vendor and device IDs as 8086 and 1229 respectively (I already knew from the first half of the MAC address that these were Intel NICs), from which I could track down the vendor and device details and find that device 1229 is an 82550/1/7/8/9 EtherExpress PRO/100(B) Ethernet Adapter.  Despite this being a discontinued product, searching the Intel Download Center turned up a suitable Windows Vista (64-bit) driver that was backwards compatible with the Intel 82550 Fast Ethernet Controller and I soon had the NICs up and running in Windows Server 2008, reporting themselves as Intel PRO/100+ Management Adapters (including various custom property pages provided by Intel for teaming, VLAN support, link speed, power management and boot options).
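For anyone wanting to script the same lookup, a minimal sketch might look like this. It assumes the hardware ID is presented in the usual `PCI\VEN_xxxx&DEV_xxxx` form, and the lookup tables contain only the two values mentioned in this post (they are illustrative, not a real PCI ID database).

```python
import re

# Illustrative lookup tables holding only the IDs mentioned in this post.
KNOWN_VENDORS = {0x8086: "Intel"}
KNOWN_DEVICES = {
    (0x8086, 0x1229): "82550/1/7/8/9 EtherExpress PRO/100(B) Ethernet Adapter",
}

HWID_RE = re.compile(r"VEN_([0-9A-F]{4})&DEV_([0-9A-F]{4})", re.IGNORECASE)


def parse_hardware_id(hardware_id: str) -> tuple:
    """Extract (vendor_id, device_id) as integers from a PCI hardware ID string."""
    match = HWID_RE.search(hardware_id)
    if match is None:
        raise ValueError(f"no VEN_/DEV_ pair found in {hardware_id!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


def describe(hardware_id: str) -> str:
    """Return a human-readable description, falling back to the raw hex IDs."""
    vendor_id, device_id = parse_hardware_id(hardware_id)
    vendor = KNOWN_VENDORS.get(vendor_id, f"unknown vendor {vendor_id:04X}")
    device = KNOWN_DEVICES.get((vendor_id, device_id), f"unknown device {device_id:04X}")
    return f"{vendor}: {device}"


if __name__ == "__main__":
    print(describe(r"PCI\VEN_8086&DEV_1229"))
```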

So, it seems that, despite the variety of formats, not having exactly the right PCI slot is not necessarily an issue.  PCI Express is an entirely different issue but, for now, my 32-bit universal PCI card is working fine in a 64-bit PCI-X slot.

Controlling costs when buying a PC

Much of what I write on this blog is based upon my experiences as a consultant/infrastructure architect for a leading IT Services company but as my day job becomes less technical there’s an increasing amount written based on the time I spend working with technology on my home network.  markwilson.it doesn’t have a large IT budget and it’s become increasingly apparent that my pile of aging, but rather good, Compaq Evo D5xx SFF PCs no longer provides the hardware features that I need.  Consequently, I needed to buy a small server – ideally a dual-socket machine, with quad-core CPUs and up to 16GB of RAM (I had a couple of spare 500GB SATA hard disks to re-use and also some spare NICs). I didn’t have an exact figure in mind, but was hoping to spend no more than £500-600 and knew that specification would probably be too expensive but then I found an offer on the Dell website for a PowerEdge 840 – starting at £179 plus VAT and shipping.

The PowerEdge 840 is more like a workstation than a server but it has the option to upgrade the standard dual-core Intel Pentium processor to a dual- or quad-core Xeon.  For a dual-socket machine with the ability to go up to 16GB of memory I’d have needed to move up to a more expensive model but the PowerEdge 840 specification would give me what I need to set up a XenExpress or Windows Hyper-V Server host for evaluating products and running the basic markwilson.it infrastructure (everything except the website, which is hosted for me by ascomi).

I haven’t bought a PC for years (except my Mac Mini) and didn’t feel confident to make the right selections, so I called a trusted colleague, Garry, who may be a CTO now but still gets techie as required!  He gave me some great advice for keeping the costs down – and these are equally applicable to any major OEM PC/Server purchase:

  • Do you really need that optional extra?  Don’t be afraid to buy the basic system from the manufacturer and source additional components elsewhere. 
  • Use a cashback site (e.g. TopCashBack).  Using this, I was able to get cash back on the pre-tax value of the goods purchased.

Following Garry’s advice, I could see that by sticking with the basic memory and disk options (being ready to take them out when it arrives), I could buy larger components elsewhere and still save money.  I upped the CPU spec, but left the rest of the system almost at the basic level with 512MB of RAM, one 80GB disk (to take out and use elsewhere – or even just give it away) and one 250GB disk (to downgrade my external hard drive and use the 500GB disk from that instead).  I stuck with the basic DVD-ROM drive (I’ll take the writer from one of the PCs that this server will replace) and rejected options for floppy disk drive, modem, additional NICs, tape backup, UPS, etc. as I already have these things at home.  I also rejected options for RAID controllers as I can pick up a less capable but inexpensive SATA RAID controller elsewhere.  Finally, I said no to operating system and backup software – whilst Small Business Server 2003 R2 may have been nice, Windows Hyper-V Server will only be a few pounds when it releases later this year and XenExpress is free.  As I’m not in the market for Windows Server 2003/2008 Enterprise or Datacenter editions I’ll have to deal with the guest VM licensing separately – but that’s no different to my current situation.

After this, I had my server ready to order, at a total price (including shipping and VAT) of just under £392 (and around £14 cashback due).  Then I started looking at the last part of the solution – 8GB of DDR2 SDRAM.  Interestingly, although the PowerEdge 840 supports up to 8GB of RAM, the "build your system" website only lets me choose up to 4GB – in 1GB DIMMs.  I needed 2GB DIMMs in order to get the maximum memory installed.  Dell would sell me these at a competitive price of £49.02 each but wanted £22.33 to ship four of them to me (total price £218.40).  My normal memory supplier (Crucial) could sell me two 4GB kits (each 2x2GB) for £244.38 (although the price has since fallen to just £164.48) and I could earn cashback on both of these offers but the non-ECC RAM available at sites like eBuyer was much less expensive.  According to Crucial, I can install non-ECC RAM in a server that supports ECC, but googling turned up some advice from newegg.com:

"[The] chances [of] a single-bit soft error occurring are about once per 1GB of memory per month of uninterrupted operation. Since most desktop computers do not run 24 hours a day, the chances are not actually that high. For example, if your computer (with 1GB of memory) runs 4 hours a day, the chances of a single-bit soft error happening (when your system is running) is about once every six months. Even should an error occur, it won’t be a big issue for most users as the error bit may not even be accessed at that time. Should the system access the error bit, this little error won’t result in a disaster either – the system may crash, but a restart of the system will fix that. That’s why ECC memory is not a necessity for most home users.

Things are very different when it comes to workstations and servers. To begin with, these systems often utilize multi-gigabytes of memory, and they usually run 24/7 as well. Both of these factors result in increased probability of a soft error. More importantly, an unnoticed error is not tolerable in a mission-critical workstation or server."

[Source: newegg.com article: Do I need ECC and registered memory?]

I decided to use ECC RAM as, even though my server is not mission critical, it will be on 24×7 and running several virtual machines – all of which could be susceptible to memory errors.  Dell Technical Support also advised me that non-ECC RAM would cause POST errors.
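To put some numbers on that decision, here is the arithmetic from the newegg.com quote as a small Python sketch. The only input is the quoted rule of thumb (about one single-bit soft error per 1GB per month of uninterrupted operation); I've assumed the rate scales linearly with memory size and running hours, which is my reading of the quote rather than anything more rigorous.

```python
def months_between_soft_errors(memory_gb: float, hours_per_day: float) -> float:
    """Expected months between single-bit soft errors.

    Based on the quoted rule of thumb: roughly one error per 1GB of memory
    per month of uninterrupted (24 hours a day) operation, scaled linearly.
    """
    errors_per_month = memory_gb * (hours_per_day / 24.0)
    return 1.0 / errors_per_month


if __name__ == "__main__":
    # The desktop example from the quote: 1GB, 4 hours a day.
    print(months_between_soft_errors(1, 4))
    # My server: 8GB, running 24x7.
    print(months_between_soft_errors(8, 24))
```

On that basis the 1GB desktop from the quote sees an error about every six months, whereas an 8GB server running around the clock would expect around eight a month, which is why ECC seemed worth paying for.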

I still needed to get the best deal I could on memory and Dell were only quoting me for PC2-4200 (533MHz) RAM when the PowerEdge 840 will take PC2-5300 (667MHz) RAM.  As the Dell RAM is actually from Kingston, I decided to see what Kingston recommended for the PowerEdge 840 and found that there is a product (KTD-DM8400BE/2G) which is a 667MHz version of the 2GB 240-pin SDRAM that the server uses.  Furthermore, I could get it from SMC Direct for £53.55 a module (£219.58 for all four, shipped).  Then, Dell Technical Support advised me that there is little advantage in buying the faster RAM as the front-side bus for my Intel Xeon X3210 is 1066MHz and the 2x multiplier matches whereas the 667MHz RAM would match a 1333MHz bus.  On that basis, I stuck with the Dell quote, saving myself a fiver and potentially earning some more cashback (actually, I didn’t get cashback because it was a telephone purchase but that was more than offset by Dell waiving the shipping charge as goodwill for the problems I had experienced when buying the server).
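The numbers behind that advice are easy to reproduce. This sketch assumes the standard 64-bit (8-byte) DIMM data path, uses the nominal speeds quoted above, and simply divides the front-side bus speed by the memory speed to show the "2x" relationship that Dell Technical Support described; it is not a performance model.

```python
def peak_bandwidth_mb_s(memory_mhz: int, bus_width_bytes: int = 8) -> int:
    """Nominal peak transfer rate of a DIMM (rounded down in the PC2-xxxx name)."""
    return memory_mhz * bus_width_bytes


def fsb_to_memory_ratio(fsb_mhz: int, memory_mhz: int) -> float:
    """Ratio between the front-side bus speed and the memory speed."""
    return fsb_mhz / memory_mhz


if __name__ == "__main__":
    print(peak_bandwidth_mb_s(533))         # PC2-4200
    print(peak_bandwidth_mb_s(667))         # PC2-5300
    print(fsb_to_memory_ratio(1066, 533))   # my Xeon X3210 with the 533MHz RAM
    print(fsb_to_memory_ratio(1066, 667))   # the same CPU with the 667MHz RAM
```

With the 533MHz modules the ratio is a clean 2.0, whereas the 667MHz modules only line up that way with a 1333MHz bus.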

So, here I am, a couple of weeks later, with a quad-core system, over a terabyte of storage (after swapping some disks around) and 8GB RAM for less than £575.  Buying my RAM separately (even from the same supplier) got me a better deal (although Crucial’s subsequent price drop would have made an even bigger difference) and Garry’s TopCashBack advice saved me some more money.  This is how the costs stack up (all including VAT and shipping, where applicable):

Item | Price
Dell PowerEdge 840 server | £391.98
4 Kingston KTD-DM8400AE/2G DIMMs (from Dell) | £196.08
Additional disks, NICs and CD/DVD writer (sourced from existing PCs) | £0.00
Cashback | -£14.23
Total paid | £573.83
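As a quick sanity check on the figures in the table, here they are summed in Python (using `Decimal` so that the pennies add up exactly). The item labels are shortened versions of the table rows.

```python
from decimal import Decimal

# Figures from the table above (all including VAT and shipping, where applicable).
COSTS = {
    "Dell PowerEdge 840 server": Decimal("391.98"),
    "4 Kingston 2GB DIMMs (from Dell)": Decimal("196.08"),
    "Disks, NICs and DVD writer (reused)": Decimal("0.00"),
    "Cashback": Decimal("-14.23"),
}


def total_paid(costs: dict) -> Decimal:
    """Sum every line item, treating cashback as a negative cost."""
    return sum(costs.values(), Decimal("0.00"))


if __name__ == "__main__":
    print(f"Total paid: £{total_paid(COSTS)}")
```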

Further information

Memory specification terms
Memory terms glossary

Online Cashback

How my Dell customer experience suddenly got better

Last week, I wrote a post about the poor customer service I had experienced as part of my recent Dell server purchase but the very next morning things started to improve.

Firstly, my server turned up a week early.  That’s good – exceeding customer expectations gets a big tick from me.  Ordered at 2pm on Friday, order accepted (i.e. payment cleared) on Sunday, server built to order, shipped from Ireland and delivered in England at 9.15am on Wednesday.  One happy punter.

Then I got an e-mail and a phone call from one of Dell’s Technical Account Managers, who’d seen my blog post and wanted to talk to me about my experience.  I was only too happy to give him feedback on where it all went wrong for me, and in return he promised to look into it and get a server specialist to call me right away.  Sure enough, a few minutes later the phone rang and it was a really helpful representative from the UK and Ireland SME Silver Support Team, who took me through the configuration options on my server that had confused me so much (I’ve added a comment to my original post with the details).

As a gesture of goodwill (and I think it’s only fair to disclose this as I’m now writing so positively about Dell!), they also waived the shipping charge on the extra memory I was about to purchase and shipped some additional SATA cables to allow me to connect a third and fourth drive to my motherboard.

All of that is good news for me but what about those who can’t publicly throw their toys out of the cot (i.e. write a stroppy post on their blog) and who need technical pre-sales support?  Dell’s advice is to either:

  • Click on the Request a Call link on the Server page before starting the system build;

or:

  • Click on the Purchase Help tab to view contact details for Sales Support.

(As this second option leads to the same page I used before ending up in Dell phone system hell I’d suggest the request a call option.)

Dell customer service fails again

A few years back I had the misfortune of using a Dell Latitude D600 notebook computer for my work.  At the time I wrote about the problems I experienced with Dell customer service and it seems that Michael Dell’s return as CEO isn’t doing much to improve the customer experience.

Then, last week, I bought a Dell PowerEdge 840 server.  I did it because it was cheap.  So, one might ask what am I complaining about but, even though £391.98 is very inexpensive for a server, I expect some service when I’m trying to buy something from someone.

I suppose I’m spoilt because normally I buy many servers at a time, have a technical account manager to help me select the right options and it’s someone else’s money if I miss something and need to buy some more components.  Oh yes, and I buy HP servers where possible.  This time I was spending my own money and wanted the best deal possible.

As I worked through Dell’s "build your system" website, I wanted some technical support for the RAID connectivity options which, after telling me that the server supports up to 2 cabled or hot-plug SAS or SATA hard drives, the website listed as:

  • C1B – Motherboard SATA cabled, min 2, max 2 Hard Drives connected to onboard SATA controller.
  • C1C – Motherboard SATA cabled, min 3, max 3 Hard Drives connected to onboard SATA controller.
  • C1D – Motherboard SATA cabled, min 4, max 4 Hard Drives connected to onboard SATA controller.

I was confused.  If the server only supports 2 cabled or hot-plug drives, then why is there a no-cost option to have 3 or 4 hard drives connected to the on-board SATA controller?  So I called Dell.  Only to find after about 8 (no kidding) menu options on the phone system that the "small business" department I needed to speak to was closed and only works from 9 to 4.30 Monday to Friday (part-timers…).

I bought the server anyway because the discount was due to expire (it’s since been extended) and called back on Monday. After making 4 menu selections I got to a person who was somewhere in South Asia and sounded helpful but was clearly following a script.  She redirected me to someone in Ireland who sounded annoyed that I was taking up her time and told me that my query was a technical one (not sales). She put me through to technical support, who were confused when I said that I didn’t have a service tag because my system was still being built but put me through to the PowerEdge department anyway.  They were busy but after 5 minutes on hold I spoke to a person who was helpful but didn’t really fill me with confidence in his advice as first of all, he told me that the PowerEdge 840 supports up to 4 drives (good) but that the options may be for different backplanes.  Then he checked and said that the system supports 2 drives on the motherboard but drives 3 and 4 would need a separate RAID controller.  As that seemed to contradict the options at purchase time and he couldn’t comment on the "build your system" website, I’m still no clearer.

I guess I’ll find out how many drives I can get in this server (and what the C1B/C/D options mean) when it arrives next week…

Sun Fire x64 servers… maybe worth a look?

Sun Microsystems would like me to use their x64 servers for my virtualisation platform (instead of the HP ProLiant DL585s that I’m currently using). Many of our conversations have been covered by a non-disclosure agreement so I can’t write much here, but the details of the current Sun Fire and Sun Blade x64 servers are in the public domain – and they are certainly worth a look.

I’ll need some pretty serious convincing to move away from our 100% HP ProLiant Windows server estate, especially as we use HP Systems Insight Manager for hardware monitoring and have had some issues in the past integrating hardware from other OEMs; however, the Sun servers do look pretty good – especially for anyone in the market for an 8-way server, where the Sun Fire x4600 is particularly impressive – I guess if HP had an equivalent box it would be called the ProLiant DL785. Sun also have 2-way servers (that would be positioned to compete with the HP ProLiant DL365) and a blade enclosure that’s broadly similar to the HP C-class blade enclosure, which I wrote about a few weeks back. Strangely though, there is a gap – with no 4-way equivalent to the HP ProLiant DL585. They all look to be pretty well engineered, with extra NIC capacity (4 ports as standard) as well as all the other features that could be expected on a modern server (management processor, redundant hot swap power supplies, separate airflows for components, etc.) and a service console port (something that administrators of Sun SPARC servers will have been used to for a while now). In fact, if I had any concerns, it would be about the delay in bringing new developments to market – for example the largest serial attached SCSI (SAS) hard disk drives currently offered by Sun are 73GB, whereas some competitors have 146GB SAS drives available.

Sun are still a small player in the x86/x64 server space – but they are rapidly increasing their market share (revenue up 48% and market share up 0.7% year on year [source IDC]); however it should also be noted that market-leaders HP also saw modest growth over the same period. I’ll watch Sun’s progress with interest, and who knows, maybe soon I’ll be in a position to specify some Sun servers somewhere.

Notes on server hardware developments

I’ve just spent the day with HP, learning about their StorageWorks EVA SANs and the current ProLiant server roadmap. It was an interesting day, but most of what was discussed can be found on the HP website; however I did pick up some snippets of information that might be useful:

  • Firstly, when comparing Intel and AMD figures for the power consumption of their servers – if Intel quote the wattage, they quote the mean value, whereas AMD quote a peak figure – so it’s hard to draw accurate comparisons.
  • Secondly, as I reported when I wrote about HP blade servers a few weeks back, 3.5″ Ultra320 SCSI disks are being discontinued in favour of 2.5″ serial-attached SCSI (SAS) disks. The main difference (apart from the smaller form factor) is that SAS disks are switched between lanes (cf. a shared bus with Ultra320), increasing performance in a linear manner with each disk connected to a controller (whereas a shared SCSI channel will typically exhibit a bell-curve in its performance characteristics). Also, the smaller physical size of the disk means that a 10,000RPM 2.5″ disk will provide more-or-less equivalent performance to a similarly specified 15,000RPM 3.5″ disk and that less energy is required to spin it, meaning a lower power consumption (and less heat generated).
  • One of the other changes in the server lineup is a general move from PCI-X to PCI Express (PCIe) slots offering improved performance (many servers allow a combination of the two to be specified).
  • Finally, the new iLO2 management processors (as well as iLO with firmware v1.82 or later) now support schema-less AD integration and iLO2 has a much-improved remote console, with most of the Java code removed, increasing performance drastically.

There’s no real “story” to any of the above – they are just a jumble of notes that might be useful in understanding where HP (and other vendors) are heading in the industry standard x86/x64 server space.

Slicing server TCO with HP ProLiant server blades

Over the last few years, I’ve heard a lot about blade servers – mostly anecdotal – and mostly commenting that they produce a lot of heat so for all the rack space saved through their use, a lot of empty space needs to be left in datacentres that weren’t designed for today’s high-density servers. In recent weeks I’ve attended a number of events where HP was showcasing their new c-class blade system and it does look as if HP have addressed some of the issues with earlier systems. I’m actually pretty impressed – to the point where I’d seriously consider their use.

HP’s c-class blade system was introduced in June 2006 and new models are gradually coming on stream as HP replaces the earlier p-class blades, sold since 2002 (and expected to be retired in 2007).

HP c7000 blade enclosure - front

The new c-class enclosure requires 10U of rackspace, which can be configured with up to 16 half-height server blades, 8 full-height server blades or a combination of the two. Next week, HP will launch direct-attached storage blades (as well as new server blades) and next year, they expect to launch shared storage blades (similar to the StorageWorks Modular Storage Array products). With the ability to connect multiple blade enclosures to increase capacity, an extremely flexible (and efficient) computing resource pool can be created.

HP c7000 blade enclosure - rear

At the back of the enclosure are up to 10 cooling fans, 6 power connections (three-phase connections are also available), 1 or 2 management modules, and up to 8 interconnect modules (e.g. Ethernet or fibre-channel pass-through modules or switches from HP, Cisco, Brocade and others).

There are a number of fundamental changes between the p-class and c-class blade systems. Immediately apparent is that each blade is physically smaller. This has been facilitated by moving all cooling fans off the blade itself and into the enclosure, as well as by the move from Ultra320 SCSI hard disks to small form-factor serial-attached SCSI (SAS) hard disks. Although the SAS disks currently have reduced storage capacity (compared with Ultra320 SCSI), a 146GB 10,000RPM SAS disk will be launched next week and 15,000RPM disks will be released in the coming months. Serial ATA (SATA) disks are also available, but not recommended for 24×7 operation. The new disks use a 2.5″ form factor and weigh significantly less than 3.5″ disks; consequently they require about half as much power to provide equivalent performance.

HP are keen to point out that the new cooling arrangements are highly efficient, with three separate airflows through the enclosure for cooling blades, power supplies and communications devices. Using a parallel, redundant and scalable (PARSEC) architecture, the airflows include back-flow preventers and shut-off doors such that if a component is not installed, then that part of the enclosure is not cooled. If the Thermal Logic control algorithm detects that management information is not available (e.g. if the onboard management module is removed) then the variable speed Active Cool fans will fail open and automatically switch to full power – it really is impressive to see just how much air is pulled through the system by the fans, which are not dissimilar to tiny jet engines!

Power is another area where improvements have been made and instead of using a separate power supply module, hotswap power supply units are now integrated into the front of the enclosure. The Thermal Logic system will dynamically adjust power and cooling to meet energy budgets such that instead of running multiple supplies at reduced power, some supplies are run close to full capacity (hence more efficiently) whilst others are not used. If one power supply fails, then the others will take up the load, with switching taking around 1ms.
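The idea of running a few supplies hard rather than many supplies lightly can be illustrated with a toy calculation. This is emphatically not HP's Thermal Logic algorithm (I don't know how that works internally, and it ignores redundancy); the function names and the wattage figures in the example are hypothetical.

```python
import math


def supplies_to_run(load_watts: float, capacity_watts: float, installed: int) -> int:
    """Smallest number of supplies that can carry the load (toy model, no redundancy)."""
    needed = max(1, math.ceil(load_watts / capacity_watts))
    if needed > installed:
        raise ValueError("load exceeds the capacity of the installed supplies")
    return needed


def utilisation(load_watts: float, capacity_watts: float, active: int) -> float:
    """Fraction of its rated capacity that each active supply is delivering."""
    return load_watts / (capacity_watts * active)


if __name__ == "__main__":
    # Hypothetical figures: six 2000W supplies installed, 3500W of load.
    load, capacity, installed = 3500, 2000, 6
    active = supplies_to_run(load, capacity, installed)
    print(active, utilisation(load, capacity, active))      # consolidated
    print(installed, utilisation(load, capacity, installed))  # spread evenly
```

With those made-up numbers, two supplies each run at 87.5% of capacity instead of six each idling along at under 30%.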

Each blade server is a fully-functional HP ProLiant industry standard server – in fact the BL model numbering system mirrors the ML and DL range, adding 100 to the DL number, so a BL480c blade is equivalent to a DL380 rack-mount server (which itself adds 10 to the ML number – in this case an ML370).

Looking inside a blade, it becomes apparent how much space in a traditional server is taken up by power and cooling requirements – apart from the disks at the front, most of the unit consists of the main board with CPUs and memory. A mezzanine card arrangement is used to provide network or fibre-channel ports, which are connected via HP’s Virtual Connect architecture to the interconnect modules at the rear of the enclosure. This is the main restriction with a blade server – if PCI devices need to be employed, then traditional servers will be required; however each half-height blade can accommodate two mezzanine cards (each up to 2 fibre-channel or 4 Gigabit Ethernet ports) and a full-height blade can accommodate three mezzanine cards. Half-height blades also include 2 network connections as standard and full-height have 4 network connections – more than enough connectivity for most purposes. Each blade has between 2 and 4 hard disks and the direct attached storage blade will provide an additional 6 drives (SAS or SATA) in a half-height blade.
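Totting up the connectivity figures from the paragraph above gives a feel for how far a blade can go before a traditional server is needed. The table names and the function are my own; the numbers are the maximums quoted above, and a real mezzanine card may offer fewer ports.

```python
# Figures quoted above: onboard network connections and mezzanine slots per blade.
ONBOARD_NICS = {"half": 2, "full": 4}
MEZZANINE_SLOTS = {"half": 2, "full": 3}
MAX_PORTS_PER_CARD = {"ethernet": 4, "fibre_channel": 2}


def max_ports(blade_height: str, mezzanine_cards: list) -> dict:
    """Maximum port counts for a blade fitted with the given mezzanine cards."""
    if len(mezzanine_cards) > MEZZANINE_SLOTS[blade_height]:
        raise ValueError(f"a {blade_height}-height blade has only "
                         f"{MEZZANINE_SLOTS[blade_height]} mezzanine slots")
    ports = {"ethernet": ONBOARD_NICS[blade_height], "fibre_channel": 0}
    for card in mezzanine_cards:
        ports[card] += MAX_PORTS_PER_CARD[card]
    return ports


if __name__ == "__main__":
    print(max_ports("half", ["ethernet", "ethernet"]))
    print(max_ports("full", ["ethernet", "fibre_channel", "fibre_channel"]))
```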

One of the advantages of using servers from tier 1 OEMs has always been the management functionality that’s built in (for years I argued that Compaq ProLiant servers cost more to buy but had a lower overall cost of ownership compared with other manufacturers’ servers) and HP are positioning the new blades in a similar way – that of reducing the total cost of ownership (even if the initial purchase price is slightly higher). Management features included within the blade include the onboard administrator console, with a HP Insight display at the front of the enclosure and up to two management modules at the rear in an active-standby configuration. The insight display is based on technology from HP printers and includes a chat function, e.g. for a remote administrator to send instructions to an engineer (predefined responses can be set or the engineer can respond in free text, but with just up, down and enter buttons it would take a considerable time to do so – worse than sending a text message on a mobile phone!).

Each server blade has an integrated lights out (iLO2) module, which is channelled via the onboard administrator console to allow remote management of the entire blade enclosure or the components within it – including real-time power and cooling control, device health and configuration (e.g. port mapping from blades to interconnect modules), and access to the iLO2 modules (console access via iLO2 seems much more responsive than previous generations, largely due to the removal of much of the Java technology). As with ML and DL ProLiant servers, each blade server includes the ProLiant Essentials foundation pack – part of which is the HP Systems Insight Manager toolset; with further packs building on this to provide additional functionality, such as rapid deployment, virtual machine management or server migration.

The Virtual Connect architecture between the blades and the interconnect modules removes much of the cabling associated with traditional servers. Offering a massive 5Tbps of bandwidth, the backplane needs to suffer four catastrophic failures before a port will become unavailable. In addition, it allows for hot spare blades to be provisioned, such that if one fails, then the network connections (along with MAC addresses, worldwide port numbers and fibre-channel boot parameters) are automatically re-routed to a spare that can be brought online – a technique known as server personality migration.

In terms of the break-even point for cost comparisons between blades and traditional servers, HP claim that it is between 3 and 8 blades, depending on the connectivity options (i.e. less than half an enclosure). They also point out that because the blade enclosure also includes connectivity then it’s not just server costs that need to be compared – the blade enclosure also replaces other parts of the IT infrastructure.
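A break-even calculation of this kind is simple to sketch, although I should stress that HP didn't share their model with me: the cost structure here (a fixed enclosure cost, plus a per-blade cost, versus a per-server cost that includes its share of external connectivity) and every figure in the example are hypothetical.

```python
import math
from typing import Optional


def break_even_blades(enclosure_cost: float,
                      blade_cost: float,
                      rack_server_cost: float,
                      connectivity_cost_per_server: float,
                      max_blades: int = 16) -> Optional[int]:
    """Smallest number of blades at which the blade solution costs no more
    than the equivalent rack servers, or None if it never breaks even
    within a single enclosure."""
    saving_per_server = rack_server_cost + connectivity_cost_per_server - blade_cost
    if saving_per_server <= 0:
        return None
    blades = max(1, math.ceil(enclosure_cost / saving_per_server))
    return blades if blades <= max_blades else None


if __name__ == "__main__":
    # Hypothetical prices, purely to show the shape of the calculation.
    print(break_even_blades(enclosure_cost=10_000, blade_cost=3_000,
                            rack_server_cost=4_000,
                            connectivity_cost_per_server=1_000))
```

The more connectivity that the enclosure replaces per server, the sooner the fixed cost of the enclosure is recovered, which would fit with HP's comment that the break-even point depends on the connectivity options.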

Of course, all of this relates to HP’s c-class blades and it’s still possible to purchase the old p-class HP blades, which use a totally different architecture. Other OEMs (e.g. Dell and IBM) also produce blade systems and I’d really like to see a universal enclosure that works with any OEM’s blade – in the same way that I can install any standard rack-format equipment in (almost) any rack today. Unfortunately, I can’t see that happening any time soon…

A tale of two CPU architectures

Last week I wrote about the VMware infrastructure that I’m trying to put in place. I mentioned that my testing has been based on HP ProLiant DL585 servers – each of these is equipped with four dual-core AMD Opteron 8xx CPUs and a stack of memory. Half of the initial infrastructure will use new DL585s and the intention is that implementing these servers will release some recently-purchased HP ProLiant DL580G3s for an expansion of the infrastructure. Because the DL580G3 uses an Intel Xeon MP (formerly codenamed Paxville MP) processor, the difference in processor families requires a separation of the servers into two resource pools; however that’s not the real issue. My problem is justifying to an organisation that until now has exclusively used Intel processors that AMD units provide (as my CTO puts it) “more bang for our buck”.

The trouble is that the press is full of reports on how the new Intel Xeon 51xx CPUs (formerly codenamed Woodcrest) out-perform AMD Opterons, where AMD has been in the lead until now; but that’s in the 2-processor server space and I’m not hearing much about 4-way units.

All of this may change tomorrow as, at today’s VMware Beyond Boundaries virtualisation roadshow, Richard Curran, Director of Marketing for the Intel (EMEA) Digital Enterprise Group, hinted at an impending announcement; however an HP representative expressed a view that any new CPU will just be a stop-gap – the real performance boost will come in a few months’ time with the next generation of dual-core multiprocessor chips (in the same way that the Xeon 50xx chips, formerly codenamed Dempsey, preceded the 51xx Woodcrest).

Leaving aside any other server vendors, I need some direction as to which 4-way server to buy from HP. HP ProLiant DL580G3s would allow me to standardise but the newer HP ProLiant DL580G4s are more powerful – using the Xeon 71xx chips (formerly codenamed Tulsa) with Intel VT virtualisation support – and, based on list price, are significantly less expensive. Meanwhile, HP’s website claims that ProLiant DL585s are “the best performing x86 4-processor server in the industry” and they cost slightly less than a comparably-specified DL580G4 (again, based on list price), even before taking into account their lower power consumption.

Speaking to Intel, they (somewhat arrogantly) disregarded any reason why I should choose AMD; however AMD were more rational, explaining that regardless of the latest Intel benchmarks, an Opteron is technologically superior for two main reasons: the HyperTransport connection between processor cores; and the integrated memory controller (cf. Intel’s approach of using large volumes of level 3 cache), although the current generation of Opterons only use DDR RAM. Crucially though, AMD’s next-generation dual-core Opterons are socket-compatible with the forthcoming quad-core CPUs (socket F) and are in the same thermal envelope – allowing for processor upgrades – as well as using DDR2 memory and providing AMD-V virtualisation support (but in any case I’ll need to wait a few months for the HP ProLiant DL585G2 before I can buy a socket F-based Opteron 8xxx rack server from HP).

As my virtualisation platform is based on VMware products, I asked VMware which processor architecture they have found to be most performant (especially as the Opteron 8xx does not provide hardware support for virtualisation; although there are doubts about whether ESX Server 3.0 is ready to use such technology – I have read some reports that there will be an upgrade later). Unsurprisingly, VMware are sitting on the fence and will not favour one processor vendor over another (both AMD and Intel are valued business partners for VMware); of course, such comparisons would be subjective anyway but I need to know that I’m making the right purchasing decision. So I asked HP. Again, no-one will give me a written opinion but two HP representatives have expressed similar views verbally – AMD is still in the lead for 4-way servers, at least for the next few months.

There are other considerations too – DL580s feature redundant RAM (after power and disk, memory is the next most likely component to fail and whilst ECC can guard against single-bit failures, double-bit failures are harder to manage); however because the memory controller is integrated in each CPU for an AMD Opteron, there is no redundant RAM for a DL585.

Another consideration is the application load – even virtualised CPUs perform differently under different workloads: for heavily cached applications (e.g. Microsoft SQL Server or SAP), an Intel architecture may provide the best performance; meanwhile CPU and memory-intensive tasks (e.g. Microsoft Exchange) are more suited to an AMD architecture.

So it seems that it really is “horses for courses” – maybe a split resource pool is the answer with one pool for heavily cached applications and another for CPU and memory-intensive applications. What I really hope is that I don’t regret the decision to follow the AMD path in a few months’ time… they used to say that “nobody ever got fired for buying IBM”. These days it seems to be the same story for buying Intel.

The rise of 64-bit Windows server computing

A few months back, when Microsoft announced that many of its forthcoming server products would be 64-bit only, I confessed that I don’t know a lot about 64-bit computing. That all changed last week when I attended two events (the first sponsored by Intel and the second by AMD) where the two main 64-bit platforms were discussed and Microsoft’s decision to follow the 64-bit path suddenly makes a lot of sense.

For some time now, memory has been a constraint on scaling-up server capacity. The 32-bit x86 architecture that we’ve been using for so long now is limited to a maximum of 4GB of addressable memory (2^32 = 4,294,967,296 bytes) – once a huge amount of memory but not so any more. Once you consider that on a Windows system, half of this is reserved for the system address space (1GB if the /3GB switch is used in boot.ini) the remaining private address space for each process starts to look a little tight (even when virtual memory is considered). Even though Windows Server 2003 brought improvements over Windows 2000 in its support for the /3GB switch, not all applications recognise the extra private address space that it provides and the corresponding reduction in the system address space can be problematic for the operating system. Some technologies, like physical address extension (PAE) and address windowing extensions (AWE) can increase the addressable space to 64GB, but for those of us who remember trying to squeeze applications into high memory back in the days of MS-DOS, that all sounds a little bit like emm386.exe!
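The arithmetic behind those limits is easy to check. The following is only a rough sketch in Python: the function and variable names are my own, and the 36-bit figure for PAE is the physical address width that PAE provides on x86 rather than anything stated above.

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)


def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with a flat address of the given width."""
    return 2 ** address_bits


# 32-bit x86: the whole address space, in bytes and in GB.
total_32bit = addressable_bytes(32)
total_gib = total_32bit // GIB

# Windows reserves 2GB for the system by default, or 1GB with /3GB.
default_user_gib = total_gib - 2
with_3gb_switch_gib = total_gib - 1

# PAE widens physical addresses to 36 bits (assumption: standard x86 PAE).
pae_gib = addressable_bytes(36) // GIB

print(f"32-bit address space: {total_32bit:,} bytes ({total_gib}GB)")
print(f"Per-process private space (default): {default_user_gib}GB")
print(f"Per-process private space (/3GB):    {with_3gb_switch_gib}GB")
print(f"Physical memory reachable with PAE:  {pae_gib}GB")
```

Note that PAE only raises the amount of physical memory the system can see; each 32-bit process still has a 4GB virtual address space, which is why AWE is needed for an application to reach beyond it.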

The answer to this problem of memory addressing is 64-bit computing, which allows much more memory to be addressed (as shown in the table below) but unfortunately means that new servers and a switch to a 64-bit operating system are required. For Windows users that means Windows XP Professional x64 Edition on the desktop and either Windows Server 2003 Standard, Enterprise or Datacenter x64 or Itanium Edition at the back end (Windows Server 2003 Standard Edition is not compatible with Intel’s Itanium processor family and Windows Server 2003 Web Edition is 32-bit only).

Memory limits with 32- and 64-bit architectures

It’s not all doom and gloom – for those who have recently invested in new servers, it’s worth noting that certain recent models of Intel’s Xeon and Pentium processors have included a technology called Intel Extended Memory 64 Technology (EM64T), so pretty much any server purchased from a tier 1 vendor (basically HP, IBM or Dell) over the last 12 months will have included 64-bit capabilities (with the exception of blade servers, which are generally still 32-bit for power consumption reasons – at least in the Intel space).

EM64T is effectively Intel’s port of the AMD x64 technology (AMD64, as used in the Opteron CPU line) and probably represents the simplest step towards 64-bit computing as Intel’s Itanium processor family (IPF) line uses a totally separate architecture called explicitly parallel instruction computing (EPIC).

In terms of application compatibility, 32-bit applications can run on a 64-bit Windows platform using a system called Windows on Windows 64 (WOW64). Although this is effectively emulating a 32-bit environment, the increased performance allowed with a 64-bit architecture means that performance is maintained and meanwhile, 64-bit applications on the same server can run natively. Itanium-based servers are an exception – they require an execution layer to convert the 32-bit x86 instructions to EPIC, which can result in some degradation in 32-bit performance (which needs to be offset against Itanium 2’s high performance capabilities, e.g. when used natively as a database server). There are some caveats – Windows requires 64-bit device drivers (for many, hardware support is the main problem for 64-bit adoption) and there is no support for 16-bit applications (that means no MS-DOS support).

There is much discussion about which 64-bit model is “best”. There are many who think AMD are ahead of Intel (and AMD are happy to cite an Information Week article which suggests they are gaining market share), but then there are situations where Itanium 2’s processor architecture allows for high performance computing (cf. the main x64/EM64T advantage of increased memory addressability). Comparing Intel’s Xeon processors with EM64T and Itanium 2 models reveals that the Itanium supports 1024TB of external memory (cf. 1TB for the EM64T x64 implementation) as well as additional cache and general purpose registers but the main advantages with Itanium relate to the support for parallel processing. With a traditional architecture, source code is compiled to create sequential machine code and any parallel processing is reliant upon the best efforts of the operating system. EPIC compilers for the Itanium architecture produce machine code that is already optimised for parallel processing.

Whatever the processor model, both Intel and AMD’s marketing slides agree that CPU speed is no longer the bottleneck in improving server performance. A server is a complex system and performance is constrained by the component with the lowest performance. Identifying the bottleneck is an issue of memory bandwidth (how fast data can be moved to and from memory), memory latency (how fast data can be accessed), memory addressability (how much data can be accessed) and I/O performance (how much data must be moved). That’s why processor and server manufacturers also provide supporting technologies designed to increase performance – areas such as demand based power management, systems management, network throughput, memory management (the more memory that is added, the less reliable it is – hence the availability of RAID and hot-swap memory technologies), expansion (e.g. PCI Express) and virtualisation.

Where AMD’s x64 processors shine is that their architecture is more efficient. Traditional front-side bus (FSB) computers rely on a component known as the northbridge. I/O and memory access share the same front side bus affecting memory bandwidth. Memory access is delayed because it must pass through the northbridge (memory latency). I/O performance is restricted due to bandwidth bottlenecks accessing attached devices. Additionally, in a multiprocessor system, each core competes for access to the same FSB, making the memory bandwidth issue worse.

Non-uniform memory access (NUMA) computers place the memory with the processor cores, avoiding the need to access memory via a northbridge. AMD’s Opteron processor allows a total address space of up to 256TB, with each CPU having up to 6.4GB/s of dedicated memory bandwidth, a local memory latency of ~65ns (similar to L3 cache performance) and three HyperTransport connections, each allowing 8GB/s of I/O bandwidth.
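To illustrate why this matters as sockets are added, here is a deliberately simplified Python sketch. It is a toy model of my own, not AMD’s or Intel’s figures: it assumes a shared front-side bus is split evenly between sockets, the bus bandwidth value is a made-up placeholder, and bandwidth is expressed in whatever unit the 6.4 per-CPU figure above is quoted in. It also confirms that a 256TB address space corresponds to 48 address bits.

```python
def shared_bus_share(bus_bandwidth: float, sockets: int) -> float:
    """Toy model: sockets on one front-side bus split its bandwidth evenly."""
    return bus_bandwidth / sockets


def numa_aggregate(local_bandwidth: float, sockets: int) -> float:
    """Toy model: each socket brings its own memory controller and bandwidth."""
    return local_bandwidth * sockets


LOCAL_BANDWIDTH = 6.4        # per-CPU dedicated memory bandwidth quoted above
ASSUMED_BUS_BANDWIDTH = 6.4  # hypothetical FSB figure, for comparison only

for sockets in (1, 2, 4):
    per_socket = shared_bus_share(ASSUMED_BUS_BANDWIDTH, sockets)
    aggregate = numa_aggregate(LOCAL_BANDWIDTH, sockets)
    print(f"{sockets} socket(s): shared bus gives {per_socket:.1f} each, "
          f"NUMA gives {aggregate:.1f} in total")

# A 48-bit address reaches 2^48 bytes, i.e. 256TB (binary terabytes).
TIB = 2 ** 40
opteron_address_space_tib = 2 ** 48 // TIB
print(f"48-bit address space: {opteron_address_space_tib}TB")
```

Real systems are messier (remote memory access on a NUMA machine costs extra latency over HyperTransport, and a shared bus is not divided perfectly evenly) but the scaling trend is the point.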

Scaling out to multiple processors is straightforward with direct connections between processors using the 8GB/s HyperTransport (some OEMs, e.g. Fujitsu-Siemens, use this to allow two dual-CPU blade servers to be connected making a 4 CPU blade system).

Adding a second processor core results in higher performance and better throughput (higher density, lower latency and higher overall performance) but with equal power consumption. The power consumption issue is an important one, with data centre placement increasingly being related to sufficient power capacity and high density rack-based and blade servers leading to cooling issues.

AMD’s figures suggest that switching from a dual-core 4-way Intel Xeon processor system (8 cores in total) to a similar AMD Opteron system results in a drop from 740W to 380W of power [source: AMD]. With 116,439 4-way servers shipped in Western Europe alone during 2005 [source: IDC], that represents a potential saving of 41.9MW across a single year’s shipments.
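For anyone who wants to check the sums, here is the calculation as a short Python sketch. The wattages and shipment figure are the ones quoted above; the annual energy figure at the end is my own extrapolation and assumes every server runs flat out, 24 hours a day, all year, which is unlikely to be true in practice.

```python
XEON_4WAY_WATTS = 740            # [source: AMD]
OPTERON_4WAY_WATTS = 380         # [source: AMD]
SERVERS_SHIPPED_2005 = 116_439   # 4-way servers, Western Europe [source: IDC]
HOURS_PER_YEAR = 24 * 365        # assumption: continuous operation

# Power saved per server, then across every server shipped in the year.
saving_per_server_watts = XEON_4WAY_WATTS - OPTERON_4WAY_WATTS
total_saving_watts = saving_per_server_watts * SERVERS_SHIPPED_2005
total_saving_mw = total_saving_watts / 1_000_000

# Energy saved over a year if all of those servers ran continuously.
annual_energy_mwh = total_saving_mw * HOURS_PER_YEAR

print(f"Saving per server: {saving_per_server_watts}W")
print(f"Total power saving: {total_saving_mw:.1f}MW")
print(f"Energy over a year (continuous use): {annual_energy_mwh:,.0f}MWh")
```

The megawatt figure is a measure of power rather than energy, which is why the sketch separates the two.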

In summary, 64-bit hardware has been shipping for a while now. Both Intel and AMD admit that there is more to processor performance than clock speed, and AMD’s direct connect architecture removes some of the bottlenecks associated with the traditional x86 architecture as well as reducing power consumption (and hence heat production). The transition to 64-bit software has also begun, with x64 servers (Opteron and EM64T) providing flexibility in the migration from a 32-bit operating system and associated applications.