Windows Server 2008 R2 Hyper-V crash turns out to be an Intel driver issue

A few weeks ago, I rebuilt a recently decommissioned server to run as an infrastructure test and development rig at home.  I installed Windows Server 2008 R2, enabled the Hyper-V role and all was good until I started to configure my networks, during which I experienced a “blue screen of death” (BSOD) – never a good thing on your virtualisation host, especially when it does the same thing again on reboot:

“Oh dear, my freshly built Windows Server 2008 R2 machine has just thrown 3 BSODs in a row… after running normally for an hour or so :-(”

The server is a Dell PowerEdge 840 (a small, workgroup server that I bought a couple of years ago) with 8GB RAM and a quad core Xeon CPU.  The hardware is nothing special – but fine for my infrastructure testing – and it had been running with Windows Server 2008 Hyper-V since new (with no issues) but this was the first time I’d tried R2. 

I have 3 network adapters in the server: a built in Broadcom NetXtreme Gigabit card (which I’ve reserved for remote access); and 2 Intel PRO/100s (for VM workloads).  Ideally I’d use Gigabit Ethernet cards for the VM workload too, but this is only my home network and they were what I had available!

Trying to find out the cause of the problem, I ran WhoCrashed, which gave me the following information:

This was likely caused by the following module: efe5b32e.sys
Bugcheck code: 0xD1 (0x0, 0x2, 0x0, 0xFFFFF88002C4A3F1)
Error: DRIVER_IRQL_NOT_LESS_OR_EQUAL
Dump file: C:\Windows\Minidump\020410-15397-01.dmp
file path: C:\Windows\system32\drivers\efe5b32e.sys
product: Intel(R) PRO/100 Adapter
company: Intel Corporation
description: Intel(R) PRO/100 Adapter NDIS 5.1 driver
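For anyone wanting to read the bugcheck line rather than just the module name, here is a minimal Python sketch that splits it into its code and four parameters. The parameter labels are the ones Microsoft documents for bug check 0xD1; the function names (parse_bugcheck, describe_d1) are made up for this illustration and are not part of WhoCrashed.

```python
import re

# Parameter meanings for bug check 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL),
# in the order they appear inside the parentheses.
D1_PARAMETERS = (
    "memory address referenced",
    "IRQL at time of reference",
    "access type (0 = read, 1 = write)",
    "address of the code that referenced the memory",
)


def parse_bugcheck(line):
    """Split a 'Bugcheck code: 0x.. (a, b, c, d)' line into (code, [params])."""
    match = re.search(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)", line)
    if match is None:
        raise ValueError("not a bugcheck line: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe_d1(line):
    """Label the four parameters of a 0xD1 bugcheck line."""
    code, params = parse_bugcheck(line)
    if code != 0xD1:
        raise ValueError("expected bug check 0xD1, got 0x%X" % code)
    return dict(zip(D1_PARAMETERS, params))


if __name__ == "__main__":
    sample = "Bugcheck code: 0xD1 (0x0, 0x2, 0x0, 0xFFFFF88002C4A3F1)"
    for label, value in describe_d1(sample).items():
        print("%-48s 0x%X" % (label, value))
```

Read this way, the line above describes a read of address 0x0 at IRQL 2 (DISPATCH_LEVEL), which is at least consistent with a driver touching an invalid pointer.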

That confirmed that the issue was with the Intel NIC driver, which sounded right as, after enabling the Hyper-V role, I connected an Ethernet cable to one of the Intel NICs and got a BSOD each time the server came up. If I disconnected the cable, no BSOD.  Back to the twitters:

“Does anyone know of any problems with Intel NICs and Hyper-V R2 (that might cause a BSOD)?”

I switched the in-box (Microsoft) drivers for some (older) Intel ones.  That didn’t fix things, so I switched back to the latest drivers.  Eventually I found that the issue was caused by the checkbox for “Allow management operating system to share this network adapter” and that, if the NIC was live and I selected this option, I could reproduce the error:

“Found the source of yesterday’s WS08R2 Hyper-V crash… any idea why enabling this option http://twitpic.com/11b64y would trip a BSOD?”

Even though I could work around the issue (because I don’t want to share a NIC between the parent partition and the children anyway – I have the Broadcom NIC for remote access) it seemed strange that this behaviour should occur.  There was no NIC teaming involved and the server was still a straightforward UK installation (aside from enabling Hyper-V and setting up virtual networks). 

Based on suggestions from other Virtual Machine MVPs I also:

  • Flashed the NICs to the latest release of the Intel Boot Agent (these cards don’t have a BIOS).
  • Updated the Broadcom NIC to the latest drivers too.
  • Attempted to turn off jumbo frames, but the option was not available in the adapter properties, so I could rule that out.

Thankfully, @stufox (from Microsoft in New Zealand) saw my tweets and was kind enough to step in to offer assistance.  It took us a few days, thanks to timezone differences and my work schedule, but we got there in the end.

First up, I sent Stu a minidump from the crash, which he worked on with one of the Windows Server kernel developers. They suggested running the driver verifier (verifier.exe) against the various physical network adapters (and against vmswitch.sys).  More details of this tool can be found in Microsoft knowledge base article 244617; the response to the verifier /query command was as follows:

09/02/2010, 23:19:33
Level: 000009BB
RaiseIrqls: 0
AcquireSpinLocks: 44317
SynchronizeExecutions: 2
AllocationsAttempted: 152850
AllocationsSucceeded: 152850
AllocationsSucceededSpecialPool: 152850
AllocationsWithNoTag: 0
AllocationsFailed: 0
AllocationsFailedDeliberately: 0
Trims: 41047
UnTrackedPool: 141544
 
Verified drivers:
 
Name: efe5b32e.sys, loads: 1, unloads: 0
CurrentPagedPoolAllocations: 0
CurrentNonPagedPoolAllocations: 0
PeakPagedPoolAllocations: 0
PeakNonPagedPoolAllocations: 0
PagedPoolUsageInBytes: 0
NonPagedPoolUsageInBytes: 0
PeakPagedPoolUsageInBytes: 0
PeakNonPagedPoolUsageInBytes: 0
 
Name: ndis.sys, loads: 1, unloads: 0
CurrentPagedPoolAllocations: 6
CurrentNonPagedPoolAllocations: 1926
PeakPagedPoolAllocations: 8
PeakNonPagedPoolAllocations: 1928
PagedPoolUsageInBytes: 984
NonPagedPoolUsageInBytes: 1381456
PeakPagedPoolUsageInBytes: 1296
PeakNonPagedPoolUsageInBytes: 1381968
 
Name: b57nd60a.sys, loads: 1, unloads: 0
CurrentPagedPoolAllocations: 0
CurrentNonPagedPoolAllocations: 3
PeakPagedPoolAllocations: 0
PeakNonPagedPoolAllocations: 3
PagedPoolUsageInBytes: 0
NonPagedPoolUsageInBytes: 188448
PeakPagedPoolUsageInBytes: 0
PeakNonPagedPoolUsageInBytes: 188448
 
Name: vmswitch.sys, loads: 1, unloads: 0
CurrentPagedPoolAllocations: 1
CurrentNonPagedPoolAllocations: 18
PeakPagedPoolAllocations: 2
PeakNonPagedPoolAllocations: 24
PagedPoolUsageInBytes: 108
NonPagedPoolUsageInBytes: 50352
PeakPagedPoolUsageInBytes: 632
PeakNonPagedPoolUsageInBytes: 54464
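A listing like this is easier to compare between runs once it is in a structured form. The following is a rough Python sketch that turns the text into one dictionary of global counters and one dictionary per driver. It assumes only the layout shown above ("Key: value" lines, with each driver section starting at a "Name:" line); parse_verifier_query is a hypothetical helper, not something that ships with verifier.exe.

```python
def parse_verifier_query(text):
    """Return (global_counters, per_driver_counters) from 'verifier /query' text."""
    global_counters = {}
    drivers = {}
    current = global_counters  # counters before the first "Name:" are global
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            # "Name: ndis.sys, loads: 1, unloads: 0" opens a new driver section.
            name = value.split(",")[0].strip()
            current = drivers.setdefault(name, {})
        elif key == "Level":
            # The verification level is printed as a hexadecimal flag value.
            current[key] = int(value, 16)
        elif value.isdigit():
            current[key] = int(value)
        # Anything else (timestamp, "Verified drivers:" heading) is ignored.
    return global_counters, drivers


if __name__ == "__main__":
    sample = """\
Level: 000009BB
Trims: 41047

Verified drivers:

Name: efe5b32e.sys, loads: 1, unloads: 0
CurrentNonPagedPoolAllocations: 0

Name: ndis.sys, loads: 1, unloads: 0
CurrentNonPagedPoolAllocations: 1926
"""
    counters, per_driver = parse_verifier_query(sample)
    for name, stats in per_driver.items():
        print(name, stats)
```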

To be honest, I haven’t a clue what half of that means but the guys at Microsoft did – and they also asked me for a kernel dump (Dirk A D Smith has written an article at Network World that gives a good description of the various types of memory dump: minidump; kernel; and full). Transmitting this file caused some issues (it was 256MB in size – too big for e-mail) but it compressed well, and 7-zip allowed me to split it into chunks to get under the 50MB file size limit on Windows Live SkyDrive.  Using this, Stu and his kernel developer colleagues were able to see that there is a bug in the Intel driver I’m using. It turns out there is another workaround too: turning off Large Send Offload in the network adapter properties.  Since I did this, the server has run without a hiccup (as I would have expected).
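Splitting a large dump to fit under a per-file upload limit doesn't strictly need 7-zip. As an illustration only (this is not what was used here, and it does no compression), a minimal Python sketch of the same idea might look like this; split_file and join_files are hypothetical names and the chunk size is whatever limit applies.

```python
import os


def split_file(path, chunk_bytes, out_dir=None):
    """Write `path` as numbered chunks of at most `chunk_bytes`; return their paths."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    out_dir = out_dir or os.path.dirname(os.path.abspath(path))
    base = os.path.basename(path)
    parts = []
    with open(path, "rb") as src:
        index = 1
        while True:
            # Each chunk is held in memory once, so keep chunk_bytes modest.
            data = src.read(chunk_bytes)
            if not data:
                break
            part = os.path.join(out_dir, "%s.%03d" % (base, index))
            with open(part, "wb") as dst:
                dst.write(data)
            parts.append(part)
            index += 1
    return parts


def join_files(parts, dest):
    """Concatenate the chunks, in the order given, back into one file."""
    with open(dest, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                dst.write(src.read())
```

The recipient has to reassemble the parts in order, so numbered suffixes (.001, .002 and so on) keep that unambiguous.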

“Thanks to @stufox for helping me fix the BSOD on my Hyper-V R2 server. Turned out to be an Intel device driver issue – I will blog details”

It’s good to know that Hyper-V was not at fault here: sure, it shows that a rogue device driver can bring down a Windows system but that’s hardly breaking news – the good thing about the Hyper-V architecture is that I can easily update network device drivers.  And, let’s face it, I was running enterprise-class software on a workgroup server with some old, unsupported, hardware – you could say that I was asking for trouble…

Why Windows Server User Group meetings are a bit like London buses

There’s a saying in the UK when multiples of something come along at the same time… “like London buses… nothing and then three at once” (based on the principle of bunching, for high frequency services – incidentally, that’s an alien concept where I live – we’re lucky if the bus shows up at all…).

Anyway, back to the point – hot on the heels of the Windows Server User Group (WSUG) meeting with Joey Snow, Mark Parris has arranged a second meeting to coincide with the Microsoft UK TechDays – this time it’s on 13 April 2010 at Microsoft’s offices in London (map and directions) and the speaker will be Dan Pearson, from David Solomon’s Expert Seminars.

Dan was formerly a Senior Escalation Lead at Microsoft, where he worked in the Windows Base OS team supporting Microsoft customers. Dan will be talking about Windows crash dump analysis as well as Windows performance troubleshooting and analysis.

Check out the event registration site for more details.

User group meeting (Windows Server User Group)

A few weeks ago, I blogged about Microsoft’s UK TechDays and mentioned that the Windows Server User Group was planning to run an evening event that week.

Now we’re ready to announce the details: whilst he was at the MVP Summit last month, Mark Parris managed to persuade Joey Snow to come along and speak to us on the evening of 12 April 2010 at Microsoft’s offices in London (map and directions).  Joey is a technical evangelist for the Worldwide Developer and Platform Evangelism team at Microsoft, focusing on Windows Server, IIS and SQL Server, and he’ll be presenting on Windows Server 2008 R2’s new BranchCache functionality as well as migrating server roles to Windows Server 2008 R2.

Check out the event registration site for more details.

Distributing camera raw files along with their development history from Adobe Lightroom

I’ve written previously about how Adobe’s photo management applications such as Bridge and Lightroom use sidecar (.XMP) files to store details of raw file edits without affecting the original image (and how that doesn’t quite work for JPEG or TIFF images).  On my system though, I found that there were no .XMP files because I had been storing the history inside my Lightroom catalog (I’ve since adjusted the catalog settings to automatically write changes into XMP).  It’s easy enough to generate an extensible metadata platform (.XMP) file for an image, either by exporting the image and selecting Original as the format in the file settings (this will save the .XMP file alongside the raw image) or by selecting Save Metadata to File from the Metadata menu.  Either way, the resulting .XMP will be available for use in other applications (e.g. Bridge) and can be distributed with the raw image file if further processing is to be carried out on another computer.
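Before handing a folder of raw files to someone else, it can be worth checking that every image actually has its sidecar next to it. Here is a small Python sketch of that check. It assumes the sidecar shares the raw file's name with an .xmp extension (which is how the files sit alongside each other above); the list of raw extensions and the function name are illustrative only.

```python
from pathlib import Path

# Assumption: only these raw formats are of interest; extend as needed.
RAW_EXTENSIONS = {".nef", ".cr2"}


def raws_missing_sidecar(folder):
    """Return raw files in `folder` that have no same-named .xmp beside them."""
    folder = Path(folder)
    entries = [p for p in folder.iterdir() if p.is_file()]
    # Compare names case-insensitively so image.NEF matches image.xmp or image.XMP.
    names = {p.name.lower() for p in entries}
    missing = []
    for path in sorted(entries):
        if path.suffix.lower() in RAW_EXTENSIONS:
            sidecar = (path.stem + ".xmp").lower()
            if sidecar not in names:
                missing.append(path)
    return missing


if __name__ == "__main__":
    for raw in raws_missing_sidecar("."):
        print("no sidecar for", raw.name)
```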

Useful Links February 2010

A list of items I’ve come across recently that I found potentially useful, interesting, or just plain funny:

Cleaning my DSLR’s sensor… the quick (and inexpensive) way

Right now, I’m attending a photography workshop in North Wales, learning a bit more about digital photographic imaging. It’s been a good experience so far but, yesterday afternoon, I experienced a small disaster as not only dust but a tiny hair had appeared on all of the images I took, indicating that I had some sort of debris on my sensor (actually, it’s on the anti-aliasing filter, not the sensor but that’s being pedantic…).

Being in the middle of the Snowdonia National Park (albeit in overcast/wet weather) and on a course where I would take a lot of photos, this was not exactly welcome and I feared I’d need a costly professional sensor clean (after a weekend of creating images with hair on them). No-one in the class had any sensor cleaning swabs (not that I’ve ever used them, and I would have been a little nervous too on my still-in-warranty Nikon D700) but, luckily, one of the guys passed me an air blower and said “try this – but make sure you hold the camera body face down as you use it!”.

With the mirror locked up, I puffed some air around inside the body (it’s important not to use compressed air for this) and took a reference image – thankfully the debris was gone (and, because the front of the camera was facing down, it should have fallen out, not gone further back into the camera).

I breathed a big sigh of relief and thanked my fellow classmate. In just over a week it’s the Focus on Imaging exhibition – hopefully I’ll get along to it and one of the items on my shopping list will be a Giottos Rocket Air Blower.

Backing up and restoring Adobe Lightroom 2.x on a Mac

Over the last few days, I’ve been rebuilding the MacBook that I use for all my digital photography (which is a pretty risky thing to do immediately before heading off on a photography workshop) and one of the things I was pretty concerned about was backing up and restoring my Adobe Lightroom settings as these are at the heart of my workflow.

I store my images in two places (Lightroom backs them up to one of my Netgear ReadyNAS devices on import) and, on this occasion I’d also made two extra backups (I really should organise one in the cloud too, although syncing 130GB of images could take some time…).

I also backup the Lightroom catalog each time Lightroom runs (unfortunately the only option is to do this at startup, not shutdown), so that handles all of my keywords, develop settings, etc. What I needed to know was how to backup my preferences and presets – and how to restore everything.

It’s actually quite straightforward – this is how it worked for me – of course, I take no responsibility for anyone else’s backups and, as they say, your mileage may vary.  Also, PC users will find the process similar, but the file locations change:

I also made sure that the backups and restores were done at the same release (v2.3) but, once I was sure everything was working, I updated to the latest version (v2.6).

Checking if a computer supports Intel vPro/Active Management Technology (AMT)

One of my many activities over the last few days has been taking a look at whether my work notebook PC supports the Intel vPro/Active Management Technology (AMT) functionality (it doesn’t seem to).

Intel vPro/AMT adds out of band management capabilities to PC hardware, integrated into the CPU, chipset and network card (this animation shows more details) and is also a pre-requisite for Citrix XenClient which, at least until Microsoft gets itself in order with a decent client-side virtualisation solution, I was hoping to use as a solution for running multiple desktops on a single PC.  Sadly I don’t seem to have the necessary hardware.

Anyway, thanks to a very useful forum post by Amit Kulkarni, I found that there is a tool to check for the presence of AMT – in the AMT software development kit (SDK) is a discovery tool (discovery.exe), which can be used to scan the network for AMT devices.
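I can't reproduce the SDK's discovery tool here, but a cruder check is possible from any machine on the network. The sketch below is a different technique, not what discovery.exe does: it simply tries a TCP connection to ports 16992 and 16993, which AMT uses for its HTTP and HTTPS management interfaces. An open port only suggests AMT is present and provisioned, a closed one proves nothing, and the host address in the example is a placeholder.

```python
import socket

# Ports used by Intel AMT's web/management interface (HTTP and HTTPS).
AMT_PORTS = (16992, 16993)


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_amt(host, timeout=1.0):
    """Map each AMT port to whether it accepted a connection."""
    return {port: port_is_open(host, port, timeout) for port in AMT_PORTS}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; substitute the PC to be checked.
    print(probe_amt("192.0.2.10"))
```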

Unfortunately, vPro/AMT only seems to be in the high-spec models for most OEMs right now… until that changes I’m stuck with hosted virtualisation solutions.

Removing crapware from my Mac

Over the last couple of days, I’ve been rebuilding my MacBook after an increasing number of “spinning beachballs of death” (the Mac equivalent of a Windows hourglass/doughnut/halo…).  Unfortunately, it’s not just PCs that come supplied with “crapware” – it may only be a couple of items but my OS X 10.5 installation also includes the Office for Mac 2004 Test Drive and iWork ’08 Trial.  As it happens, I do have a copy of Office for Mac 2008 but I don’t need it on this PC – indeed the whole reason for wiping clean and starting again was to have a lean, clean system for my photography, with the minimum of unnecessary clutter.

“What’s the problem?”, I hear you say, “isn’t uninstalling an application on a Mac as simple as dragging it to the trash?”  Well, in a word: no. Some apps for OS X are that simple to remove but many leave behind application support and preference files.  Some OS X apps have installers, just as on Windows PCs.

I ran the Remove Office application to remove the Office for Mac Test Drive and, after searching for installed copies of Office, it decided there were none, leaving a Remove Office log.txt file on the desktop with the details of its search:

***************************
Found these items:
OFC2004_TD_FOLDERS: /Applications/Office 2004 for Mac Test Drive

It seems that, if you’ve not attempted to run any of the Test Drive apps (e.g. by opening an Office document), they are not actually installed.  Diane Ross has more details on her blog post on the subject but, basically, it’s safe to drag the Test Drive files and folders to the trash.

With Office for Mac out of the way, I turned my attention to the iWork ’08 Trial.  This does not have an uninstaller – the application files and folders for Keynote, Numbers and Pages can be dragged to the trash but there is another consideration – there are some iWork ’08 application support files in /Library/Application Support/ that may be removed too.

These resources might not be taking much space on my disk, but I don’t like the idea of remnants of an application hanging around – a clean system is a reliable system.  At least, that’s my experience on Windows and it shouldn’t be any different on a Mac.
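Hunting for leftovers by hand gets tedious, so here is a minimal Python sketch that lists (and deliberately does not delete) anything whose name contains a keyword in a set of folders. Only /Library/Application Support/ comes from the steps above; the keyword and the find_leftovers name are illustrative, and any other folders you add are your own assumption.

```python
from pathlib import Path


def find_leftovers(keyword, roots):
    """List entries directly under each root whose name contains `keyword`."""
    keyword = keyword.lower()
    hits = []
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # skip folders that do not exist on this machine
        for entry in sorted(root.iterdir()):
            if keyword in entry.name.lower():
                hits.append(entry)
    return hits


if __name__ == "__main__":
    # Review the output before dragging anything to the trash.
    for path in find_leftovers("iwork", ["/Library/Application Support"]):
        print(path)
```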

Reading EXIF data to find out the number of shutter activations on a Nikon DSLR

A few years ago, I wrote about some digital photography utilities that I use on my Mac.  These days most of my post-processing is handled by Adobe Lightroom (which includes Adobe Camera Raw), with a bit of Photoshop CS4 (using plugins like Noise Ninja) for the high-end stuff but these tools still come in useful from time to time.  Unfortunately, Simple EXIF Viewer doesn’t work with Nikon raw images (.NEF files) and so it’s less useful to me than it once was.

Recently, I bought my wife a DSLR and, as I’m a Nikon user (I have a D700), it made sense that her body should fit my lenses so I picked up a Nikon refurbished D40 kit from London Camera Exchange.  Whilst the body looked new, I wanted to know how many times the shutter had been activated (DSLR shutter mechanisms have a limited life – about 50,000 for the D40) and the D40’s firmware won’t display this information – although it is captured in the EXIF data for each image.

After some googling, I found a link to Phil Harvey’s ExifTool, a platform independent library with a command line interface for accessing EXIF data in a variety of image formats. A few seconds later and I had run the exiftool -nikon dsc_0001.nef command (exiftool --? gives help) on a test image and it told me a perfectly respectable shutter count of 67.  For reference, I tried a similar command on some images from my late father’s Canon EOS 1000D but shutter count was not one of the available metrics – even so, ExifTool provides a wealth of information from a variety of image formats.
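To check a batch of images without reading through the full tag dump, ExifTool can be asked for a single tag and the result picked out in a script. This Python sketch is one way to do that. It assumes exiftool is on the PATH and requests the ShutterCount tag directly rather than using the command shown above; both function names are hypothetical.

```python
import subprocess


def parse_shutter_count(output):
    """Pull the number out of a line such as 'Shutter Count     : 67'."""
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "shutter count":
            return int(value.strip())
    return None  # tag not present (as with the Canon images mentioned above)


def read_shutter_count(image_path, exiftool="exiftool"):
    """Run ExifTool on one image and return its shutter count, or None."""
    result = subprocess.run(
        [exiftool, "-ShutterCount", image_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_shutter_count(result.stdout)


if __name__ == "__main__":
    print(read_shutter_count("dsc_0001.nef"))
```

Keeping the parsing separate from the subprocess call means the text handling can be tried out without ExifTool installed.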