Tag: Storage

Struggling with SATA

One of my PCs includes a Serial ATA (SATA) controller (Silicon Image SiI3112A SATALink – BIOS v4.2.83 and 32-bit Windows driver v1.3.68.0) together with a Seagate ST3500641AS (500GB SATA) disk. Both these devices were added in preparation for installing Windows Home Server (so I haven’t tried them with any other operating system, although I suspect the results would be similar) and I’ve been having trouble with the system’s stability – suffering occasional crashes (sometimes followed by an inability to find the disk) and frequently seeing the following errors in the event log:

Event Type: Error
Event Source: si3112
Event Category: None
Event ID: 9
Date: 13/05/2007
Time: 12:22:25
User: N/A
Computer: servername
Description:
The device, \Device\Scsi\si31121, did not respond within the timeout period.

Event Type: Error
Event Source: Disk
Event Category: None
Event ID: 11
Date: 13/05/2007
Time: 13:54:00
User: N/A
Computer: servername
Description:
The driver detected a controller error on \Device\Harddisk0.

The first message doesn’t mean much but following the link from Event Viewer to the Windows Help and Support Center indicated that the disk event ID 11 means IO_ERR_CONTROLLER_ERROR and can be caused by a loose cable. The controller card (bought last week) was supplied with a power cable but not an interface (data) cable, so I bought one at Maplin for £4.99. When I got home I found that the data cable connector housing made the connection too tight against the power cable, making it a slightly incorrect fit (although probably good enough). Armed with this new advice, I set off to buy another cable – this time for £2.99 from a local computer services company… a perfect fit, with a latching connection and less expensive (that’s why it pays to shop locally!). Unfortunately though, this new cable didn’t resolve my disk errors.

Googling the error messages hadn’t turned up much; however searching for the disk model number told me that my disk is actually 3Gbps-capable and that, even though SATA/300 devices should be compatible with SATA/150 controllers, there can be issues with legacy controllers when a technology called spread spectrum clocking (SSC) is enabled. Seagate supplies a utility to enable/disable SSC on their SATA drives but it won’t run under Windows, so I created an MS-DOS 6.22 boot floppy disk (thanks to bootdisk.com) and ran the utility from MS-DOS. As it happens, SSC was already disabled on my disk but it was worth checking out. Another potential issue is the autonegotiation between SATA/300 and SATA/150 and, following the Seagate SATA troubleshooter, I found this advice:

“Some older 1.5Gbits/sec SATA cards do not support auto negotiation with newer 3.0Gbits/sec drives… Seagate Barracuda 3.0Gbit/sec drives can be forced to 1.5Gbits/sec to allow support with these older SATA cards.

To force the Seagate Barracuda 7200.9 drive to 1.5Gbits/sec mode, apply a jumper to the outer most pins of the jumper block…

This jumper block uses a 2mm jumper. This is the smaller of the standard jumper sizes.”

Seagate knowledge base article 3116

After digging around in my “box of PC bits and bobs”, I found a suitable jumper and applied it; however I followed the diagram in Seagate knowledge base article 2850 (which relates to certain Maxtor SATA drives):

Instead of this subtly different one (which I found afterwards in the ST3500641AS Product Manual):

After having applied the jumper to the wrong pins, there were no more disk event ID 11 errors and, as it seems that those pins are for factory use only, I have no idea what they meant; however, after a few hours, I saw the si3112 event ID 9 errors return, so I decided to switch the jumper to the location in the second diagram. I won’t go into the details of what happened next; suffice to say it resulted in a blue screen of death, followed by a hard disk that no longer spun up and a warranty call… oops!

After receiving a replacement disk, I rebuilt the system (without any jumpers on the hard drive) and confirmed that the errors still occurred with a new disk (ruling out a faulty component as the cause). Then, I shut down the system (always a good idea before performing hardware maintenance) and fitted the jumper to the outermost two pins. Since powering on the computer, there have been no errors, so (fingers crossed), it looks as though the problem was down to a SATA/300 drive and a SATA/150 controller.
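
To check whether a change like this really has stopped the errors, it can help to tally the logged events by source and event ID. The following is a minimal Python sketch, assuming the events have been copied out of Event Viewer as plain text in the same "Field: value" layout as the entries quoted earlier in this post. The function name `tally_events` and the `SAMPLE` text are my own illustration and not part of any Windows tool.

```python
from collections import Counter


def tally_events(text):
    """Count events by (source, event ID) in Event Viewer-style text.

    Assumes each event starts with an "Event Type:" line and is made up
    of "Field: value" lines, as in the entries quoted in this post.
    """
    counts = Counter()
    current = {}

    def flush():
        # Record the event collected so far, if it has the fields we need.
        if "Event Source" in current and "Event ID" in current:
            counts[(current["Event Source"], int(current["Event ID"]))] += 1

    for line in text.splitlines():
        field, sep, value = line.partition(":")
        if not sep:
            continue  # description text or a blank line
        field, value = field.strip(), value.strip()
        if field == "Event Type":
            flush()       # a new event begins, so store the previous one
            current = {}
        current[field] = value
    flush()  # store the final event
    return counts


# Illustrative sample only, trimmed from the log entries shown in this post.
SAMPLE = """\
Event Type: Error
Event Source: si3112
Event ID: 9
Description:
The device, \\Device\\Scsi\\si31121, did not respond within the timeout period.

Event Type: Error
Event Source: Disk
Event ID: 11
Description:
The driver detected a controller error on \\Device\\Harddisk0.
"""

for (source, event_id), count in sorted(tally_events(SAMPLE).items()):
    print(f"{source} event ID {event_id}: {count}")
```

Running the same tally on logs captured before and after fitting the jumper would give a rough before/after comparison, although a quiet log for a few hours is not proof that the problem has gone.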

I’ve since come across a low-cost SATA controller with an eSATA port, based on a VIA VT6421A chipset (which could actually provide me with some more flexibility – and I can still return the first controller for a refund); however, having got a working driver and hardware combination, I’m reluctant to switch to another chipset (and another set of problems)… maybe that’s something to consider if I experience any more problems later.

Wednesday 16 May 2007
Recovering data after destroying the Mac OS X partition table
I’m not a religious man but every once in a while I do something stupid and find myself hoping for some divine intervention. Yesterday, I excelled in my stupidity with what was probably the single most careless thing that I have ever done in my entire computing life, accidentally re-initialising my external hard disk (containing, amongst other things, my iTunes library and irreplaceable digital photos of my children) and the backup disk.

In a mild state of panic, I called my friend Alex who gave me two excellent pieces of advice:
- Do nothing with the corrupted disks. Sit tight. Calm down. Wait and see what turns up from researching similar scenarios on the ‘net.
- Submit a post to some of the Mac forums (Mac OS X Hints, Apple Discussions, Mac Geekery) and see if anyone can recommend a suitable course of action.
Thank you Alex.

And thank you Stanley Horwitz, debaser626, Tom Larkin and Joe VanZandt for coming back to me with recommendations almost straightaway. Almost everyone suggested running a tool from Prosoft Engineering called Data Rescue II.

In addition to its primary role of recovering lost data on hard disks, this $99 utility (a small price in comparison to professional data recovery fees) has two especially important features: it is non-destructive as all restoration has to be to another volume; and it can be run in demo mode first to check that data is recoverable before having to be registered.

A quick scan turned up no recoverable files but a thorough scan was more useful. After a few hours reading 6 billion disk blocks and another couple analysing the data, it found my files. Unfortunately the progress bar gives no indication as to how many files might be recoverable whilst the scan is taking place (presumably because files may be spread across many disk blocks), but it found my files!

The recovered files were marked as orphans, CBR (whatever that is) and then a whole load of them actually had their original file names and other metadata. After successfully recovering a single file, I bought a license and set about recovering the entire contents of the disk to another volume. Unfortunately it hung after failing to read one of the files but I repeated the operation (this time saving my scan results so that I can exit and relaunch the application if necessary) and successfully restored my digital photos. The relief is immense and I’m presently running a full restoration of the entire disk contents (I imagine that a large part of tomorrow will be spent working out which files I need, and which were actually deliberately deleted files recovered along with the lost data).

Other potentially useful tools, which I didn’t try but which might be useful to others, include:
- GRC Spinrite – for proactive hard disk maintenance and recovery of data from failing hard disks (disk recovery – not partition recovery).
- Alsoft DiskWarrior – for damaged directory structures (file system recovery – not partition recovery).
- SubRosaSoft File Salvage – another partition recovery tool.
Note that I haven’t tried any of these tools myself – I’m simply alerting any poor soul who stumbles across this page to their existence.

I was lucky. Very lucky.

The moral of this story – don’t rely on a single backup that is overwritten nightly and permanently connected to your computer. I really must take more frequent DVD backups of crucial files and store a disk backup offsite.

I should know better.
Thursday 10 May 2007
Creating a FAT32 volume in excess of 32GB

A few months back I wrote about some of the issues I was having with using FAT32-formatted disks for data transfer between Windows, Mac OS X (and Linux) PCs, because although FAT32 supports file systems up to 2TB in size, the format utilities within Windows support a maximum partition size of 32GB and FAT32 only supports files up to 4GB (which doesn’t sound like an issue until you start copying .ISO DVD images and digital video files around).

Even though I use MacDrive for reading OS X disks on Windows XP, I still find it useful to have a FAT32 disk to back up the VMware Server virtual machine which I use to run Windows XP on a Linux notebook PC for my daily work. I did find a great utility a few weeks back for reading ext3 disks on Windows (I think it was Explore2fs), but it’s the universal acceptance of FAT32 that makes it so easy to use everywhere. The trouble is that my virtual machine is about 31GB in size and growing – consequently I needed to create a partition larger than 32GB.

In my original post, I mentioned that FAT32 volumes in excess of 32GB can be created – Windows is able to read and write larger volumes; it just can’t create them natively (the workaround is to use another operating system or third-party tools). In my case, I used the Mac OS X Disk Utility – the important point is to ensure that the disk options are set to use a master boot record (not a GUID partition table or an Apple partition map), after which MS-DOS File System becomes available as a formatting option, allowing me to create a FAT32 volume which filled my entire 55.89GB disk – plenty of room for my virtual machine files and more.
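
However large the volume, the 4GB per-file limit still applies, so it can be worth checking a folder before copying it to a FAT32 disk. This is a rough Python sketch of such a check; the function names are my own, and it relies on the fact that FAT32 records each file's size in a 32-bit field.

```python
import os

# FAT32 stores file sizes in a 32-bit field, so the largest possible
# file is 4 GiB minus 1 byte.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes):
    """Return True if a file of this size can be stored on FAT32."""
    return 0 <= size_bytes <= FAT32_MAX_FILE_BYTES


def oversized_files(root):
    """Yield (path, size) for files under root that are too big for FAT32."""
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if not fits_on_fat32(size):
                yield path, size
```

Calling `list(oversized_files(some_folder))` on the folder to be copied (where `some_folder` is whatever path applies on your system) returns an empty list if everything will fit file by file; it says nothing about whether the total fits on the volume.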

Saturday 7 April 2007
VMware ESX Server and HP MSA1500 – Active/Active or Active/Passive?

Recently, I’ve been working on a design for a virtual infrastructure, based on VMware Virtual Infrastructure 3 with HP ProLiant servers and a small SAN – an HP MSA1500cs with MSA30 (Ultra320 SCSI) and MSA20 (SATA) disk shelves.

The MSA is intended as a stopgap solution until we have an enterprise SAN in place but it’s an inexpensive workgroup solution which will allow us to get the virtual infrastructure up and running, providing a mixture of SATA LUNs (for VCB, disk images, templates, etc.) and SCSI LUNs (for production virtual machines). The MSA’s Achilles’ heel is the controller, which only provides a single 2Gbps fibre channel connection – a serious bottleneck. Whilst two MSA1500 controllers can be used, the default configuration is active-passive; however HP now has firmware for active-active configurations when used with certain operating systems – what was unclear to me was how VMware ESX Server would see this.

I asked the question in the VMTN community forums thread entitled Active-Active MSA controller config. with VI3 and MSA1500 and got some helpful responses indicating that an active-active configuration was possible; however, as another user pointed out, the recommended most recently used (MRU) path policy seemed to be at odds with VMware’s fixed path advice for active-active controller configurations.

Thanks to the instructor on my VMware training course this week, I learned that, although the MSA controllers are active-active (i.e. they are both up and running – rather than one of them remaining in standby mode), they are not active-active from a VMware perspective – i.e. each controller can present a different set of LUNs to the ESX server but there is only one path to a LUN at any one time. Therefore, to ESX Server they are still active-passive. I also found the following on another post which seems to have been removed from the VMTN site (at least, I couldn’t get the link from Google to work) but Google had a cached copy of it:

“The active/active description”… “seems to imply that they are active/active in the sense that both are doing work but perhaps driving different LUN’s? i.e. if you have 10 volumes defined you might have 5 driven by controller A and 5 driven by controller B. Should either A or B fail all ten are going to be driven by the surviving controller. This is active/active yes [but] this is also the definition of active/passive in ESX words (i.e. only one controller have access to one LUN at any given time).”

Based on the above quote, it seems that MSA1500 solutions can be used with VMware products in an active-active configuration (which should, theoretically, double the throughput) but the MRU recommended path policy must be used as only one controller can access a LUN at any given time.

Thursday 30 November 2006
Why RAID alone is not the answer for backups

I recently came across Gina Trapani’s article on the importance of backing up (the comments are worth a read too). I hear what she’s saying – a couple of years ago I very nearly lost a lot of data when a hard disk died and today I have far more important stuff on disk (like all of my recent photography – including irreplaceable pictures of my son – a digitised music collection and years’ worth of accumulated information), all spread across nearly a terabyte of separate devices.

As we place more and more emphasis on our digital lifestyle, the amount of data stored will continue to grow and that creates a problem, especially for home and small business users.

Optical media degrades over time and since the hard disk I bought for backups is now in daily use with my new Macintosh computer, I need to implement a decent backup regime. As disk sizes increase, a single disk seems like putting all my eggs in one basket, but I also hear people talking about how RAID is the answer.

No it’s not.

The most common RAID levels in use are 0 (striping), 1 (mirroring) and 5 (striped set with parity). RAID 0 does not provide any fault tolerance, RAID 5 needs at least 3 disks – too much for most home and home office setups – that leaves just RAID 1. Mirrors sometimes fail and when they do, they can take all of the data with them. Then there’s the additional issue of accidental damage (fire, flood, etc.). What’s really required (in a home scenario), is two or more removable hard disks, combined with use of a utility such as rsync (Unix) or SyncToy (Windows) to automate frequent backups, with one of the disks kept off site (e.g. with a family member) and frequent disk rotation.
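
As a rough illustration of the rsync-plus-rotation idea, here is a short Python sketch that picks which removable disk is due and builds the rsync command to mirror a folder onto it. The function names and any paths passed in are hypothetical; `-a` and `--delete` are standard rsync options, and the weekly rotation scheme is just one assumption about how the disks might be swapped.

```python
import datetime
import subprocess


def pick_disk(disks, day):
    """Rotate through the backup disks, changing every seven days."""
    # toordinal() counts days continuously, so the rotation does not
    # skip or repeat a disk at a year boundary.
    return disks[(day.toordinal() // 7) % len(disks)]


def rsync_command(source, destination):
    """Build an rsync command that mirrors source into destination.

    -a preserves permissions and timestamps; --delete removes files from
    the backup that no longer exist in the source.
    """
    # The trailing slash makes rsync copy the contents of source rather
    # than creating a nested copy of the directory itself.
    return ["rsync", "-a", "--delete", source.rstrip("/") + "/", destination]


def run_backup(source, disks, day=None):
    """Mirror source onto whichever disk is due, and return that disk."""
    day = day or datetime.date.today()
    target = pick_disk(disks, day)
    subprocess.run(rsync_command(source, target), check=True)
    return target
```

Note that `--delete` makes each disk an exact mirror, so an accidental deletion is copied across on the next run; it is the rotation (and the disk kept off site) that provides a way back.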

In an enterprise environment I wouldn’t consider implementing a server without some form of RAID (and other redundant technologies) installed; however I’d also have a comprehensive backup strategy. For homes and small businesses RAID is not the answer – what’s really required is a means of easily ensuring that data is secured so that if a disaster should occur, then those precious files will not be lost forever.

Monday 17 July 2006
Configuring an HP MSA1000 using a serial cable
Earlier today, I needed to configure an HP StorageWorks Modular Storage Array (MSA) 1000 which I’ll be using for SAN storage over the next few weeks. Nothing too difficult about that, except that I wanted to access the SAN via the command line interface (CLI) and that meant using a serial cable to connect to the MSA controller. Each controller has what looks like an RJ45 Ethernet connection on the front, but standard Ethernet cables don’t fit. Fortunately I found the console cable that had been delivered with the SAN and found that it uses a slightly unusual variation of an RJ45 connection, which further research indicates is called an RJ45Z. The only noticeable difference (apart from how the connector is wired internally), is an extra notch on one side, as shown in the picture below:

Incidentally, once the connection is made from a standard RS232 serial port to the MSA (most modern notebook PCs don’t have a serial port – I had to use an IBM USB-serial/parallel adapter), accessing the CLI simply involves starting Windows HyperTerminal with the following connection settings:
- Connect using: comportidentifier (e.g. COM1)
- Bits per second: 19200
- Data bits: 8
- Parity: None
- Stop bits: 1
- Flow control: None
Once connected, it may be necessary to press the Enter key until the CLI> prompt is displayed, after which commands can be issued to configure the MSA.
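
For anyone who would rather script the connection than use HyperTerminal, the same settings can be expressed in Python. This is only a sketch: it assumes the third-party pyserial package is installed, the names `MSA_CONSOLE_SETTINGS`, `describe` and `open_console` are my own, and sending a single carriage return to stand in for the Enter key is an assumption that I have not tested against an MSA.

```python
# The HyperTerminal settings listed above, expressed as keyword arguments
# for pyserial's serial.Serial class.
MSA_CONSOLE_SETTINGS = {
    "baudrate": 19200,  # Bits per second
    "bytesize": 8,      # Data bits
    "parity": "N",      # Parity: None
    "stopbits": 1,      # Stop bits
    "xonxoff": False,   # Flow control: None (no software flow control)
    "rtscts": False,    # Flow control: None (no hardware flow control)
    "timeout": 1,       # Seconds to wait when reading
}


def describe(settings):
    """Summarise the settings in the usual shorthand, e.g. '19200 8N1'."""
    return "{baudrate} {bytesize}{parity}{stopbits}".format(**settings)


def open_console(port="COM1"):
    """Open the serial port and send Enter to prompt for the CLI."""
    import serial  # imported here so the settings can be used without pyserial

    console = serial.Serial(port, **MSA_CONSOLE_SETTINGS)
    console.write(b"\r")  # assumed equivalent of pressing the Enter key
    return console
```

The port name will vary; with a USB-serial adapter it is often something other than COM1.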

Further details can be found in the HP StorageWorks Modular Smart Array 1000/1500cs Command Line Interface manual (just in case, like me, you found this information using Google before you got around to reading the manual that came on a CD with the MSA!).
Monday 3 July 2006
RAID and units of storage
A couple of weeks back, I commented that photography and IT are becoming ever-closer but last night I was amazed to open a copy of Digital Photographer magazine and find an article about redundant array of inexpensive disks (RAID) storage!

It was an interesting read and, because it assumed that the reader wasn’t an IT professional, it gave a concise explanation of the various RAID levels (and storage capacities) which was actually a really good reference. It also referred to a number of websites with additional information – I’ve reproduced a couple of them here, along with an extra reference of my own:
From the same article, for those of us who have forgotten what the various (binary) units of storage are:
- A single byte is the most basic unit of computer storage.
- 1 kilobyte (KB) = 1024 bytes.
- 1 megabyte (MB) = 1024KB (1,048,576 bytes).
- 1 gigabyte (GB) = 1024MB (1,073,741,824 bytes).
- 1 terabyte (TB) = 1024GB (1,099,511,627,776 bytes).
- 1 petabyte (PB) = 1024TB (1,125,899,906,842,624 bytes).
- 1 exabyte (EB) = 1024PB (1,152,921,504,606,846,976 bytes).
- 1 zettabyte (ZB) = 1024EB (1,180,591,620,717,411,303,424 bytes).
- 1 yottabyte (YB) = 1024ZB (1,208,925,819,614,629,174,706,176 bytes).
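
Each of these figures is simply the next power of 1024, so the list is easy to regenerate rather than copy by hand. A small Python sketch (the names `UNITS` and `binary_units` are my own):

```python
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def binary_units():
    """Return (name, bytes) pairs, each unit 1024 times the previous one."""
    return [(name, 1024 ** power) for power, name in enumerate(UNITS, start=1)]


for name, size in binary_units():
    print(f"1 {name} = {size:,} bytes")
```
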
Wednesday 1 February 2006
Missing disk space

A few months back, I was chatting with my Dad about his PC (you know, one of those “family IT support desk” jobs) and he was wondering what had happened to all of his hard disk space. David Chernicoff has written an article for Windows IT Pro magazine about the case of the missing disk space and it’s worth a read. I certainly found it interesting – especially the bit about true sizing cf. disk manufacturers’ idea of storage units.
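
The units part of the discrepancy is easy to work through: disk manufacturers count a gigabyte as 1,000,000,000 bytes, whereas Windows reports sizes in units of 1,073,741,824 bytes. The following Python sketch shows the arithmetic; it is my own illustration rather than anything taken from the article, and it ignores other overheads such as file system structures.

```python
def advertised_to_binary_gb(advertised_gb):
    """Convert a manufacturer's decimal gigabytes (10**9 bytes each)
    into binary gigabytes (2**30 bytes each), as reported by Windows."""
    return advertised_gb * 10**9 / 2**30


# A disk sold as 500GB shows up as roughly 465.66GB.
print(f"{advertised_to_binary_gb(500):.2f}")
```

So around 7% of the advertised capacity "disappears" at the gigabyte level before a single file has been written.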

Friday 26 August 2005
(Probably) the smallest server in the world
This weekend, I set up my new network attached storage (NAS) unit, which may well qualify as one of the world’s smallest (and least expensive) servers. It’s a Linksys Network Storage Link for USB 2.0 Disk Drives (NSLU2), coupled with one of my ultra-portable external storage devices.

The NSLU2 is a low-cost device for converting any USB storage into NAS. It is basically a tiny Linux server with a 10/100 Ethernet port and two USB 2.0 connections (mine cost £59.99 from Amazon). What’s more, it seems to have developed quite a following with those who are hacking the device to make it a more useful Linux server.

The NSLU2 gets slated in a CNET review, but basically you get what you pay for and for this price I’m not sure that you can really go wrong. It seemed to me that most of the CNET feedback was from consumers (with limited technical knowledge) who expected to connect their FAT or NTFS-formatted USB disks and access them across the network. The NSLU2 won’t let you do that as it uses the Linux ext3 file system, but once formatted on the NSLU2 they should still be readable on a Windows system with an appropriate file system driver.

Having said that, Linksys do not help themselves and much of the negative feedback will be down to the terrible documentation supplied with the product. I needed to carry out some Internet research before I could get mine working using two important pieces of information:
- It initially uses an IP address of 192.168.1.77/24 (not DHCP). To change that using the supplied software you need your client to be on the same subnet. Alternatively just go to http://192.168.1.77/ and it will launch straight into the web interface.
- The initial administration username and password are both set to “admin”.
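
If it is not obvious whether a client is on the same subnet as the NSLU2's default address, Python's standard ipaddress module can check. This is a small sketch; the function name and the example client addresses are hypothetical.

```python
import ipaddress

# Factory-default address of the NSLU2, as noted above.
NSLU2_DEFAULT = ipaddress.ip_interface("192.168.1.77/24")


def on_same_subnet(client_ip, device=NSLU2_DEFAULT):
    """Return True if client_ip falls within the device's subnet."""
    return ipaddress.ip_address(client_ip) in device.network


print(on_same_subnet("192.168.1.10"))  # True
print(on_same_subnet("192.168.0.10"))  # False
```

A client that reports False needs a temporary address in the 192.168.1.x range before it can reach the device.
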
I’m not going to provide a full review as there are some good ones out there already – the best ones that I’ve found have been at MacOS X Hints (concise) and at Tom’s Networking (more extensive).

Basically, for low-cost NAS, the NSLU2 is great; but it is definitely for a SOHO environment only, and I’m already looking at the Buffalo LinkStation Network Storage Center for when I need some more storage in a few months’ time. The main reason I didn’t go with the LinkStation from the start is that it’s a £220 investment and for £60 my NSLU2 will keep me going for a few months until it starts a new life as a Linux project.

Links

Linksys Network Storage Link for USB 2.0 Disk Drives
Linksys NSLU2 datasheet
Hacking the NSLU2: Part 1; Part 2; Part 3; Part 4; Part 5
Linux on the NSLU2
NSLU2 Linux
Buffalo LinkStation Network Storage Center
Sunday 10 October 2004
Ultra-portable external storage

I’ve found the solution to my portable storage needs: an old (20GB) laptop hard disk from my internal IT support department and a cheap (£2.99 + postage) USB enclosure picked up from eBay.co.uk. One of my clients bought something similar a few months back and it has taken me this long to get hold of a suitable hard disk; but now I have a decent amount of portable storage that I can format with NTFS (or any other file system I choose) and transfer between PCs at home and work.

The enclosure I bought is mostly aluminium, with a single LED to indicate power and/or drive access, and is just big enough for a slimline (9.5mm) laptop disk drive. It has a Y-shaped connector cable, with two USB 2.0 connectors at the forked end and a proprietary connection at the other, which is used to power the unit. I’ve found that I need to use both connectors to draw enough power on a Compaq or Dell laptop (the Compaq and IBM desktop PCs I tried seem to work with just one connection). Supplied with a driver CD (for Windows 98), screws, and a mock-leather wallet, I had no problems getting Windows XP to recognise it (without any additional software), and whilst the disk I was given only spins at 4200 RPM, it seems plenty fast enough for my needs.

Thursday 7 October 2004
