Why RAID alone is not the answer for backups

I recently came across Gina Trapani’s article on the importance of backing up (the comments are worth a read too). I hear what she’s saying – a couple of years ago I very nearly lost a lot of data when a hard disk died and today I have far more important stuff on disk (like all of my recent photography – including irreplaceable pictures of my son – a digitised music collection and years’ worth of accumulated information), all spread across nearly a terabyte of separate devices.

As we place more and more emphasis on our digital lifestyle, the amount of data stored will continue to grow and that creates a problem, especially for home and small business users.

Optical media degrades over time and since the hard disk I bought for backups is now in daily use with my new Macintosh computer, I need to implement a decent backup regime. As disk sizes increase, a single disk seems like putting all my eggs in one basket, but I also hear people talking about how RAID is the answer.

No it’s not.

The most common RAID levels in use are 0 (striping), 1 (mirroring) and 5 (striped set with parity). RAID 0 does not provide any fault tolerance and RAID 5 needs at least 3 disks, which is too much for most home and home office setups, so that leaves just RAID 1. Mirrors sometimes fail and, when they do, they can take all of the data with them. Then there’s the additional issue of accidental damage (fire, flood, etc.). What’s really required (in a home scenario) is two or more removable hard disks, combined with a utility such as rsync (Unix) or SyncToy (Windows) to automate frequent backups, with one of the disks kept off site (e.g. with a family member) and frequent disk rotation.
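To give a feel for the kind of one-way copy that such a utility automates, here is a minimal Python sketch. It is not how rsync or SyncToy work internally; the function names are my own, and the two-second tolerance is an assumption based on the coarse timestamps that FAT32-formatted disks keep.

```python
import os
import shutil

# FAT32 stores modification times in 2-second steps, so small differences
# are ignored (an assumption made for this sketch).
TIMESTAMP_TOLERANCE_SECONDS = 2


def needs_copy(src, dst):
    """Return True if dst is missing, differs in size, or is older than src."""
    if not os.path.exists(dst):
        return True
    src_info, dst_info = os.stat(src), os.stat(dst)
    if src_info.st_size != dst_info.st_size:
        return True
    return src_info.st_mtime - dst_info.st_mtime > TIMESTAMP_TOLERANCE_SECONDS


def backup_tree(source, destination):
    """Copy new or changed files from source to destination (one-way).

    Nothing is ever deleted from the destination, so a file removed by
    accident at the source survives on the backup disk.
    Returns the relative paths of the files that were copied.
    """
    copied = []
    for folder, _subfolders, filenames in os.walk(source):
        relative_folder = os.path.relpath(folder, source)
        target_folder = os.path.normpath(os.path.join(destination, relative_folder))
        os.makedirs(target_folder, exist_ok=True)
        for name in filenames:
            src = os.path.join(folder, name)
            dst = os.path.join(target_folder, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 keeps the modification time
                copied.append(os.path.normpath(os.path.join(relative_folder, name)))
    return copied
```

Running `backup_tree` against each removable disk in turn (the folder paths are whatever suits your own setup) gives the rotation described above; a second run straight after the first should copy nothing.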

In an enterprise environment I wouldn’t consider implementing a server without some form of RAID (and other redundant technologies) installed; however I’d also have a comprehensive backup strategy. For homes and small businesses RAID is not the answer – what’s really required is a means of easily ensuring that data is secured so that if a disaster should occur, then those precious files will not be lost forever.

Migrating an iTunes music library between PCs

Until recently, I’ve been running iTunes on a Windows XP PC but I’m in the process of migrating to a Mac OS X system. Whilst most data transfers have been straightforward, I found that, after moving the files to a disk that could be accessed by both the Mac and a PC (i.e. a FAT32-formatted external hard disk), getting iTunes to recognise my library was challenging. I’m sure it’s quite a common scenario so I thought I’d post what I did here so that it can be of use to others.

Whilst my scenario involved moving from iTunes 6.0.4 on a PC to 6.0.5 on a Mac, the principle is the same for moving iTunes music libraries between any two PCs (Mac OS X or Windows).

Apple’s advice for moving your iTunes Music folder is okay for moving files on the same system but their advice for switchers to transfer their iTunes Music Library files from PC to Mac just didn’t work for me. Well, it works, sort of: simply importing the music files into iTunes will lose playlists, history, ratings, etc., whereas importing the music library file itself will retrieve those items but won’t find the music files because they have moved location. I need to keep the selections because that’s how I determine what will be synchronised with my iPod – quite simply, my 47GB music collection will not fit on a 4GB iPod Mini!

Luckily, the extensive article on moving iTunes libraries whilst preserving library data at the HiFi Blog gave me all the necessary steps (although they focus on libraries where iTunes is not used to organise the music – I let iTunes handle that for me). After setting iTunes preferences to point to a folder on my external hard disk (on the Advanced page, under General), I quit iTunes and edited the iTunes Music Library.xml and iTunes Library files that reside in Macintosh HD/Users/username/Music/iTunes/ (even though the music files are on the external hard disk, iTunes keeps its database files in the main user data location). I removed all binary data from the iTunes Library file to leave a 0KB file, then replaced all instances of the original library location in iTunes Music Library.xml with the new library location (for me this was from file://localhost/C:/Documents%20and%20Settings/Mark/My%20Documents/My%20Music/iTunes/iTunes%20Music/ to file://localhost/Volumes/EXTERNAL%20HD/Music/iTunes/iTunes%20Music/). I found the easiest way to edit the files was on the PC, using WordPad (depending on the size of the music library, Notepad may not cope with the file sizes involved). It’s also worth noting that on a PC, the iTunes Library file has a .ITL extension.
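For anyone who would rather script the search and replace than use a text editor, a minimal Python sketch follows. The function names are hypothetical and it treats the XML file as plain text, which is all the manual edit does; the old location should be pasted exactly as it appears in your own copy of iTunes Music Library.xml.

```python
from urllib.parse import quote


def to_itunes_url(path):
    """Build a file://localhost URL for a Mac path, encoding spaces as %20."""
    return "file://localhost" + quote(path)


def relocate_library(xml_text, old_location, new_location):
    """Swap every occurrence of the old library location for the new one.

    Returns the updated text and the number of replacements made, so that
    a count of zero flags a mistyped old location.
    """
    count = xml_text.count(old_location)
    return xml_text.replace(old_location, new_location), count
```

Here `to_itunes_url("/Volumes/EXTERNAL HD/Music/iTunes/iTunes Music/")` produces the new location quoted above. Read the XML file into a string, pass it through `relocate_library` and write the result back, keeping a copy of the original file in case anything goes wrong.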

After making sure that the edited files were back in the Macintosh HD/Users/username/Music/iTunes folder and starting iTunes, I was greeted with an Importing iTunes Music Library.xml message before:

Organizing Files

The file “iTunes Library” does not appear to be a valid music library file. iTunes has attempted to recover your music library and has renamed this file to “iTunes Library (Damaged)”.

Actually that message is incorrect. On my system, there is no iTunes Library (Damaged) file but there is a Previous iTunes Libraries folder, which contains a file called iTunes Library 2006-7-12.

iTunes then continued to analyse and determine the song volume for 2344 of the 6766 items in my music library (I’m not sure what this actually means and it seems strange that it was not for the entire music collection) after which it was available for use as normal (almost) with all my tracks, playlists, selections, date last played, etc. I said almost normal because there are a couple of additional playlists (Podcasts and Videos) and the Podcast subscriptions don’t get migrated but that’s easy to fix. Again, it was the HiFi Blog article that helped me out – browse the library to view all music files with a genre of Podcast and drag them onto the Podcasts heading in the source column before clicking on resubscribe for each Podcast to enable new downloads (the existing downloads should all still be available).

The next step was to hook up my iPod which synchronised normally (I vaguely remember selecting that it was connected to a Windows PC the first time I set it up and expected to have to do some reconfiguration for the Mac but it seems that was not required). The only exception was for my purchased music, for which I received the following message:

Some of the songs in the iTunes music library, including the song “songname”, were not copied to the iPod “ipodname” because you are not authorised to play them on this computer.

I found this strange because I’d already accessed the iTunes Music Store from iTunes using my Apple ID, and although there was a “Deauthorize Computer…” option on the Advanced menu, I couldn’t see an equivalent option to authorise it (so I naturally assumed it was already authorised). Attempting to access my purchased music in Front Row gave a better clue:

This computer is not authorized to play the selected song.

To authorize your computer, select the song in iTunes and enter the account name and password used to purchase the song from the iTunes Music Store.

Sure enough, this did the trick, advising me that I had 2 out of a maximum of 5 computers authorised for my music and then allowing me to both play the purchased songs and synchronise them with my iPod.

After running with iTunes on my Mac for a few days now, everything seems to be working okay. The only remaining step is to deauthorise the original Windows XP PC from where I copied my music.

Microsoft’s digital identity metasystem

After months of hearing about Windows Vista eye candy (and hardly scraping the surface with anything of real substance with regards to the operating system platform), there seems to be a lot of talk about digital identity at Microsoft right now. A couple of weeks back I was at the Microsoft UK Security Summit, where I saw Kim Cameron (Microsoft’s Chief Architect for identity and access) give a presentation on CardSpace (formerly codenamed “InfoCard”) – a new identity metasystem contained within the Microsoft .NET Framework v3.0 (expected to be shipped with Windows Vista but also available for XP). Then, a couple of days ago, my copy of the July 2006 TechNet magazine arrived, themed around managing identity.

This is not the first time Microsoft has attempted to produce a digital identity management system. A few years back, Microsoft Passport was launched as a web service for identity management. But Passport didn’t work out (Kim Cameron refers to it as the world’s largest identity failure). The system works – 300 million people use it for accessing Microsoft services such as Hotmail and MSN Messenger, generating a billion logons each day – but people don’t want to have Microsoft controlling access to other Internet services (eBay used Passport for a while but dropped it in favour of their own access system).

Digital identity is, quite simply, a set of claims made about a subject (e.g. “My name is Mark Wilson”, “I work as a Senior Customer Solution Architect for Fujitsu Services”, “I live in the UK”, “my website is at http://www.markwilson.co.uk/”). Each of these claims may need to be verified before they are acted upon (e.g. a party to whom I am asserting my identity might like to check that I do indeed work where I say I do by contacting Fujitsu Services). We each have many identities for many uses that are required for transactions both in the real world and online. Indeed, all modern access technology is based on the concept of a digital identity (e.g. Kerberos and PKI both claim that the subject has a key showing their identity).

Microsoft’s latest identity metasystem learns from Passport – and interestingly, feedback gained via Kim Cameron’s identity weblog has been a major inspiration for CardSpace. Through the site, the identity community has established seven laws of identity:

  1. User control and consent.
  2. Minimal disclosure for a defined use.
  3. Justifiable parties.
  4. Directional identity.
  5. Pluralism of operators and technologies.
  6. Human integration.
  7. Consistent experience across contexts.

Another area where CardSpace fundamentally differs from Passport is that Microsoft is not going it alone this time – CardSpace is based on WS-* web services and other operating system vendors (e.g. Apple and Red Hat) are also working on comparable (and compatible) solutions. Indeed, the open source identity selector (OSIS) consortium has been formed to address this technology and Microsoft provides technical assistance to OSIS.

The idea of an identity metasystem is to unify access and shield applications from the complexities of managing identity, but in a manner which is loosely coupled (i.e. allowing for multiple operators, technologies and implementations). Many others have compared this to the way in which TCP/IP unified network access, which paved the way for the connected systems that we have today.

The key players in an identity metasystem are:

  • Identity providers (which issue identities).
  • Subjects (individuals and entities about which claims are made).
  • Relying parties (which require identities).

Each relying party will decide whether or not to act upon a claim, depending on information from an identity provider. In the real world, that might be analogous to arriving at a client’s office and saying “Hello, I’m Mark Wilson from Fujitsu Services. I’m here to visit your IT Manager”. The security/reception staff may take my word for it (in which case this is self-issued identity and I am both the subject and the provider) or they may ask for further confirmation, such as my driving licence, company identity card, or a letter/fax/e-mail inviting me to visit.

In a digital scenario the system works in a similar manner. When I log on to my PC, I enter my username to claim that I am Mark Wilson but the system will not allow access until I also supply a password that only Mark Wilson should know and my claims have been verified by a trusted identity provider (in this case the Active Directory domain controller, which confirms that the username and password combination matches the one it has stored for Mark Wilson). My workstation (the relying party) then allows me access to applications and data stored on the system.
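The three roles in that logon can be sketched in a few lines of Python. This is purely an illustration of subject, identity provider and relying party: the class and method names are hypothetical, and a real Active Directory domain controller authenticates using Kerberos rather than anything resembling this.

```python
import hashlib
import hmac
import os


class IdentityProvider:
    """Stands in for the trusted party that verifies a subject's claim."""

    def __init__(self):
        self._accounts = {}  # username -> (salt, password hash)

    def register(self, username, password):
        salt = os.urandom(16)
        self._accounts[username] = (salt, self._hash(password, salt))

    def verify(self, username, password):
        """Return True only if the password matches the stored hash."""
        if username not in self._accounts:
            return False
        salt, expected = self._accounts[username]
        # compare_digest avoids leaking information through timing
        return hmac.compare_digest(expected, self._hash(password, salt))

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class RelyingParty:
    """Stands in for the workstation, which trusts the provider's answer."""

    def __init__(self, provider):
        self._provider = provider

    def log_on(self, username, password):
        # The relying party never checks the password itself; it acts on
        # the claim only if the identity provider verifies it.
        return self._provider.verify(username, password)
```

After `provider.register("mark", "some password")`, a `RelyingParty(provider)` will allow `log_on("mark", "some password")` and refuse any other combination.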

In many ways a username and password combination is a bad identity analogy – we have trained users to trust websites that ask them to enter a password. Imagine what would happen if I were to set up a phishing site that asks for a password. Even if the correct password were entered, the site would claim that it was incorrect. A typical user (and I am probably one of those) will then try other passwords – the phishing site now has an extensive list of passwords which can then be used to access other systems, pretending to be the user whose identity has been stolen. A website may be protected by many thousands of miles of secure communications but, as Kim Cameron put it, the last one metre of the connection is from the computer to the user’s head (hence identity law number 6 – human integration) – identity systems need to be designed in a way that is easy for users to make sense of, whilst remaining secure.

CardSpace does this by presenting the user with a selection of digital identity cards (similar to the plastic cards in our wallets) and highlighting only those that are suitable for the site. Only publicly available information is stored with the card (so that should hold phishers at bay – the information to be gained is useless to them) and because each card is tagged with an image (and only appropriate cards are highlighted for use), I know that I have selected the correct identity (why would I send my Government Gateway identity to a site that claims to be my online bank?). Digital identities can also be combined with other access controls such as smartcards. The card itself is just a user-friendly selection mechanism – the actual data transmitted is XML-based.
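The idea of highlighting only the suitable cards can be sketched as a simple filter. This is my own illustration, not the CardSpace API: the `Card` type, the claim names and the matching rule (a card is suitable if it can supply every claim the site asks for) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    claims: frozenset  # the claim types this card is able to supply


def suitable_cards(cards, required_claims):
    """Return the cards that can supply every claim the site requires."""
    required = frozenset(required_claims)
    return [card for card in cards if required <= card.claims]
```

With a self-issued card offering a name and e-mail address and a bank card offering a name and account number, a site requiring an account number would see only the bank card highlighted.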

CardSpace runs in a protected subsystem (similar to the Windows login screen) – so when active there is no possibility of another application (e.g. malware) gaining access to the system or of screenscraping taking place. In addition, user interaction is required before releasing the identity information.

Once a card has been selected, services that require identities can convert the supplied token between formats using the WS-Trust service for encapsulating protocol and claims transformation. For negotiations, WS-MetadataExchange and WS-SecurityPolicy are used. This makes the Microsoft implementation fully interoperable with other identity selector implementations, with other relying party implementations and with other identity provider implementations.

Microsoft is presently building a number of components for its identity metasystem:

  • CardSpace identity selector (usable by any application, included within .NET Framework v3.0 and hardened against tampering and spoofing).
  • CardSpace simple self-issued identity provider (makes use of strong PKI so that the user does not disclose passwords to relying parties).
  • Active Directory managed identity provider (to plug corporate users in to the metasystem via a full set of policy controls to manage the use of simple identities and Active Directory identities).
  • Windows Communication Foundation (for building distributed applications and implementing relying party services).

Post-Windows Vista, we can expect the Windows Login to be replaced with a CardSpace-based system. In the meantime, to find out more about Microsoft’s new identity metasystem, check out Kim Cameron’s identity blog, the Windows CardSpace pages, David Chappell’s Introducing InfoCard article on MSDN, and the July 2006 issue of TechNet magazine.