HOWTO - Work with Disk Images
Warning! Extra Geeky Content Ahead!
What is a disk image?
The way I'm using it, a disk image is a bit-by-bit copy of the information on a data storage device. For a broader exploration of the topic, check out the "Disk image" article over at the Wikipedia.
I'll talk about hard disk images, and CD images (the infamous .iso file).
Why would I care about disk images?
Maybe you wouldn't. Mostly, I think we want to interact with our disks through the standard filesystem tools to work with files.
There ARE several situations where working with disk images might be useful. Among them might be:
- Downloading and burning the latest Ubuntu Live CD.
- Copying a CD, to distribute or to have a backup copy just in case.
- Accessing the contents of these CD images without burning them onto CD.
- Duplicating smaller disks (e.g. SD cards, CompactFlash cards)
- Saving a perfect data copy of a disk as a first step in sensitive situations involving data (e.g., data recovery or forensics work).
- Keeping an archival copy of a disk (e.g. I have a disk image of the 120MB hard disk from my old 386 computer for nostalgia purposes; one of these days I'm going to figure out how to emulate it...).
Note that while technically possible, imaging a large number of computers this way is a very long process compared to tools that are specifically designed for that sort of work (such as partimage and the SystemImager suite).
What tools do we need?
In UNIX (and Linux, BSD, OS X, etc.), we will use the dd utility to read and write disk images. dd accesses the data on the storage device directly, as opposed to going through standard file manipulation, which is (thankfully) abstracted from direct access by concepts like partitioning and filesystems.
dd, or a version of it, is available for every platform, I believe.
I'll also discuss using the UNIX mount utility to mount the disk images as if they were real disks. I believe Mac OS X has similar functionality built right in, and I know there are all kinds of programs for mounting virtual drives in Windows.
Creating a Disk Image
You'll need the disk you're copying from to not be mounted. You'll also need a place to save the image file that has sufficient free space. This may mean:
- You're copying from a smaller disk onto a larger one that has more free space than the smaller disk's total capacity, and the disk being copied is not mounted on your computer.
- You've booted into a Live CD environment so that your main (operating system) disk is unmounted and can be copied onto an external storage device, whether over USB or over the network.
Whatever the case, once you have things ready, the syntax we use is:
# dd if=input of=output
Where input is the disk device node and output is the file you want to write the disk's data to.
SO, if I want to take /dev/sda and make a disk image of it called sda.dd in the current working directory, I run:
# dd if=/dev/sda of=sda.dd
And now I have a file named sda.dd which contains an exact bit-by-bit copy of my /dev/sda disk!
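To try the pattern without touching a real disk, here is a minimal sketch that uses a scratch file as a stand-in for the device node. That stand-in is an assumption for illustration only; with a real disk, if= would point at something like /dev/sda and you would need root. The bs=65536 block size is my own choice to make the copy read in bigger chunks, not something the steps above require.

```shell
#!/bin/sh
# Sketch: image a "disk" with dd and confirm the copy is exact.
# ASSUMPTION: a 1 MiB scratch file stands in for a device node like /dev/sda.
src=$(mktemp)
img=$(mktemp)

# Fill the stand-in disk with 1 MiB of random data.
head -c 1048576 /dev/urandom > "$src"

# Same if=/of= pattern as above; bs only changes how much is read per call.
dd if="$src" of="$img" bs=65536 2>/dev/null

# cmp exits 0 only when both files are byte-for-byte identical.
if cmp -s "$src" "$img"; then
    result="image matches source"
else
    result="image differs from source"
fi
echo "$result"
```

Comparing the image against the source right after the copy is a cheap way to catch read errors before you rely on the image.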
Writing a Disk Image to a Disk
So, you created a disk image of your drive, and then you did something stupid and ruined the contents of your drive? or your drive died and you got a new one? or (more optimistically) you're just duplicating the hard disk and that's why you have an image? No worries, we can simply write the image right back onto that disk!
We'll need to have access to the disk image file, and the destination disk will need to be available and unmounted.
The pattern for dd stays the same:
# dd if=input of=output
Except now input is the disk image file and output is the disk device node you want to write the image's data to.
SO, if I want to take /dev/sda and write a disk image called sda.dd onto it, I run:
# dd if=sda.dd of=/dev/sda
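Since writing an image overwrites whatever is on the destination, it can be worth checking that the destination really is unmounted first. The following is a rough sketch of such a check, not a complete safety net: is_mounted is a hypothetical helper of my own, it assumes a Linux-style mount table in /proc/mounts, and the demo runs against a fake mount table so that no real device is involved.

```shell
#!/bin/sh
# Sketch: refuse to write an image onto a device that is still mounted.
# ASSUMPTIONS: is_mounted is a hypothetical helper; the mount table is read
# from /proc/mounts (Linux) unless another file is given as second argument.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # The first field of each line is the device. A prefix match also catches
    # partitions such as /dev/sdb1 when asked about /dev/sdb (deliberately
    # conservative: any other name starting with the same text matches too).
    awk -v d="$dev" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}

# Demo against a fake mount table so nothing real is touched.
fake=$(mktemp)
printf '%s\n' '/dev/sdb1 /media/usb vfat rw 0 0' > "$fake"

if is_mounted /dev/sdb "$fake"; then
    echo "/dev/sdb is in use: unmount it before running dd"
fi
if ! is_mounted /dev/sdc "$fake"; then
    echo "/dev/sdc is free: writing the image could go ahead"
fi
```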
Partitions
Incidentally, you can also do this with a partition rather than a full disk, by giving dd a partition node instead of a disk node (e.g. /dev/sda1 is the first partition on /dev/sda), so dd will create an image of just the first partition rather than one of the full disk.
CD Images
Compact Disc images have been made really easy to work with.
In Ubuntu, you need only right-click on a .iso file to be presented with the "Write to Disc..." option. Also, using Brasero, you can run a "Disc copy" project and copy your disc to a "File image".
Alternatively, on the command-line, CD images work just like you might expect from the above. In order to create them you can just:
# dd if=/dev/scd0 of=discimage.iso
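Burning the disc image to a blank disc is not quite the reverse, though: dd generally can't write to CD-R media, which need a burning program. One option, assuming wodim (from the cdrkit package) is installed:
# wodim dev=/dev/scd0 discimage.iso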
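Because a disc image is just a file, it can be checked against a recorded checksum before you burn or mount it, which is handy for something like a downloaded Live CD. This is a small sketch under a few assumptions: GNU coreutils' sha256sum is available, the SHA256SUMS file name is only a common convention, and discimage.iso here is a dummy file created for the demo rather than a real CD image.

```shell
#!/bin/sh
# Sketch: check an image file against a recorded SHA-256 checksum.
# ASSUMPTIONS: GNU coreutils' sha256sum is available; discimage.iso is a
# small dummy file created for this demo, not a real CD image.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'pretend this is CD data\n' > discimage.iso

# Record the checksum (a download site would normally publish this file).
sha256sum discimage.iso > SHA256SUMS

# -c re-reads each listed file and compares it with the recorded checksum.
if sha256sum -c SHA256SUMS >/dev/null 2>&1; then
    status="OK"
else
    status="MISMATCH"
fi
echo "discimage.iso: $status"
```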
Accessing Disk Image Partitions
So, we can write the disk images onto disks...
We can also access and manipulate them directly by treating the image file as a disk. I'll be discussing the use of the UNIX mount utility, which is responsible for mounting disks onto the UNIX file system, to do this.
The pattern for using mount (for this) is:
# mount -t fstype -o options device directory
Where:
- fstype is the type of filesystem you're trying to mount; necessary if you're working with non-native filesystems, like NTFS.
- options are extra options you may need to use, e.g. loop is the option we feed mount to let it know we're feeding it a disk image file rather than a real disk. If we want to mount a partition from inside a full disk image, we'll also need to use the offset option to let it know where the partition starts.
- device is the device (in our case disk image) to be mounted.
- directory is the destination directory in the UNIX file system where you want the image mounted.
So, in order to mount a CD iso, we just do:
# mount -o loop cdimage.iso /media/iso
And now running ls /media/iso will show us the root directory of the CD.
On Ubuntu 8.04 (and others), so long as you mount it in the /media directory, an icon will show up on the GNOME desktop to represent the "disc".
If we have the image of a disk partition, we can similarly mount it as above with:
# mount -o loop partition.img /media/partition
However, if we want to mount a partition from inside a full disk image, we'll first need to locate the spot in the image where the partition begins.
We need to use fdisk to get the needed information out of the disk image, with the -l option to list partition information and the -u option to show us the sizes in sectors.
For example, as I was working on recovering data off of a teacher's dying hard drives, I got this response:
root@om:/mnt# fdisk -lu output
You must set cylinders.
You can do this from the extra functions menu.

Disk output: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x9dc96e9e

 Device Boot      Start         End      Blocks   Id  System
output1              63       80324       40131   de  Dell Utility
output2   *       80325   464840774   232380225    7  HPFS/NTFS
Partition 2 has different physical/logical endings:
     phys=(1023, 254, 63) logical=(28934, 254, 63)
So, from the above we can tell that the second partition in the disk image "output" starts at sector 80325.
We also know that each sector is 512 bytes.
Multiply the two and we know that the second partition starts at byte number 41126400.
We can also tell the partition type is NTFS (It's a Windows XP partition...).
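The offset arithmetic above can be left to the shell. Here is a minimal sketch that pulls the start sector for output2 out of the partition table and multiplies it by the sector size. It assumes the two listing lines are pasted in from the fdisk output shown above; a real script would read them from fdisk itself, and the column layout may differ between fdisk versions.

```shell
#!/bin/sh
# Sketch: turn a partition's start sector into the byte offset mount needs.
# ASSUMPTION: the two listing lines below are copied from the fdisk output
# shown above; a real script would read them from fdisk itself.
listing='output1              63       80324       40131   de  Dell Utility
output2   *       80325   464840774   232380225    7  HPFS/NTFS'

sector_size=512

# A bootable partition has a "*" in column 2, which shifts Start to column 3.
start=$(printf '%s\n' "$listing" |
    awk '$1 == "output2" { s = ($2 == "*") ? $3 : $2; print s }')

offset=$((start * sector_size))
echo "offset=$offset"   # prints offset=41126400
```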
I created a directory called /mnt/C to be the mount point for the NTFS partition, and plugged all the information in the right order for mount:
root@om:/mnt# mount -o loop,offset=41126400 -t ntfs output C
And voilà! The second partition of the disk image file "output" is now mounted on my system under the /mnt/C folder!
I proceed to grip all that teacher's years of work and yank it back from the jaws of oblivion. (For this, I just used graphical filesystem tools to copy out his Documents and Settings folder.)
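As an alternative to mounting with an offset, the same start and end sectors can be used to copy a single partition out of a full disk image with dd. This is a sketch of that idea rather than something I did in the recovery above: carve_partition is a hypothetical helper, it assumes 512-byte sectors, and the demo runs on a tiny synthetic "disk" so it can be tried safely.

```shell
#!/bin/sh
# Sketch: copy one partition out of a full disk image with dd.
# ASSUMPTIONS: carve_partition is a hypothetical helper; sectors are 512
# bytes; start and end are the inclusive sector numbers fdisk reports.
carve_partition() {
    image=$1
    start=$2
    end=$3
    out=$4
    count=$((end - start + 1))
    # skip= and count= are measured in units of bs, i.e. sectors here.
    dd if="$image" of="$out" bs=512 skip="$start" count="$count" 2>/dev/null
}

# Demo on a tiny synthetic "disk": 2 sectors of zeros, then 3 sectors of data.
disk=$(mktemp)
part=$(mktemp)
data=$(mktemp)
head -c 1536 /dev/urandom > "$data"
{ head -c 1024 /dev/zero; cat "$data"; } > "$disk"

carve_partition "$disk" 2 4 "$part"
cmp -s "$data" "$part" && echo "carved partition matches"
```

For the image in this article, the equivalent call would be carve_partition output 80325 464840774 part2.img, which copies 464760450 sectors (exactly twice the 232380225 1K blocks fdisk lists). The resulting file is a plain partition image, so it could be mounted with -o loop and no offset.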
Conclusion
There are certain situations where knowing how to work with raw disk data can be useful.
These are the things I had to learn to figure out how to do it right. Now I've got a reference to look back on the next time I have to do it.
Also, hopefully others will find it useful or at least mildly entertaining.
:-)