Monday, June 23, 2008

Now a Married Man

Saturday afternoon I was joined in sacred matrimony with my soul mate.

It was one of the most perfect blessings in my life; right up there with being born and the family I was born into.

That is all.

Wednesday, May 14, 2008

HOWTO - Work with Disk Images

Warning! Extra Geeky Content Ahead!

What is a disk image?

The way I'm using it, a disk image is a bit-by-bit copy of the information on a data storage device. For a broader exploration of the topic, check out the "Disk image" article over at the Wikipedia.

I'll talk about hard disk images, and CD images (the infamous .iso file).

Why would I care about disk images?

Maybe you wouldn't. Mostly, I think we want to interact with our disks through the standard filesystem tools to work with files.

There ARE several situations where working with disk images might be useful. Among them might be:

  • Downloading and burning the latest Ubuntu Live CD.
  • Copying a CD, to distribute or to have a backup copy just in case.
  • Accessing the contents of these CD images without burning them onto CD.
  • Duplicating smaller disks (e.g. SD cards, CompactFlash cards)
  • Saving a perfect data copy of a disk as a first step in sensitive situations involving data (e.g., data recovery or forensics work).
  • Keeping an archival copy of a disk (e.g. I have a disk image of the 120MB hard disk from my old 386 computer for nostalgia purposes; one of these days I'm going to figure out how to emulate it...).

Note that while technically possible, imaging a large number of computers this way is a very long process compared to tools that are specifically designed for that sort of work (such as partimage and the SystemImager suite).

What tools do we need?

In UNIX (and Linux, BSD, OS X, etc.), we will use the dd utility to read and write disk images. dd accesses the data on the storage device directly, as opposed to standard file manipulation, which is (thankfully) abstracted away from direct access through concepts like partitioning and filesystems.

dd, or a version of it, is available for every platform, I believe.

I'll also discuss using the UNIX mount utility to mount the disk images as if they were real disks. I believe Mac OS X has similar functionality built right in, and I know there are all kinds of programs for mounting virtual drives in Windows.

Creating a Disk Image

You'll need the disk you're copying from to not be mounted. You'll also need a place to save the image file that has sufficient free space. This may mean:

  • You're copying from a smaller disk to a larger disk—with more free space than the smaller disk has total capacity— neither of which is mounted on your computer.
  • You're booted into a Live CD environment in order to have un-mounted access to your main (operating system) disk in order to be able to copy it off onto an external storage device, be that over USB or over the network.

Whatever the case, once you have things ready, the syntax we use is:

# dd if=input of=output

Where input is the disk device node and output is the file you want to write the disk's data to.

SO, if I want to take /dev/sda and make a disk image of it called sda.dd in the current working directory, I run:

# dd if=/dev/sda of=sda.dd

And now I have a file named sda.dd which contains an exact bit-by-bit copy of my /dev/sda disk!
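
Under the hood there's nothing magical going on: dd reads a block from the input, writes it to the output, and repeats until the input runs dry. Here's a rough Python sketch of that loop, just to illustrate the idea. The copy_image name and the 1 MiB block size are made up for this example (they're not anything dd itself uses), and the demo copies from an in-memory stand-in rather than a real device.

```python
import io


def copy_image(source, destination, block_size=1024 * 1024):
    """Copy everything from source to destination, block by block.

    Both arguments are binary file objects. Returns the number of bytes copied.
    """
    copied = 0
    while True:
        block = source.read(block_size)
        if not block:  # an empty read means end of input
            break
        destination.write(block)
        copied += len(block)
    return copied


if __name__ == "__main__":
    # Stand-in for a real device: 10,000 bytes of sample data held in memory.
    fake_disk = io.BytesIO(bytes(range(256)) * 39 + b"\x00" * 16)
    image = io.BytesIO()
    print(copy_image(fake_disk, image, block_size=4096), "bytes copied")
```

Pointed at real files (say, open("/dev/sda", "rb") and open("sda.dd", "wb"), run as root), the same loop would produce an image, but dd remains the better tool for the job.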

Writing a Disk Image to a Disk

So, you created a disk image of your drive, and then you did something stupid and ruined the contents of your drive? or your drive died and you got a new one? or (more optimistically) you're just duplicating the hard disk and that's why you have an image? No worries, we can simply write the image right back onto that disk!

We'll need to have access to the disk image file, and the destination disk will need to be available and unmounted.

The pattern for dd stays the same:

# dd if=input of=output

Except now input is the disk image file and output is the disk device node you want to write the disk's data to.

SO, if I want to take /dev/sda and write a disk image called sda.dd onto it, I run:

# dd if=sda.dd of=/dev/sda

Partitions

Incidentally, you can also do this with a partition rather than a full disk by giving dd a partition node instead of a disk node (e.g. /dev/sda1 is the first partition on /dev/sda), so dd will create an image of just the first partition rather than one of the full disk.

CD Images

Compact Disc images have been made really easy to work with.

In Ubuntu, you need only to right-click on a .iso file in order to be presented with the option to Write to Disc.... Also, using Brasero, you can run a "Disc copy" project and copy your disc to a "File image".

Alternatively, on the command-line, CD images work just like you might expect from the above. In order to create them you can just:

# dd if=/dev/scd0 of=discimage.iso

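Burning the disc image to a blank disc is the one place where simply reversing the direction won't work: writing a CD-R takes a burning program rather than dd. On Ubuntu, wodim (the cdrecord replacement) handles it:

# wodim dev=/dev/scd0 discimage.iso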

Accessing Disk Image Partitions

So, we can write the disk images onto disks...

We can also access and manipulate them directly by treating the image file as a disk. I'll be discussing the use of the UNIX mount utility, which is responsible for mounting disks onto the UNIX file system, to do this.

The pattern for using mount (for this) is:

# mount -t fstype -o options device directory

Where:

  • fstype is the type of filesystem you're trying to mount; necessary if you're working with non-native filesystems, like NTFS.
  • options are extra options you may need to use, e.g. loop is the option we feed mount to let it know we're feeding it a disk image file rather than a real disk. If we want to mount a partition from inside a full disk image, we'll also need to use the offset option to let it know where the partition starts.
  • device is the device (in our case disk image) to be mounted.
  • directory is the destination directory in the UNIX file system where you want the image mounted.

So, in order to mount a CD iso, we just do:

# mount -o loop cdimage.iso /media/iso

And now running ls /media/iso will show us the root directory of the CD.

On Ubuntu 8.04 (and others), so long as you mount it in the /media directory, an icon will show up on the GNOME desktop to represent the "disc".

If we have the image of a disk partition, we can similarly mount it as above with:

# mount -o loop partition.img /media/partition

However, if we want to mount a partition from inside a full disk image, we'll first need to locate the spot in the image that the partition begins.

We need to use fdisk to get the needed information out of the disk image, with the -l option to list partition information and the -u option to show us the sizes in sectors.

For example, as I was working on recovering data off of a teacher's dying hard drives, I got this response:

root@om:/mnt# fdisk -lu output 
You must set cylinders.
You can do this from the extra functions menu.

Disk output: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x9dc96e9e

 Device Boot      Start         End      Blocks   Id  System
output1              63       80324       40131   de  Dell Utility
output2   *       80325   464840774   232380225    7  HPFS/NTFS
Partition 2 has different physical/logical endings:
     phys=(1023, 254, 63) logical=(28934, 254, 63)

So, from the above we can tell that the second partition in the disk image "output" starts at sector 80325.

We also know that each sector is 512 bytes.

Multiply the two and we know that the second partition starts at byte number 41126400.

We can also tell the partition type is NTFS (It's a Windows XP partition...).
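
If you're curious where fdisk gets those numbers, the classic MBR partition table lives in the first sector of the disk: four 16-byte entries starting at byte 446, each holding (among other things) a type ID, a starting sector and a sector count as little-endian integers. Below is a minimal Python sketch that pulls those out and does the sector-to-byte multiplication. The read_partitions helper is a hypothetical name, it assumes 512-byte sectors and a plain MBR (no extended partitions, no GPT), and the demo builds a fake sector from the fdisk listing above rather than reading the real image.

```python
import struct

SECTOR_SIZE = 512            # bytes per sector, as reported by fdisk above
TABLE_OFFSET = 446           # the MBR partition table starts at this byte
ENTRY_FORMAT = "<B3sB3sII"   # flag, CHS start, type, CHS end, start sector, sector count


def read_partitions(mbr):
    """Return (number, type_id, start_sector, byte_offset) for each used MBR entry."""
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        entry = mbr[TABLE_OFFSET + 16 * index:TABLE_OFFSET + 16 * (index + 1)]
        _flag, _chs_start, type_id, _chs_end, start, count = struct.unpack(ENTRY_FORMAT, entry)
        if type_id == 0 or count == 0:
            continue  # empty slot
        partitions.append((index + 1, type_id, start, start * SECTOR_SIZE))
    return partitions


if __name__ == "__main__":
    # Build a fake first sector holding the two partitions from the fdisk listing.
    mbr = bytearray(512)
    mbr[446:462] = struct.pack(ENTRY_FORMAT, 0x00, b"\0\0\0", 0xDE, b"\0\0\0", 63, 80262)
    mbr[462:478] = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x07, b"\0\0\0", 80325, 464760450)
    mbr[510:512] = b"\x55\xaa"
    for number, type_id, start, offset in read_partitions(bytes(mbr)):
        print("partition %d: type 0x%02x, start sector %d, byte offset %d"
              % (number, type_id, start, offset))
```

For a real image you'd feed it the first 512 bytes, e.g. open("output", "rb").read(512), and pass the resulting byte offset to mount's offset option.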

I created a directory called /mnt/C to be the mount point for the NTFS partition, and plugged all the information in the right order for mount:

root@om:/mnt# mount -o loop,offset=41126400 -t ntfs output C

And voilà!

The second partition of the disk image file "output" is now mounted on my system under the /mnt/C folder!

I proceed to grip all that teacher's years of work and yank it back from the jaws of oblivion. (For this, I just used graphical filesystem tools to copy out his Documents and Settings folder)

Conclusion

There are certain situations where knowing how to work with raw disk data can be useful.

These are the things I had to learn to figure out how to do it right. Now I've got a reference to look back on the next time I have to do it.

Also, hopefully others will find it useful or at least mildly entertaining.

:-)

Thursday, May 01, 2008

HOWTO - Reconstruct failed RAID 0 arrays for fun and profit.

Hey, I had an experience I couldn't resist blogging about. :-D

WARNING! Ultra-Geeky Story Ahead! No, Really! You've Been Warned!

So, this teacher brought in his computer from home. It was a high-end Dell XPS, about 5 years old. He had ordered it with two 120GB hard disks striped in RAID 0, and recently he went to boot it up (into Windows XP) and it would not work; some very important system file was corrupt. He went round and round with Dell support before bringing it in to us to see what we could do.

At first glance, the situation was grim. He'd run diagnostics and knew there was a physically bad sector on the second disk of the array. Popping in the XP install disk and telling it to "R"ecover brought no luck, nor did booting up to a Bart PE disk; neither knew there was a disk there. Windows basically wanted nothing to do with the failed array.

So, in went the Ubuntu Live CD. It saw a disk, and I thought, "Great! We can at least get the data off the array, and get SOMETHING out of it. If nothing else, there's dd and photorec..." Except, after trying to mount the Dell utility partition, listing its contents gave us crazy garbage.

"Oh, well," I thought, "dd it is." So I ran dd if=/dev/sda of=/mnt/dev.sda (having NFS mounted a network drive on /mnt) and left for the day (it takes a while...).

The next morning, I saw the file's size was 120GB...which is HALF of what I expected to see. Only THEN did I notice that Ubuntu was also seeing a /dev/sdb drive.

See, the two disks were plugged into a "RAID controller" card that we had all assumed made this "Hardware RAID" so that the OS didn't have to know anything to use it. Well, so apparently that "RAID controller" card relied on some software component somewhere, somehow under Windows to function right, and Ubuntu did not know it was supposed to be RAID.

Vern suggested we throw the disks into a software RAID utility to see what came out. Unfortunately, nothing.

So we felt pretty much defeated, since even if we got all the data off both disks, it'd be scrambled and useless. Still, for some reason, I couldn't quit thinking about this problem. It seemed like there should be SOMETHING we could do with the data from the disks to recombine them into something useful.

I needled Vern with questions about how RAID 0 works, and looked it up online myself. IF RAID 0 worked as simply as I figured it SHOULD, I should be able to figure out how to solve this problem. I had a fuzzy idea to work from, no experience with anything like this, and I could not say I'd succeed. However, I figured, in order to save several years' worth of this teacher's work, it was worth going for it and giving it my best shot.

I would write a Python program to "de-interlace" the two disks into one image. I had already written a program that worked with the sorts of tools I would need to use (though it was written with the opposite problem in mind) during my trek through the Python Challenge, so I looked it up, modified it, and came up with a simple program that would:

  • Open two input files to read from (in binary mode).
  • Open an output file to write to (in binary mode).
  • Take a certain amount of data from the first file and write it to the output file, then take a certain amount of data from the second file and write it to the output file.
  • Wash, rinse, repeat until there's no more to read from either input file.
  • Close out all files.

All I needed to know was, what is "a certain amount" supposed to be? I could only find offers to sell me commercial solutions and vague descriptions of RAID theory, no concrete implementation details, through Google. I decided it can't be THAT hard to figure out, right? ;-).

Well, computers being computers, we figured it probably had to be a power of 2. I grabbed the first 100,000 sectors of each drive's image to play with (drive 2, the failed one, STILL has dd grinding away at it as I type; it's taking around an order of magnitude longer to recover that data than it took for the first, functional disk). I tried guessing a few times (512, 4096, etc.) to no avail: each time I went to mount the Dell utility partition, ls's output came back scrambled.

I decided to instead pull out a hex editor and take a look at the raw data, to see if I could find any patterns. I found all KINDS of patterns :-), none of which seemed to pan out when it came to testing them, until I found this strange sequence of bytes, a pattern at the start of /dev/sdb:

00000000  01 40 02 40 03 40 04 40  05 40 06 40 07 40 08 40  |.@.@.@.@.@.@.@.@|
00000010  09 40 0a 40 0b 40 0c 40  0d 40 0e 40 0f 40 10 40  |.@.@.@.@.@.@.@.@|
00000020  11 40 12 40 13 40 14 40  15 40 16 40 17 40 18 40  |.@.@.@.@.@.@.@.@|
00000030  19 40 1a 40 1b 40 1c 40  1d 40 1e 40 1f 40 20 40  |.@.@.@.@.@.@.@ @|
00000040  21 40 22 40 23 40 24 40  25 40 26 40 27 40 28 40  |!@"@#@$@%@&@'@(@|
...[etc.]...

Then, later at some point, I noticed this strange sequence in /dev/sda:

...[etc.]...
0000ffd0  e9 3f ea 3f eb 3f ec 3f  ed 3f ee 3f ef 3f f0 3f  |.?.?.?.?.?.?.?.?|
0000ffe0  f1 3f f2 3f f3 3f f4 3f  f5 3f f6 3f f7 3f f8 3f  |.?.?.?.?.?.?.?.?|
0000fff0  f9 3f fa 3f fb 3f fc 3f  fd 3f fe 3f ff 3f 00 40  |.?.?.?.?.?.?.?.@|
00010000  91 89 76 7c 8c 5e 7e 66  c7 86 80 00 ff ff ff ff  |..v|.^~f........|
00010010  8b d6 03 56 0b 89 56 78  8c 5e 7a 1e b8 70 00 8e  |...V..Vx.^z..p..|
00010020  d8 8e c0 33 ff f3 a4 1f  89 7e 72 8c 46 74 81 c7  |...3.....~r.Ft..|
...[etc.]...

A strange break in a strange pattern at precisely a power of 2! On further investigation, the strange pattern would continue flawlessly if the start of /dev/sdb continued it. So I took ffff, which is 65535 in decimal, and plugged that into my script. This time the ls output looked much more promising, though still weird, with filenames like sdos s.ys, onfig s.ys and utoexecb.at.

D'oh! I was one byte off! The break sits at offset 0x10000, so the chunk is 65536 bytes, not 65535. So I added a byte and Bam! The Dell utility partition worked precisely as it should.
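
The hex dumps read naturally as 16-bit little-endian numbers counting up (e9 3f is 0x3fe9, then 0x3fea, and so on up to 0x4000), so the hunt for where the pattern breaks could be automated. This is only a sketch of that idea; the search in this story was done by eye in a hex editor. end_of_counter_run is a made-up helper name, and the sample data is rebuilt from the dump above rather than read from the real disk.

```python
import struct


def end_of_counter_run(data, start=0):
    """Return the offset where a run of incrementing 16-bit little-endian
    values, beginning at `start`, stops incrementing."""
    (previous,) = struct.unpack_from("<H", data, start)
    offset = start + 2
    while offset + 2 <= len(data):
        (value,) = struct.unpack_from("<H", data, offset)
        if value != (previous + 1) & 0xFFFF:
            return offset
        previous = value
        offset += 2
    return len(data)


if __name__ == "__main__":
    # Recreate the tail of the first stripe: counters up to 0x4000, then other data.
    counters = b"".join(struct.pack("<H", n) for n in range(0x3FE9, 0x4001))
    sample = b"\x00" * 0xFFD0 + counters + bytes.fromhex("9189767c8c5e7e66")
    print(hex(end_of_counter_run(sample, start=0xFFD0)))
```

On this sample it reports 0x10000, the same offset where the dump of /dev/sda changes character.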

So, strictly speaking, I don't know if we'll succeed in getting the data off the array. We'll have to wait a few days to find out how the rest of the second hard drive fares, as it's taking an inordinately long amount of time to get that data off. However, whatever happens from here, my theory paid off!

That's a beautiful thing for a geek like me. :-D

Oh, and it's cool that we might be able to save all/some of that data, too. ;-)

Source

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# deinterlace.py
#
# INTENT = This is a script for deinterlacing two raw dd images
#     taken from a failed RAID 0 array into one "valid" image file
#     that we hope to be able to recover data from.
#
#          This is strictly experimental.
#
#                               Thursday, May 1, 2008 -Simón A. Ruiz
#

inputFiles = [open("dev.sda","rb"),open("dev.sdb","rb")]
outputFile = open("output","wb")
chunkSize = 65536

# And, so as not to have to figure this out every time through the loop...
numFiles = len(inputFiles)

i = 0
while True:
    nextChunk = inputFiles[i%numFiles].read(chunkSize)
    if not nextChunk:
        print('Done! No more data.')
        break
    outputFile.write(nextChunk)
    i += 1

outputFile.close()
for inputFile in inputFiles:
    inputFile.close()
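
One way to gain some confidence in a script like this before pointing it at real disks is a round trip: stripe some known data across two fake "disks" the way RAID 0 is assumed to (chunk 0 to the first disk, chunk 1 to the second, and so on), then deinterlace and compare. The sketch below does that in memory. stripe and deinterlace are illustrative names, and the simple alternating layout is an assumption about how this particular controller worked, though it's the one that panned out above.

```python
import io


def stripe(data, chunk_size, num_disks=2):
    """Split data RAID 0 style: chunk 0 to disk 0, chunk 1 to disk 1, and so on."""
    disks = [bytearray() for _ in range(num_disks)]
    for n, start in enumerate(range(0, len(data), chunk_size)):
        disks[n % num_disks] += data[start:start + chunk_size]
    return [bytes(disk) for disk in disks]


def deinterlace(inputs, output, chunk_size):
    """Same loop as the script above: read a chunk from each input in turn."""
    i = 0
    while True:
        chunk = inputs[i % len(inputs)].read(chunk_size)
        if not chunk:
            break
        output.write(chunk)
        i += 1


if __name__ == "__main__":
    original = bytes(range(256)) * 1000  # 256,000 bytes of test data
    halves = stripe(original, chunk_size=65536)
    rebuilt = io.BytesIO()
    deinterlace([io.BytesIO(h) for h in halves], rebuilt, chunk_size=65536)
    print("round trip OK" if rebuilt.getvalue() == original else "mismatch")
```

Rerunning the deinterlace step with a chunk size of 65535 instead makes the comparison fail, which is the same one-byte-off mistake described above.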

Stay tuned for a post on mounting disk images as if they were real disks.

[My next post talks about disk images in depth]

Thursday, February 21, 2008

Creative Commons announces "Approved for Free Cultural Works" seal.

Via GNUosphere.

Seal: APPROVED FOR Free Cultural Works

Interestingly, I was just having this very discussion on a mailing list.

A license of CC-BY-NC-SA was proposed for a work that the whole community was going to participate in.

I proposed CC-BY-SA as being more in line with our principles (we're a community based on Free Software), and posited that since we enjoy the freedoms granted by the GPL, we should pay it forward by granting the same freedoms in our works.

"It'll protect the work", they said, "from being used in an inappropriate context. We want to make sure no one has to pay money for the fruits of our labor, and anyways it's a CC license; it's free!"

I argued along the definition of freedom, based on the GPL, proposed by Benjamin Mako Hill, that free (libre) does not include works that are restricted to non-commercial use.

I pointed out that, as a commercial clause, NC's only benefits to a copyright holder would be in the commercial sphere. Like, say a company puts out some training material as CC-BY-NC-SA. This allows them to get some of the benefits of CC licensing (publicity, mostly) while still maintaining a monopoly on the commercial use of the materials.

(Canonical recently did this, announcing an experiment to license their commercial work CC-BY-NC-SA and that they intend to use the NC clause to make sure that anyone who uses their materials commercially (seriously huge amounts of material) has to "pay" for that privilege by giving back to the community.)

I can understand the use of an NC clause by a commercial company licensing a commercial product they would otherwise keep under full copyright, I said, but a community that doesn't plan on maintaining a commercial monopoly receives no benefit from an NC clause. They're merely restricting freedom to no real purpose.

I believe the Attribution and ShareAlike clauses are protection enough to keep people from using a work inappropriately. GNU, Linux and Ubuntu are excellent examples.

Interestingly, I managed to sway the original proponents and supporters of the NC clause to my way of thinking, but by then they'd already convinced enough other people, who didn't really understand my points, that a vote came to a tie and the matter was left unresolved.

I think it's really cool, and absolutely appropriate, that the CC foundation is taking this step. Hopefully, it will help educate people about what freedom means.

P.S. If you've gotten this far in my post, YOU WILL CARE ABOUT THIS: Lawrence Lessig is considering running for Congress under his new Change Congress campaign. Do yourself a favor and see what he has to say.

P.S. Lessig started the whole Creative Commons movement, for those who don't know.

More worthwhile video at his site, including his final lecture on "Free Culture" given at Stanford. Like I said, do yourself the favor of watching it.

P.P.S. I hope to start posting more frequently. We'll see.

Thursday, September 27, 2007

Take me out to the LinuxFest! (OLFU Eve)

We just arrived, the ride went smoothly, though I decided to take a scenic route part of the way in defiance of Google.

We walked around and situated ourselves. I saw the room where the scheduled event for tonight (7-10 tonight, we got here at 10:30) had taken place, and there was lots of pizza out and about the big room, but when I peeked my head in I recognized Beth Lynn in a small group of four people talking intensely and the rest of the room was deserted; I figured it'd probably be best NOT to bother our event organizers, so I didn't.

Got back to my room and went to plug in my laptop to write a little bit only to find that *gasp* I didn't bring my laptop's power supply (idiot!).

I'll either need to use my fiancée's laptop for class tomorrow, or maybe someone there will have an HP/Compaq laptop power supply that I can mooch off of for the duration.

Whoops!

It's sleep time...

Powered by ScribeFire.

Sunday, September 16, 2007

Beginning of the School Year

Boy, summer was interesting, and the school year is underway.

Summertime...and the livin's easy...

Let me share a little bit about my summer.

First of all, I got to spend a good month's worth of vacation. For the first week, my fiancée and I visited my family out in Kansas. I got to spend some good time with my grandmother, who I hadn't seen in a while, and who passed away around a month later. I must say that I'm grateful that I got to spend that time with her; I think it may have been much sadder for me if I hadn't.

We spent the rest of our vacation visiting family in Venezuela, my beloved fatherland. My fiancée had been there before with me, last year for about a week, but this was the first time that she'd been there for any sort of extended period of time where it wasn't rush, rush, rush. Unfortunately she did become sick when we ate somewhere we really shouldn't've, but otherwise we had a beautiful time there.

Back to work

The day I got back from vacation, we had to move the Technology department's office (desks, supply storage, servers, etc.) into its new location (did I mention I started work here immediately before a nine million dollar renovation project?). This wasn't SO bad, except that the power only got turned on about halfway through that day, the network drops weren't done until a few days later, and it took a few weeks to get our cabinetry in.

Oh, and the water...yes, the water. It turns out that whoever is responsible for scheduling the construction project thought "Let's hope it doesn't rain!" is a valid plan when it comes to ensuring the most expensive and critical technology equipment in the school is safe.

Yes, they moved us into a part of the building where the roof was being removed to make way for the new addition on the second story. For the next four weeks, it consistently rained once or twice a week, and all that water ended up in our office.

It turns out Dell servers are pretty well designed to withstand water falling on them, though I somehow doubt that was deliberate; who in their right mind would put servers in any danger of being rained on? Needless to say, we became intimately acquainted with the joys of tarping everything every night before we left work.

As a result of that situation, as well as running around doing disaster recovery type stuff—like, say, finding a hundred-foot extension cord to get power into the network closet, since someone had sawed through a main power line—we became sort of swamped between the work we had been hoping to do over the summer, the work that needed to be done with picking up the pieces from the ongoing renovation, and the normal beginning-of-the-school-year work.

I'd like to get more HOWTO posts up here (I'd love to), but we've been doing a lot of work just to maintain our technology situation, so I haven't had much of an opportunity to sit down and improve it enough to have any good writing material.

Teaching

My classes are going well, co-teaching with my boss Vern. We have a Python class and a Java class, and I've got to say that so far I prefer Python. I've played with Python before, but I always quickly forgot it because I didn't use it; this class is a concrete project to use it and get it learned, and the more I learn about it, the more projects I should be able to use it for.

The Indiana LoCo Team & Ohio LinuxFest

The Ubuntu Indiana Local Community Team has become a bit more active after the summer break, and we're looking to meet up at the Ohio LinuxFest this year. If you're into Ubuntu and you're in Indiana, I encourage you to get involved in the team, and if you're anyone reading this I encourage you to go to the Ohio LinuxFest—if you fall into both camps, come join our gathering during the lunch break, we'll be the big group of people with matching T-shirts having a great time. :-)

Also related to the Ohio LinuxFest, this year they're having an Ohio LinuxFest University day of classes the Friday before. It's a little bit more expensive than just going to the LinuxFest itself, but I'm hoping it's worth every penny—I'm going to take the Linux Professional Institute Level One Cram Session class, and take the LPI 101 exam on the Sunday after.

I've been warned that it's quite the difficult exam, so I look forward to the challenge. I'm going through O'Reilly's LPI Linux Certification in a Nutshell book, and reading man pages and playing with stuff (including a Fedora 7 virtual machine, as "Use Red Hat Package Manager" is one of the most important things tested) to see if I can pass it this time.

Note, I will get this certification; the only thing in question is whether I pass the 101 exam on my first try. The test is designed for people with "two years of experience" with Linux system administration. I've got about half of that, maybe, but I've also got an interest in it, and determination. We'll see how it works out; either way I'll learn a lot about the tools at my command. :-)

Until next time

I hope to post more frequently here. The bar has been set dramatically low since I moved here, so that shouldn't be much of a challenge. I'll keep a neuron out for some interesting post fodder as I go about my work, maybe soon we'll get to the point where I can work on some good system improving—that often makes for some interesting stuff to post (interesting to me at least).

Until then, take care!


Sunday, May 20, 2007

First from Fort Wayne - He's Alive!

I've been left with very little time to work on personal stuff since the move, but I'm fighting to make some time. It's all a bit overwhelming, but I'm having an awesome time at my new job. I learn about 50 new things a day as I get a feeling for the new system and the culture of the school.

System-wise, I'm now responsible for everything...that means I can no longer tell people "Well, that sucks, I'm going to have to call the tech. She'll be in tomorrow, hopefully, or the day after" or "Yeah, that's a server thing and we really have no control over that" since the buck stops with me. I'll have to take complete ownership of everything over the next 5 weeks or so as I gradually learn the ropes from my predecessor (I'm incredibly glad I decided not to try to wait until the end of the year to show up and learn the ropes...) and take them over bit by bit.

I've learned a lot from Jeremy and Vern already, and I'd like to think I've taught them a little, too. I can see a few things in the current system that I'd like to work on changing, but my number one priority right now is learning the system as it is.

I'm going to have to refamiliarize myself with Python and do more than scratch the surface this time, since I'm going to be teaching a Python class during the Fall semester of next school year. Ditto with C during the Spring semester (though I actually have no previous experience with C). It's going to be an interesting challenge, and I'm up for it.

Culture-wise, there are quite a few changes to get used to. For example, I've got to wear a shirt and tie to work for the very first time ever. I did not have a very suitable wardrobe when I moved up here and I actually made the investment of going to a "Big & Tall" store so I could find nice clothing that I could stand to be in for that long every day.

I'm no longer Simón to the kids, I'm now Mr. Ruiz.

I've not done much in the way of expanding my social life outside the context of work besides going to the local Linux Users' Group activities. I presented this past Thursday for them showing off the new Feisty Fawn installed on an HP Tablet PC (tc4200, if you're curious), and we've been talking about trying to host a LinuxFest here in Fort Wayne.

I really need to put in my presentation proposal for the Ohio LinuxFest already. They sent out their call for presenters before I left Bloomington...

My new boss, Vern, and I have been discussing putting together a full-day workshop for next year's HECC conference (this year I helped out Mike Huffman and Forrest Gaston with their Indiana ACCESS workshop). It also seems like work might pay to send me to NECC this year, which sounds pretty cool.

Anyhow, just wanted you all to know I'm still alive. I'm going to try to post more often (yes, I know, that's not too hard). And yes, I know I need to change my blog's title RSN.
