spaceblog

codefest, project overload

At K’s codefest yesterday I spent most of the time reading papers on using linear algebra to approximate spring embedders for three-dimensional graph layout. Lots of pretty pictures in graph drawing papers; alas, the eigenvectors melted my brain. Today I tried to find my old uni linear algebra textbook, but it eluded me.

Instead I hacked on a chroot builder for Red Hat 8.0 and 9 that I’d started on Friday. I didn’t work on it yesterday at the codefest due to the interesting quirk that Red Hat’s i386 packages don’t really like installing on a powerpc architecture, which I suppose could have been solved with the --ignorearch option, but I didn’t bother to delve. Anyway, back at home the trusty workstation built minimal Red Hat roots many times during the day. This is just stage one in my master plan.
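The guts of such a builder boil down to rpm’s --root option; a rough sketch (the mirror path and package list here are illustrative, not what the script actually does):

ROOT=/var/tmp/rh9-root
RPMS=/mirror/redhat/9/RedHat/RPMS
mkdir -p $ROOT/var/lib/rpm
rpm --root $ROOT --initdb
# just enough packages to get a working shell in the chroot
rpm --root $ROOT -ivh --nodeps $RPMS/{setup,filesystem,glibc,bash,rpm}-*.rpm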

It sucks to have three or four projects on the go at once: I find I never do any of them… or they block on a hard problem and stagnate while I procrastinate and do something else. Needless to say, the garden is looking cleaner and the dishes get washed regularly.

databases, elisp, vim major mode

Well, last week sucked a lot. An upside of being overworked is that I now know more about database performance tuning and OS scheduling than I did before, and I honed some profiling skills. All this whilst working hard and fast in a low-morale environment! I must be made of steel or something.

Anyway, I’ve now got some more emacs major modes: in addition to zonefile.el, there’s graphviz.el for syntax highlighting of graphviz style dot files and filtergen.el for highlighting the filtergen rules syntax. These, again, are in my elisp--main--0 arch repository.

James had a good idea last week: a major mode that reads the vim syntax files, thus giving emacs users access to all the existing vim highlighting. So evil it just might work.

zonefile.el

The other day I was adding some entries to our DNS, using the power of emacs tramp, and it occurred to me that the zone file wasn’t syntax highlighted the way vim does it, and that, unlike with the wrapper scripts Eman wrote, I had to manually increment the serial number and reload the nameserver myself. That’s so much effort.

So just now I whipped up a zonefile.el that so far only does the syntax highlighting, but it’s a good start. It’s in my arch repo jaq@spacepants.org--2004 under the imaginatively titled branch elisp--main--0.

Anchor's RAID Migration howto

(reposted from http://lists.slug.org.au/archives/slug/2004/05/msg00374.html)

So here’s Anchor’s RAID migration document, tweaked for Debian unstable (previously it’d only been done on various flavours of Red Hat), based on the migration I did last night (and I expect I’ll be doing it again today on a different machine).

I’m going to skip the parts that migrate an existing RAID configuration to bigger disks or an alternate RAID scheme, and instead assume you’ve got two disks in a box and you performed the install onto the first disk (hda), thus leaving the second disk (hdb) unused.

  • Partition hdb how you like it. My server partitioning scheme is like this for IDE machines:
hda1 500M /
hda2 2G swap
hda3 Extended
hda5 2G /var
hda6 4G /usr
hda7 remaining /data

/data contains bind mount points for /home, /var/lib/{mysql,postgres}
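If you’d rather script the partitioning than drive fdisk by hand, sfdisk takes the whole table on stdin; something like this should reproduce the scheme above on hdb (sizes in MB via -uM, and the input format here is from memory, so sanity check it against sfdisk(8) first):

sfdisk -uM /dev/hdb <<EOF
,500,L
,2048,S
,,E
;
,2048,L
,4096,L
,,L
EOF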

  • Create the degraded RAID arrays.
mdadm -C -l 1 -n 2 /dev/md0 /dev/hdb1 missing
mdadm -C -l 1 -n 2 /dev/md1 /dev/hdb5 missing
mdadm -C -l 1 -n 2 /dev/md2 /dev/hdb6 missing
mdadm -C -l 1 -n 2 /dev/md3 /dev/hdb7 missing

-l specifies the RAID level, in this case RAID-1, and -n specifies the number of devices that will be in the array, in our case 2. The keyword missing tells mdadm that there’s another device we haven’t specified yet, so each array is built degraded, on one half.

If you’re doing this on RAID-5, then you’d use two disks and a missing.
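Before going on, it’s worth checking that the degraded arrays actually came up:

cat /proc/mdstat
mdadm --detail /dev/md0

/proc/mdstat should show each array as [2/1], with the second half missing.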

  • Create the new filesystems:
mke2fs -j /dev/md0
...
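Spelled out for all four arrays (the mapping to mount points follows the partition scheme above):

mke2fs -j /dev/md0    # new /
mke2fs -j /dev/md1    # new /var
mke2fs -j /dev/md2    # new /usr
mke2fs -j /dev/md3    # new /data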
  • Mount the new partitions.
mkdir /newroot
mount /dev/md0 /newroot
cd /newroot
mkdir usr var data
mount ...
mkdir data/var.lib.postgres
mkdir -p var/lib/postgres
mount -o bind data/var.lib.postgres var/lib/postgres
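Written out in full, the elided mounts plus the remaining bind mounts (device mapping assumed from my partition scheme, the extra bind mounts from the list earlier) would be:

mount /dev/md1 var
mount /dev/md2 usr
mount /dev/md3 data
mkdir home data/home data/var.lib.mysql
mkdir -p var/lib/mysql
mount -o bind data/home home
mount -o bind data/var.lib.mysql var/lib/mysql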
  • Copy the data from the existing root to the new root
cd /
# mountpoints from the scheme above; "$m/." copies contents,
# -x keeps cp from crossing onto other filesystems
for m in / /var /usr /data; do cp -ax "$m/." "/newroot$m"; done
  • Now comes the fun part. Shut down all services that are writing to the disk, using ps ax and netstat -lnp to find out who’s still alive.
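For example (exactly what’s running is machine-specific, so these init scripts are just illustrations):

/etc/init.d/apache stop
/etc/init.d/mysql stop
/etc/init.d/cron stop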

  • Rsync the data that just got written

# same mountpoint list as before; the -n makes this a dry run
for m in / /var /usr /data; do rsync -avnx "$m/" "/newroot$m/"; done

Double-check that it did what you expected, then rerun without the -n option.

  • Pivot the kernel onto the new root:
mkdir /newroot/oldroot
cd /newroot
pivot_root . oldroot

If you’re on the console, you can take over the new root straight away:

exec chroot . /bin/sh <dev/console >dev/console 2>&1

Otherwise, if you’re playing tough-guy migration via SSH, don’t do this yet.

mount -t proc proc /proc
mount -t devpts devpts /dev/pts
mount -t tmpfs tmpfs /tmp

and for 2.6 kernels

mount -t sysfs sysfs /sys
  • Restart init and SSH:
telinit u
/etc/init.d/ssh restart

Now if you’re in tough guy mode, ssh into the new machine.

fuser -vm /oldroot should show you a few kernel threads and your first ssh session. If the ssh restart was successful and you’re logged in, log out of the first ssh session.

  • Umount the old root:

see the processes holding up the umount:

fuser -mv /oldroot

see the mounts holding up the umount:

cat /proc/mounts

umount the virtual filesystems from oldroot:

umount /oldroot/proc
umount /oldroot/dev/pts

and so on
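Given what the old root had mounted at boot, that probably works out to something like:

umount /oldroot/proc
umount /oldroot/dev/pts
umount /oldroot/tmp     # if /tmp was a tmpfs
umount /oldroot/sys     # 2.6 kernels only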

Chances are /proc/mounts says you’ve got a /dev/root.old and a /dev2/root2, and you’ve got some kernel threads attached to /oldroot/initrd:

mount -o remount,ro /dev/root.old /oldroot/initrd
mount -o remount,ro /dev2/root2 /oldroot

umount -l /oldroot/initrd
umount /oldroot

The -l option to umount is a recent feature that does a lazy umount: it removes the mount point from the filesystem namespace, so it’s effectively gone, but it’ll only be properly umounted once all processes using it have finished. It’s a good idea to remount it read-only first, just so you don’t break anything.

  • Fix /etc/fstab and /etc/mtab

/etc/fstab has the old hda filesystems on it, so fix that up.

/etc/mtab has the old devices listed because the pivot_root doesn’t update it, so fix that up too. Cross check against /proc/mounts.
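For reference, the new fstab entries come out looking something like this (the options are my assumptions of sensible defaults; adjust to taste):

/dev/md0   /      ext3  defaults  0 1
/dev/md1   /var   ext3  defaults  0 2
/dev/md2   /usr   ext3  defaults  0 2
/dev/md3   /data  ext3  defaults  0 2
/data/var.lib.postgres  /var/lib/postgres  none  bind  0 0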

  • Update the boot loader.

Debian unstable uses grub, so something like this will install the first-stage bootloader into both MBRs:

grub
grub> device (hd0) /dev/hda
grub> device (hd1) /dev/hdb
grub> root (hd0,0)
grub> setup (hd0)
grub> root (hd1,0)
grub> setup (hd1)

That tells grub that its hd0 is Linux’s hda, to use hda1 as the location for grub’s files (/boot/grub, as /boot is on / in my case), and to install the MBR on /dev/hda. The second pass does the same thing for hdb, using /dev/hdb1 and hdb’s MBR.

Double-check it worked by looking for the string GRUB in the output of:

dd if=/dev/hdX count=1 | strings

  • Fix the initrd

I always forget this part and end up booting off half of the / array, which has the effect of destroying the RAID superblock and requires /dev/md0 to be reconstructed afterwards. So the moral is: DON’T FORGET THIS PART.

mkinitrd -k -o /boot/initrd.img.tmp

Look in the temporary directory that mkinitrd left its files in (/tmp/mkinitrd.*/initrd, kept around thanks to -k) and make sure that the file ‘script’ contains a line that builds the /dev/md0 array, like this:

mdadm -A /devfs/md/0 -R -u ...

It’ll probably be building the array using only /dev/hdb1 at this point; that’s fine.
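A quick way to eyeball that line without opening an editor:

grep mdadm /tmp/mkinitrd.*/initrd/script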

  • Fix mdadm.conf.

/etc/init.d/mdadm and /etc/init.d/mdadm-raid will automagically build the remaining arrays at boot time if you get this right; otherwise the fsck will bomb out because /dev/md1 and friends are corrupted (i.e. they don’t really exist).

So, in /etc/mdadm/mdadm.conf:

DEVICE /dev/hda* /dev/hdb*
ARRAY /dev/md0 devices=/dev/hda1,/dev/hdb1
ARRAY /dev/md1 devices=/dev/hda5,/dev/hdb5
ARRAY /dev/md2 devices=/dev/hda6,/dev/hdb6
ARRAY /dev/md3 devices=/dev/hda7,/dev/hdb7

Make sure you remember the DEVICE line, otherwise it’ll still fail…
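You can cross-check the ARRAY lines you’ve written against what mdadm itself finds on the disks:

mdadm --examine --scan

which prints an ARRAY line (keyed by UUID) for every array member it can see.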

  • Reconstruct the RAID array from the now free hda
sfdisk -d /dev/hdb | sfdisk /dev/hda

That’ll copy the partition table from hdb, your good disk, to hda, the missing disk.

Hot add the partitions to the array:

mdadm -a /dev/md0 /dev/hda1
mdadm -a /dev/md1 /dev/hda5
mdadm -a /dev/md2 /dev/hda6
mdadm -a /dev/md3 /dev/hda7

ramp up the reconstruction speed:

echo 1000000000 > /proc/sys/dev/raid/speed_limit_max

watch the progress:

watch "cat /proc/mdstat"
  • Do the boot loader again, because it’s fun, and likely stuff has moved around on hda1.

Don’t do anything on hda until the raid reconstruction is finished.

At this point you can continue using the machine; in fact, as early as the “umount oldroot” step you can restart all your services and the machine will be back online. That makes the downtime only as long as the final rsync before pivoting takes.

I’d recommend rebooting soon after, though, so you can make sure you got the bootloader and initrd parts right. During booting you can get away with changing the kernel root= option to use /dev/hda1 if the RAID array isn’t getting constructed in your initrd.img.tmp. Don’t delete or overwrite any of your initrds whilst you’re debugging; only once the machine boots without assistance should you overwrite the initrd.img that’s listed in the grub menu.lst.
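For reference, that means interrupting grub legacy at its menu, pressing e to edit the kernel line and b to boot the result, ending up with something like this (kernel filename hypothetical; use whatever your menu.lst already has):

kernel /boot/vmlinuz-2.4.26-1-686 root=/dev/hda1 ro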