My motivation was to replace four separate real systems that I run 24/7 as firewall (P3), web server (U10), mail server (XP 2500+) and test system (XP 3200+) with a single low power box that can do it all, without compromising the security provided by four physically separate boxes. My power meter shows this will save about £40/month in running costs.
The bare metal install on the host will be minimal: a Gentoo hardened system, pared to the bone. There is no point in not using hardened.
There is no reason at all to run any extra services on the bare metal. Its sole purpose is to support virtual machines. Should you need another service, make another virtual machine.
This document will use host provided logical volumes and the virtio hard disk and network drivers. At the time of writing, these drivers provide near native performance without any known security issues.
As there are no live CDs that provide the virtio drivers, using them in guests is a two step process. Get the guest running on its own kernel in the conventional way, then swap drivers.
This document was written around an install on an AMD Athlon(tm) II Neo N36L Dual-Core Processor, with odds and ends tested on AMD Phenom(tm) II X6 1090T Processor.
Installs on Intel CPUs are similar.
As we shall use LVM for the VM storage, we may as well use it for the host too.
To move with the times we shall abandon fdisk for parted, and MSDOS partition tables for GPT. That's in anticipation of hard drives bigger than 2TiB.
In setting up my own system, I drew heavily on the Fedora Virtualization Guide (Fedora Project, http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/index.html) and Setting Up Virtual Machines with KVM (http://pacita.org/books/server-setup/output/pdf/doc.pdf).
A modern 64 bit Intel or AMD processor with hardware support for virtualisation. Hardware support is not strictly necessary; exactly what you need depends on what your load will be.
This document assumes you are installing KVM on a purpose built remote box. Remote may only be a few feet away, but the intention is that everything after the point where the bare metal can boot by itself will be done over ssh, or using Virtual Machine Manager.
As it's normal to set up VMs on a server, the use of kernel raid, and root on lvm over the raid, will be described. The raid and lvm steps are optional for the host install.
lvm will be required for the VM storage pool even on single drive installs. It's perfectly possible to have VM storage in a file on the host, but this is suboptimal and will not be described in this document.
Boot the live CD/DVD of your choice and use parted to partition all of your drives identically. The following partitions are required.
This allows some space for expansion in the host. LVM supports on-line resizing, so it's possible to grow a logical volume without a reboot.
Before diving into parted to make disk labels (partition tables) and partitions, think about what is needed. This document uses an example of three drives. /boot is easy: it will be a three way raid1 set, so 32MB from each drive is required. The 30G raid5 for the host install is not quite so straightforward. For a three drive raid5, each partition needs to be 15G, since raid5 gives one drive's worth of space to parity. Keep that in mind as you use parted.
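The sizing rule can be checked with a line of shell arithmetic; the three drives and 15G partitions are this document's example values:

```shell
# raid5 keeps one drive's worth of space for parity,
# so usable space = (drives - 1) * per-drive partition size
DRIVES=3
PART_GB=15
echo "$(( (DRIVES - 1) * PART_GB ))G usable"   # prints "30G usable"
```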
parted /dev/sda
mklabel gpt
mkpart primary 0% 32M
mkpart primary 32M 15G
mkpart primary 15G 100%
name 1 boot
name 2 host
name 3 virtual
set 1 boot on
quit
Repeat for /dev/sdb and /dev/sdc.
boot will be raid1, the other two will be raid5. This gives us root on raid5 and lvm, which compels the use of an initrd. Swap will also be on a logical volume.
mdadm --create /dev/md0 --metadata=0.90 --level=1 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2
mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sda3 /dev/sdb3 /dev/sdc3
Donate the two raid5 sets to two lvm volume groups (vgcreate initialises each device as a physical volume along the way). It is not essential to have separate volume groups for the host and VMs, but it avoids accidentally deleting a part of the host file system when you intended to delete a VM.
vgcreate host /dev/md1
vgcreate vm /dev/md2
With the volume groups created, they can be subdivided into logical volumes, which we can finally format like any other block device and use for our install.
lvcreate --size 512M --name root host
lvcreate --size 4G --name var host
lvcreate --size 4G --name usr host
lvcreate --size 1G --name tmp host
lvcreate --size 512M --name portage host
lvcreate --size 2G --name distfiles host
lvcreate --size 2G --name packages host
lvcreate --size 8G --name swap host
This leaves about 8G of unallocated space in the host volume group for expansion at a later date.
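The arithmetic behind that figure, ignoring LVM metadata overhead (which will shave a little off the real number):

```shell
# root + var + usr + tmp + portage + distfiles + packages + swap, in MiB
ALLOCATED=$(( 512 + 4096 + 4096 + 1024 + 512 + 2048 + 2048 + 8192 ))
TOTAL=$(( 30 * 1024 ))            # the 30G host volume group
echo "$(( (TOTAL - ALLOCATED) / 1024 ))G unallocated"   # prints "8G unallocated"
```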
Check your /dev/mapper. It should contain eight logical volumes of the format host-...
Readers of a nervous disposition may wonder about the use of so many partitions for the host. It allows for the efficient use of disk space. Only root and swap are really needed. (/boot is our /dev/md0)
We could use ext2 on tmp, portage, distfiles and packages, as they contain things that are easily replaced. However, ext4 has an option to not create a journal, so we will use ext4 everywhere. The use of the dir_index option is really some gentle ricing on the host install but may come into its own later.
mkswap /dev/mapper/host-swap
mkfs.ext4 -O ^has_journal,dir_index /dev/md0
mkfs.ext4 -O dir_index /dev/mapper/host-root
mkfs.ext4 -O dir_index /dev/mapper/host-var
mkfs.ext4 -O dir_index /dev/mapper/host-usr
mkfs.ext4 -O ^has_journal,dir_index /dev/mapper/host-tmp
mkfs.ext4 -O ^has_journal,dir_index -b 1024 -i 1024 /dev/mapper/host-portage
mkfs.ext4 -O ^has_journal,dir_index /dev/mapper/host-distfiles
mkfs.ext4 -O ^has_journal,dir_index /dev/mapper/host-packages
Mount the partitions, making the required directories as we go. It's not quite as simple as the three partition layout used by the Gentoo handbook.
swapon /dev/mapper/host-swap
mount /dev/mapper/host-root /mnt/gentoo
mkdir /mnt/gentoo/boot
mkdir /mnt/gentoo/tmp
mkdir /mnt/gentoo/usr
mkdir /mnt/gentoo/var
mount /dev/md0 /mnt/gentoo/boot
mount /dev/mapper/host-tmp /mnt/gentoo/tmp
mount /dev/mapper/host-usr /mnt/gentoo/usr
mount /dev/mapper/host-var /mnt/gentoo/var
mkdir /mnt/gentoo/usr/portage
mount /dev/mapper/host-portage /mnt/gentoo/usr/portage
mkdir /mnt/gentoo/usr/portage/distfiles
mkdir /mnt/gentoo/usr/portage/packages
mount /dev/mapper/host-distfiles /mnt/gentoo/usr/portage/distfiles
mount /dev/mapper/host-packages /mnt/gentoo/usr/portage/packages
Other mount points, like /dev/ and /proc will be created by the stage3. Fetch and install the hardened stage3 in the normal manner.
Kernel Raid and Logical Volume Manager are both extra layers of software between the hardware and the applications. Raid provides redundancy: if a disk fails, your system will keep running. Logical Volume Manager provides flexibility: logical volumes can be grown and shrunk to move free space around as needed, provided you choose a filesystem that supports resizing.
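As an example of that flexibility, growing /usr by a gigabyte later might look like this. This is a sketch assuming the host layout used in this document; resize2fs can grow a mounted ext4 filesystem on-line:

```shell
# extend the logical volume by 1G from the volume group's free space
lvextend --size +1G /dev/mapper/host-usr
# grow the filesystem to fill the enlarged volume
resize2fs /dev/mapper/host-usr
```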
Swap should be the same size as your RAM, as this allows RAM in the VMs to be overcommitted. VMs are just processes to the host; when you run out of RAM, parts of VMs can be swapped out. With too little swap, the kernel Out Of Memory manager will kick in and maybe kill a VM, which is just like a power fail, only faster.
With the drives partitioned, filesystems made and mounted, it's time to do a normal Gentoo install by following the handbook, with a few minor exceptions.
If you run emerge --info now you will see that it is full of things you will never need on a virtual machine host system. Most of the USE_EXPAND variables can be set to the null string to get rid of the clutter. This has no effect on the install but it makes emerge --info easier to interpret. Add the following to /mnt/gentoo/etc/make.conf
# Unset the following USE_EXPANDs
ALSA_CARDS=""
ALSA_PCM_PLUGINS=""
APACHE2_MODULES=""
CALLIGRA_FEATURES=""
CAMERAS=""
COLLECTD_PLUGINS=""
GPSD_PROTOCOLS=""
INPUT_DEVICES=""
LCD_DEVICES=""
SANE_BACKENDS=""
VIDEO_CARDS=""
Remove some use flags we do not want by adding the following to the USE= in /mnt/gentoo/etc/make.conf
-X -cups -dri -gnome -kde
Add buildpkg to FEATURES in make.conf. This saves a tarball of every package that is built to /usr/portage/packages in a format that can be used by emerge. We shall use most of these packages later.
FEATURES="buildpkg"

If you want to use distcc, you must not use -march=native in CFLAGS unless the helper(s) have identical CPUs. If you don't know what distcc is, you won't be using it.
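Later, those saved binary packages can be installed in the template VM without recompiling, by asking emerge to prefer them. A sketch, assuming the package directory is visible to the guest, and using app-editors/nano purely as an example package:

```shell
# -k / --usepkg installs from a binary package when one exists,
# falling back to compiling from source otherwise
emerge --usepkg app-editors/nano
```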
Make the directory /mnt/gentoo/etc/portage and add the file package.use with the following contents
# for initrd use, these packages must be statically linked
sys-fs/lvm2 static
sys-fs/mdadm static
sys-apps/busybox static
# for virtual machine support
app-emulation/qemu-kvm sdl threads vde vhost-net
# for libvirt with parted support so we can use lvm storage pools for VMs
sys-block/parted device-mapper
# to get consoles in an X window but we don't want an X server
media-libs/libsdl X
app-emulation/libvirt qemu virt-network numa lvm parted pcap phyp udev
# unset libvirt USE flags
# -avahi -caps -debug -iscsi -macvtap -nfs -numa -openvz -sasl (-selinux) -uml -virtualbox -xen
The settings given here are in addition to your normal hardware support. Should you need option by option support to build a kernel, kernel-seeds.org is recommended.
An initrd is just a root file system in a file, which is loaded by the boot loader and left where the kernel can find it at /dev/ram0. It needs some /dev nodes so it can operate on devices, some programs to run, and a script to tell the kernel what to do. There are several tools to make initrd files, but it's easy to do it manually too.
Initrds can do anything the system can do, but ours will do the bare minimum to get us booted. It will not load kernel modules or anything fancy. All the drivers you need to boot will still need to be compiled into the kernel.
When the initrd is actually in use, it's just the kernel and the initrd in memory; there are no libraries and no dynamic linkers, so all the programs must be statically linked. The size of the initrd file does not really matter, as the RAM it occupies is freed as soon as it's done its job.
Start by making some space to assemble the initrd. The entire content of this directory will be made into a single file as the last step in the process.
mkdir /root/initrd
Make directories in /root/initrd. These directories have exactly the same uses as their counterparts on the real root file system, except that /sbin and /bin have been combined into bin.
cd /root/initrd/ mkdir bin dev etc newroot proc sys
Now to populate these directories. bin needs three statically linked programs, busybox, lvm and mdadm. busybox is our shell, lvm manipulates logical volumes and mdadm manipulates raid devices.
cd /root/initrd/bin
cp /bin/busybox /sbin/mdadm /sbin/lvm.static ./
mv lvm.static lvm
ln -s busybox cat
ln -s busybox mount
ln -s busybox sh
ln -s busybox sleep
ln -s busybox switch_root
ln -s busybox umount
ln -s lvm vgchange
ln -s lvm vgscan
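Since everything in bin must be statically linked, it is worth checking before going further; ldd reports "not a dynamic executable" for a static binary:

```shell
# run from /root/initrd/bin; each should report "not a dynamic executable"
ldd busybox
ldd mdadm
ldd lvm
```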
dev needs to contain console, null and all the partitions donated to raid sets, plus some directories.
cd /root/initrd/dev
cp -a /dev/null /dev/console /dev/sda1 /dev/sda2 /dev/sda3 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3 ./
mkdir mapper vc
/root/initrd/dev/mapper is empty, /root/initrd/dev/vc contains a relative symlink called 0 (zero) to /dev/console
cd /root/initrd/dev/vc
ln -s ../console 0
etc, newroot, proc and sys are intentionally empty. We still need the init script. Use nano to create /root/initrd/init containing the script below
#!/bin/sh

rescue_shell() {
    echo "Something went wrong. Dropping you to a shell."
    busybox --install -s
    exec /bin/sh
}

mount -t proc none /proc
CMDLINE=`cat /proc/cmdline`
mount -t sysfs none /sys

# wait a little to avoid trailing kernel output
sleep 3

# If you don't have a qwerty keyboard, uncomment the next line
#loadkmap < /etc/kmap-fr

# raid - we don't really need to assemble boot but we are no longer using autodetect
/bin/mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 || rescue_shell
# must assemble md1 as root is on lvm there and it's ver 1.2 metadata
/bin/mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2 || rescue_shell
# may as well assemble the virtual machine space too
/bin/mdadm --assemble /dev/md2 /dev/sda3 /dev/sdb3 /dev/sdc3 || rescue_shell

# If you have a msg, show it:
#cat /etc/msg

#lvm
#/bin/vgscan
# start the host lvm
/bin/vgchange -ay host || rescue_shell
# start the VM lvm
/bin/vgchange -ay vm || rescue_shell

# root filesystem
mount -r /dev/mapper/host-root /newroot || rescue_shell

# unmount pseudo FS
umount /sys
umount /proc

# root switch
exec /bin/busybox switch_root /newroot /sbin/init ${CMDLINE}
As the init script will be run, it must be executable
chmod +x /root/initrd/init
That's all the prep work done, now to assemble everything into a file in /boot
cd /root/initrd
find . | cpio --quiet -o -H newc | gzip -9 > /boot/initramfs
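Before rebooting, the archive can be listed without unpacking it, to confirm that init and the bin, dev and newroot trees all made it in:

```shell
# -t lists the archive contents; expect ./init, ./bin/busybox, ./newroot etc.
zcat /boot/initramfs | cpio --quiet -t
```

Remember to point your boot loader at /boot/initramfs with an initrd line before rebooting, or the kernel will never find it.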
Reboot into your new hardened host install.
Update portage
emerge --sync
Rebuild the toolchain
cd /usr/portage
scripts/bootstrap.sh
Select the hardened compiler.
gcc-config -l
gcc-config 2
Update and rebuild system
Update and rebuild world
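The commands behind those two steps are the usual handbook ones; a sketch, noting that -e (--emptytree) rebuilds every package and takes a long time:

```shell
# rebuild the system set with the newly selected hardened toolchain
emerge -e system
# then rebuild everything else in world
emerge -e world
```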
Add the packages needed to manage VMs
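The package list below is an assumption based on the package.use entries made earlier:

```shell
# the host only strictly needs the hypervisor and libvirt
emerge app-emulation/qemu-kvm app-emulation/libvirt
```

Virtual Machine Manager itself usually runs on your desktop and connects over ssh, so it need not be installed on the host.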
Clear out any rubbish

emerge --depclean -p
eclean -d distfiles
eclean -d packages
Now we have a nice clean lean host install which can be leveraged for a template VM install