Starting and stopping the system
Network booting
Network booting is not a new idea. It was the original reason for Sun's Network File System, which we looked at in Chapters 24 and 25. Nowadays people normally use NFS for additional shared file systems; in the case of net booting, you mount your own private NFS file system as your root file system. Clearly, the first thing you need to do is to create this file system.
Next, you need to find a way to boot the system. There are a few possibilities here:
- You can boot a minimal system from floppy disk or CD-ROM and use this to mount the file systems remotely. This is different from running the system from floppy or CD-ROM: in this case, the disk device serves effectively as a bootstrap, and the operating system is located elsewhere.
- You can create a boot PROM for your network card and use that to boot.
- You can use PXE if your card supports it.
Whichever method you use, you need to set up a network interface very early. In Chapter 17 we saw that the network setup is part of the system initialization, and that the configuration is stored in /etc/rc.conf. For a network boot, the network must be running before the kernel can be loaded, so that method won't work here. Instead, we use DHCP, which we looked at on page 302. We could also use the bootpd daemon, but it's more limited, so it's better to use DHCP.
If you use floppy or CD-ROM, you could theoretically load the bootstrap from that device. This isn't the same as the alternative we'll see on page 549, where we load the kernel from floppy or CD: here we only load the bootstrap and then load the kernel from the network. This minor difference has significant implications on the ease of system administration.
The next step is to actually transfer the data. We do this with TFTP, the Trivial File Transfer Protocol. As the name suggests, TFTP is a relatively simple replacement for FTP. In particular, it knows almost nothing about security. If you use TFTP, make sure that it can't be accessed from outside your network, for example by using a firewall. The default firewall rules block TFTP.
In the following sections we'll look at the example of setting up bumble.example.org as a diskless machine.
Setting up the file system
There are a number of ways to put the files on the NFS server:
- You might copy the files in the root and /usr file systems of the server machine.
- You could install FreeBSD on a separate disk and NFS mount it where the remote system can access it. By itself, this doesn't have much of an advantage over having a local disk on the machine, but it's possible to install a number of systems on a single disk and have different machines access the different installations.
- You could combine those two methods and copy a freshly installed system to a file system where you need it.
We'll look at refining this technique after the system is up and running.
Building a diskless kernel
You still need to build a special kernel for diskless workstations. The following entries in the configuration file are relevant:
# Kernel BOOTP support
options BOOTP                 # Use BOOTP to obtain IP address/hostname
options BOOTP_NFSROOT         # NFS mount root filesystem using BOOTP info
options BOOTP_NFSV3           # Use NFS v3 to NFS mount root
options BOOTP_COMPAT          # Workaround for broken bootp daemons.
options BOOTP_WIRED_TO=fxp0   # Use interface fxp0 for BOOTP
Only the first two are required. If you use BOOTP_WIRED_TO, make sure that the interface name matches the network card you are using.
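Before building, it can be worth confirming that the two required options really are in the configuration file. The following is a minimal sketch, not a FreeBSD tool: check_diskless_options is a hypothetical helper name, and it assumes the options appear one per line in the usual "options NAME" form.

```sh
#!/bin/sh
# Sketch: verify that a kernel configuration file enables the two
# options the text describes as required for a diskless kernel.
# check_diskless_options is a hypothetical helper, not part of FreeBSD.

check_diskless_options() {
    conf=$1
    missing=""
    for opt in BOOTP BOOTP_NFSROOT; do
        # Match an uncommented "options NAME" line. The option name must be
        # followed by white space, a comment or end of line, so that BOOTP
        # does not match BOOTP_NFSROOT or BOOTP_WIRED_TO=fxp0.
        if ! grep -Eq "^[[:space:]]*options[[:space:]]+${opt}([[:space:]]|#|\$)" "$conf"; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

You might call it as check_diskless_options followed by the path of your kernel configuration file; it prints ok, or the names of the options it could not find.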
Build the kernel, as described on page 617. To install, you need to set the DESTDIR variable to specify the directory in which you want to install the kernel:
# make install DESTDIR=/src/nodisk/bumbl
Configuring TFTP
Next we need to set up TFTP to deliver the kernel to the system. The first question is whether the firmware on the Ethernet card can load the kernel directly or not. Some boot PROMs run in 16-bit 8086 mode, which limits their addressing capability to 640 kB. That's too small for any FreeBSD kernel, and if you try to load the kernel directly you'll get a message like this:
File transfer error: Image file too large for low memory.
In this case, you'll need to load a loader, such as pxeboot.
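The size comparison behind this decision is simple enough to sketch. The example below assumes the 640 kB figure mentioned above as the limit; a real boot PROM may leave even less room, so treat the result as a rough guide only. needs_loader is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: compare a kernel image with the 640 kB low-memory limit
# described in the text. needs_loader is a hypothetical helper.

LOW_MEMORY_LIMIT=$((640 * 1024))   # 640 kB expressed in bytes

needs_loader() {
    # $1: path of the kernel image.
    # The arithmetic expansion strips any padding that wc adds.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt "$LOW_MEMORY_LIMIT" ]; then
        echo "use a loader such as pxeboot"
    else
        echo "direct load may be possible"
    fi
}
```

For the example system you might run needs_loader /src/nodisk/bumble/boot/kernel/kernel.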
As a minor concession to security, the tftpd daemon refuses to access files outside its data directory hierarchy, which by convention is called /tftpboot. You can use symbolic links, however. It makes more sense to have the kernel in the same place as on machines with disks, namely in /boot/kernel/kernel on the root file system, so we create symbolic links:
# mkdir /tftpboot
# ln -s /src/nodisk/bumble/boot/kernel/kernel /tftpboot/kernel.bumble
# ln -s /boot/pxeboot /tftpboot/pxeboo
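Because the TFTP directory consists largely of symbolic links, a link whose target has moved or was mistyped will only show up as a failed boot. As a hedged sketch, the hypothetical helper dangling_links below lists links under a directory that do not resolve; it relies only on the standard find and test utilities.

```sh
#!/bin/sh
# Sketch: report symbolic links whose targets do not exist.
# dangling_links is a hypothetical helper name.

dangling_links() {
    # $1: directory to search, for example /tftpboot.
    # test -e follows the link, so it fails for a link with no target.
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Running dangling_links /tftpboot should print nothing when every link points to an existing file.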
We also need to ensure that we can start the TFTP daemon, tftpd. Unless you're constantly booting, there's no need to have it running constantly: just enable it in /etc/inetd.conf, which has the following entries in the distribution file:
#tftp dgram udp wait root /usr/libexec/tftpd tftpd -s /tftpboot
#tftp dgram udp6 wait root /usr/libexec/tftpd tftpd -s /tftpboot
These are entries for IPv4 and IPv6 respectively. We enable tftpd by uncommenting the first line (removing the # character) and sending a HUP signal to inetd:
# killall -1 inetd send a SIGHUP
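If you prefer to make the change with a filter rather than an editor, the uncommenting step can be sketched with sed. This is only an illustration: enable_tftp_ipv4 is a hypothetical function name, and it assumes the entry has the field layout shown above. It writes the modified file to standard output, so the original is not touched.

```sh
#!/bin/sh
# Sketch: uncomment the IPv4 tftp entry of an inetd.conf read from
# standard input. enable_tftp_ipv4 is a hypothetical helper name.

enable_tftp_ipv4() {
    # Remove the leading "#" only where the protocol field is exactly
    # "udp"; the udp6 (IPv6) entry stays commented out.
    sed -E '/^#tftp[[:space:]]+dgram[[:space:]]+udp[[:space:]]/s/^#//'
}
```

A cautious way to use it is to run enable_tftp_ipv4 < /etc/inetd.conf > /tmp/inetd.conf.new, inspect the result, and only then install it and signal inetd.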
Configuring DHCP
We already looked at dhcpd's configuration file /usr/local/etc/dhcpd.conf on page 302. In addition to the information we looked at there, we need to know what file to load, which system to load it from, and where the root file system is located. For our diskless system bumble we might add the text in bold to the configuration we saw on page 303:
subnet 223.147.37.0 netmask 255.255.255.0 {
  range 223.147.37.90 223.147.37.110;
  option domain-name-servers freebie.example.com, presto.example.com;
  option domain-name "example.com";
  option routers gw.example.com;
  option subnet-mask 255.255.255.0;
  option broadcast-address 223.147.37.255;
  default-lease-time 86400;
  max-lease-time 259200;
  host sydney {
    hardware ethernet 0:50:da:cf:7:35;
  }
  host bumble {
    hardware ethernet 0:50:da:cf:17:d3;
    next-server presto.example.com;             # only if on a different machine
    filename "/tftpboot/bumble/kernel.bumble";  # for direct booting
    filename "/tftpboot/pxeboot";               # for PXE
    option root-path "223.147.37.1:/src/nodisk/bumble";
  }
}
There are a few things to note here:
- The next-server line tells where the TFTP server is located. If it's the same as the machine running the DHCP server, you don't need this specification.
- As we've seen, hardware restrictions may make it impossible to load the kernel directly. In this case you need to load a loader. The only one that FreeBSD currently supplies is pxeboot (see http://www.freebsd.org/doc/en_US.ISO8859-1/articles/pxe/index.html for documentation on setting up pxeboot on FreeBSD). Choose one of the two filename lines.
- You have to specify the root path as an IP address, because no name services are available when the root file system is mounted.
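If you have several diskless machines, it may be convenient to generate the host entries rather than type them. The following sketch emits one entry and refuses a root path whose server is not a numeric IPv4 address, reflecting the last point above. The function names is_ipv4 and dhcp_host_entry are hypothetical, and the entry contains only the statements discussed in this section.

```sh
#!/bin/sh
# Sketch: generate a dhcpd.conf host entry for a diskless machine.
# is_ipv4 and dhcp_host_entry are hypothetical helper names.

is_ipv4() {
    # Succeed only for a dotted quad with every octet in the range 0..255.
    echo "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++)
              if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }'
}

dhcp_host_entry() {
    # $1 host name, $2 Ethernet address, $3 boot file,
    # $4 IP address of the NFS server, $5 path of the root file system
    if ! is_ipv4 "$4"; then
        echo "root-path server must be an IP address: $4" >&2
        return 1
    fi
    cat <<EOF
host $1 {
  hardware ethernet $2;
  filename "$3";
  option root-path "$4:$5";
}
EOF
}
```

For example, dhcp_host_entry bumble 0:50:da:cf:17:d3 /tftpboot/pxeboot 223.147.37.1 /src/nodisk/bumble prints an entry for PXE booting, while passing presto.example.com as the fourth argument produces an error instead.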
Other Ethernet bootstraps
If your Ethernet card doesn't have a boot ROM, you can make one with the net/etherboot port, or you can copy the necessary information to a floppy disk or CD-R and use that to start the bootstrap. In either case, you first build the port and then copy the data to your selected medium. For example, to create a boot disk for a Compex RL2000 card, a 10 Mb/s PCI NE-2000 clone, you first look up the card in /usr/ports/net/etherboot/work/etherboot-5.0.5/src/NIC, where you read:
#Compex RL2000
compexrl2000 ns8390 0x11f6,0x1401
This information is mainly for the build process; all you need from it is compexrl2000, the name of the driver. Build the port and write the result to floppy:
# cd /usr/ports/net/etherboot
# make all
# cd work/ether*/src
# cat bin/boot1a.bin bin32/compexrl2000.lzrom > /dev/fd0
bin/boot1a.bin is a disk bootstrap intended to load and start compexrl2000.lzrom. You can also put compexrl2000.lzrom in an EPROM. This requires a little more care, and the information is subject to change. You can find detailed information about how to proceed at the web site http://etherboot.sourceforge.net/doc/html/documentation.html.
etherboot uses NFS, not TFTP. As a result, things change: you can use absolute path names and you can't use symbolic links. An entry in dhcpd.conf for this method might look like this:
host bumble {
  hardware ethernet 00:80:48:e6:a0:61;
  filename "/src/nodisk/bumble/boot/kernel/kernel";
  fixed-address bumble.example.org;
  option root-path "192.109.197.82:/src/nodisk/bumble";
}
When booting in this manner, you don't see any boot messages. The boot loader outputs several screens full of periods, each indicating a downloaded block. It finishes like this:
.........................done
After that, nothing appears on the screen for quite some time. In fact, the boot is proceeding normally, and the next thing you see is a login prompt.
Configuring the machine
Setting up a diskless machine is not too difficult, but there are some gotchas:
- Currently, locking across NFS does not work properly. As a result, you may see messages like this:
Dec 11 14:18:50 bumble sm-mta[141]: NOQUEUE: SYSERR(root): cannot flock(/var/run/sendmail.pid, fd=6, type=2, omode=40001, euid=0): Operation not supported
One solution to this problem is to mount /var as an MD (memory) file system. This is what currently happens by default, though it's subject to change: at startup, when the system detects that it is running diskless (via the sysctl vfs.nfs.diskless_valid), it invokes the configuration file /etc/rc.diskless1. This file in turn causes the file /etc/rc.diskless2 to be invoked later in the startup procedure. Each of these files adds an MD file system. In the course of time, this will be phased out and replaced by the traditional configuration via /etc/fstab, but at the moment this file has no provision for creating MD file systems.
You should probably look at these files carefully: they may need some tailoring to your requirements.
- It is currently not possible to add swap on an NFS file system. swapon (usually invoked from the startup scripts) reports, incorrectly:
Dec 11 14:18:46 bumble savecore: 192.109.197.82:/src/nodisk/swap/bumble: No such file or directory
This, too, will change; in the meantime, it is possible to mount swap on files, even if they are NFS mounted, but not on the NFS file system itself. This means that the first of the following entries in /etc/fstab will not work, but the second will:
192.109.197.82:/src/nodisk/swap/bumble none swap sw 0 0
/src/nodisk/swap/bumble none swap sw 0 0
echunga:/src /src nfs rw 0
The reason here is the third line: /src/nodisk/swap/bumble is NFS mounted, so this is a swap-to-file situation. For this to work, you may have to add the following line at the end of your /etc/rc.diskless2:
swapon -a
This is because the standard system startup mounts swap before mounting additional NFS file systems. If you place the swap file on the root file system, it will still work, but frequently you will want the root file system to be read-only to be able to share it between several machines.
- If the machine panics, it's not possible to take a dump, because you have no disk. The only alternative would be a kernel debugger.
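The swap restriction described in the list above is easy to overlook in a long /etc/fstab. As a rough sketch, the hypothetical helper classify_swap_entries below reads fstab lines and marks each swap entry according to whether its first field is an NFS specification (host:/path), which the text says does not currently work, or a plain path. It looks only at the form of the entry and does not check whether the path is in fact on an NFS-mounted file system.

```sh
#!/bin/sh
# Sketch: classify the swap entries of an fstab read from standard input.
# classify_swap_entries is a hypothetical helper name.

classify_swap_entries() {
    awk '
        # Skip comments and lines too short to be fstab entries.
        /^[[:space:]]*#/ || NF < 3 { next }
        $3 == "swap" {
            # A first field of the form host:/path is an NFS specification.
            if ($1 ~ "^[^/]+:/")
                print $1 ": NFS spec, not usable as swap"
            else
                print $1 ": file or device path"
        }'
}
```

You might run it as classify_swap_entries < /etc/fstab on the diskless machine.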
Sharing system files between multiple machines
In many cases, you may have a number of machines that you want to run diskless. If you have enough disk (one image for each machine), you don't have anything to worry about, but often it may be attractive to share the system files between them. There are a lot of things to consider here:
- Obviously, any changeable data specific to a system can't be shared.
- To ensure that things don't change, you should mount shared resources read-only.
- Refer to Table 32-1 for an overview of FreeBSD installed directories. Of these directories, only /etc and /usr/local/etc must be specific for a particular system, though there are some other issues:
- Installing ports, for example, will install ports for all systems. That's not necessarily a bad thing, but if you have two systems both installing software in the same directory, you can expect conflicts. It's better to designate one system, possibly the host with the disk, to perform these functions.
- If you share /boot and make some configuration changes, the options will apply to all systems.
- When building system software, you can use the same /usr/src and /usr/obj directories as long as all systems maintain the same release of FreeBSD. You can even have different kernels: each kernel build directory carries the name of the configuration file, which by convention matches the name of the system.
The big problem is /etc. In particular, /etc/rc.conf contains information like the system name. One way to handle this is to have a separate /etc directory for each system. This may seem reasonable, because /etc is only about 1.5 MB in size. In fact, it implies a separate copy of the entire root file system, including the other top-level directories, and that means more like 60 MB.
Disk substitute
The other alternative to network booting is to find a local substitute for the disk. This is obviously the only choice for a stand-alone machine. There are a number of possibilities:
- For really small systems, you can use PicoBSD, a special small version of FreeBSD that fits on a single floppy disk. It requires a fair amount of memory as RAM disk, and obviously it's very limited.
PicoBSD is good for some special applications. As the FreeBSD kernel grows, it's becoming more and more difficult to get even the kernel onto a single floppy, let alone any application software. Still, you can find a number of different configurations in the source tree in /usr/src/release/picobsd. Be prepared for some serious configuration work.
- Alternatively, you can boot from CD-R or CD-ROM. In this case, you can have up to 700 MB of data, enough for a number of applications. It's possible to run programs directly from the CD, but there's little advantage to having files on CD instead of on disk. The most likely application for this alternative is for systems where the reliability of rotating media is insufficient, where the CD is used only for booting, and after that the system runs from RAM disk.
- Yet another alternative is Flash memory, often abbreviated simply as Flash, which we looked at in Chapter 8, on page 159. Flash is available in sizes up to several hundred megabytes, and Compact Flash cards look like disks to their interface. They don't fit IDE connectors, but adapters are available.
Flash memory is intended mainly for reading. It is much slower to write than to read, and it can only take a certain number of write cycles before it fails. Clearly it's a candidate for read-only file systems.