Netfinity 4000R (AMI BIOS) Red Hat 6.1 Network Installation Notes

Overview

The purpose of this document is to help you create an installation server to install a large cluster or farm of the AMI BIOS version of the Netfinity 4000Rs.

If all goes well you should be able to power on a headless (network and power cord only) 4000R and 5-20 minutes later (depending on the size of the installation) your 4000R will be running Linux.  What you won't see:  When you power on your 4000Rs they will each get a unique DHCP assigned IP address, use PXE to get global MTFTP information (if necessary), TFTP/MTFTP download a kernel and initrd (RAM disk) image and start a kickstart installation utilizing NFS as the source.  The node name will be assigned by DNS and the time synchronized with NTP.

PXE setup is only necessary if an existing DHCP server is already set up and you do not have the flexibility to modify it (e.g. most customer situations).

Simple changes to the kickstart file will allow you the flexibility to change the installation of all the servers from a central point.

It can be a bit unnerving not having the ability to monitor your installation from the console.  Use tail -f /var/log/messages from your installation server to monitor DHCP, TFTP, and NFS activity.  If you really need to watch you can redirect the kickstart installation through the serial port.
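As a minimal sketch of the monitoring described above, the function below keeps only the log lines that mention the install-related daemons.  The function name and the daemon names in the pattern are assumptions; adjust them to whatever your syslog actually records.

```shell
# Sketch: show only install-related lines from a syslog stream.
# The daemon names in the pattern are assumptions; adjust to your logs.
filter_install_log() {
    grep -E 'dhcpd|tftpd|mountd|nfsd'
}

# Usage on the installation server:
#   tail -f /var/log/messages | filter_install_log
```

This keeps kernel and cron noise out of view while the nodes boot.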

Some files have to be edited.  Use pico -w filename   if you are unfamiliar with any Linux/UNIX-based text editors.

This document assumes that you have my scripts.  If you do not have scripts.tgz you can email me at egan@us.ibm.com and I will send it to you.

I will assume that you are logged in as root.

4000R Server Outline

  1. Attach Network Cables
  2. Setup 4000Rs for Network Installation
  3. Obtain the MAC address

Attach Network Cables

The network cable must be connected to the top NIC.

Setup 4000Rs for Network Installation

  1. Take crash cart to each node
  2. Start up and press DEL
  3. Select Auto Configuration with Optimal Settings
  4. Select Advanced CMOS Setup

    Change:

    Quick Boot Enable
    1st Boot Device NETWORK
    2nd Boot Device CD-ROM
    3rd Boot Device Disable
    Try Other Boot Devices Yes
    BootUp Num-Lock Off

    Press ESC, Save Settings and Exit

  5. At the FIRST PXE-M00: message hold down both shift keys.

    Select 1 - Disable network boot
    Show initialization message (y/n)? Y

  6. You should never have to physically touch that node again.

Obtain the MAC address

Get the MAC addresses.  Also get a list of IP numbers and a label maker.

  1. Start up.
  2. Wait for the MAC address to appear.
  3. Label the front of the server (right above, but not on, the Netfinity 4000R logo :-) with the MAC and IP address of that server.

 

Installation Server Outline

  1. Install Red Hat Linux 6.1
  2. Setup Installation Image and Scripts
  3. Setup DNS
  4. Setup DHCP
  5. Setup PXE (optional)
  6. Setup TFTP/MTFTP
  7. Setup Bpbatch
  8. Setup Kickstart
  9. Setup NFS
  10. Setup NTP
  11. Install Nodes

Install Red Hat Linux 6.1

You will require 4GB of disk storage.  Use the Generic Netfinity Red Hat Linux 6.1 Installation Notes for assistance.

  1. Install everything!  New install, not an upgrade.
  2. When prompted for a hostname enter the fully qualified domain name (e.g. foo.bar.com).
  3. Create a /install filesystem with 1GB during or after installation.

Setup Installation Image and Scripts

  1. Copy the Red Hat Linux 6.1 CD to /install/rh61.

    Insert the Red Hat Linux 6.1 CD and type:

    mount /dev/cdrom
    mkdir /install/rh61

    cd /mnt/cdrom
    find . -print | cpio -dump /install/rh61
    cd /
    umount /mnt/cdrom
    eject cdrom


  2. Install scripts.

    Copy scripts.tgz to /tmp and type:

    mkdir /install/scripts
    cd /install/scripts
    tar zxvf /tmp/scripts.tgz
    rm /tmp/scripts.tgz
    mkdir /install/shared
    cp renamed /usr/local/bin

Setup DNS

Kickstart will fail unless it can perform a reverse name lookup during installation.  If you already have a DNS server running with all the names of your nodes entered then you can skip this step.

Download ftp://ftp.ora.com/pub/examples/nutshell/dnsbind/dns.tar.Z to /tmp and type:

cd /usr/local/bin
tar zxvf /tmp/dns.tar.Z h2n
chmod 755 h2n renamed
chown root.root h2n renamed

Edit /usr/local/bin/renamed and update DOMAIN and NETWORK.

#!/bin/bash

DOMAIN=bar.com
HOSTNAME=$(hostname)

NETWORK=172.16

/etc/rc.d/init.d/named stop

cd /etc
touch db.*
/usr/local/bin/h2n -d $DOMAIN -s $HOSTNAME -n $NETWORK -u root.$HOSTNAME.$DOMAIN

cd /etc
/usr/doc/bind*/named-bootconf/named-bootconf <named.boot >named.conf

/etc/rc.d/init.d/named start

Set DOMAIN and NETWORK to their proper values.  NOTE:  DOMAIN must equal the domain sent from the DHCP server.

Edit /etc/hosts and add all your node IP addresses and names, save & exit, then type:  /usr/local/bin/renamed.   As you add more nodes you will need to update /etc/hosts and execute /usr/local/bin/renamed again.
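For a large farm, typing the /etc/hosts entries by hand is error prone.  The following is a minimal sketch that prints them for sequentially numbered nodes.  The function name and the node1..nodeN naming scheme are assumptions for illustration and are not part of scripts.tgz.

```shell
# Sketch: print /etc/hosts lines for sequentially numbered nodes.
# Arguments: network prefix, first host octet, node count, domain.
# The node1..nodeN naming scheme is an assumption, not from scripts.tgz.
gen_hosts() {
    prefix=$1 first=$2 count=$3 domain=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        n=$((i + 1))
        printf '%s.%d\tnode%d.%s\tnode%d\n' "$prefix" $((first + i)) "$n" "$domain" "$n"
        i=$((i + 1))
    done
}

# Usage: review the output, then append it and rebuild the zone files:
#   gen_hosts 172.16.1 101 16 bar.com >>/etc/hosts
#   /usr/local/bin/renamed
```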

Set DNS to start on boot:

chkconfig --level 0123456 named off
chkconfig --level 345 named on

Set up your installation server as a DNS client.

Sample /etc/resolv.conf file.

search ibm.com
nameserver 172.16.1.1

search is your IP domain name and nameserver is your installation server IP address.
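Because kickstart depends on reverse lookups, it is worth checking both directions before installing nodes.  The sketch below only builds the in-addr.arpa name that a reverse lookup queries, which helps when reading the generated db.* files.  The function name is hypothetical, and the commented usage assumes the host utility is installed.

```shell
# Sketch: build the in-addr.arpa name that a reverse lookup queries.
reverse_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Usage, assuming the `host` utility is installed:
#   host node1            # forward lookup
#   host 172.16.1.101     # reverse lookup
```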

Setup DHCP

This step is only necessary if no DHCP servers exist or if an existing Linux-based DHCP server exists and can be modified.  It is strongly recommended that, regardless of where the DHCP services reside, all nodes (4000Rs) have static addresses.  Imagine trying to find a down server in a farm of 100s of 4000Rs if you only had dynamically assigned IP addresses.

Sample /etc/dhcpd.conf file:

subnet 199.88.179.0 netmask 255.255.255.0 {
#   range 199.88.179.150 199.88.179.160;
    default-lease-time             -1;

    option routers                 199.88.179.1;
    option subnet-mask             255.255.255.0;
    option nis-domain              "";
    option domain-name             "sense.net";
    option domain-name-servers     199.88.179.21;
    option time-offset             -25200;  # seconds from UTC (MST)

    host node1 {
        hardware ethernet 00:d0:a8:00:05:f4;
        fixed-address 199.88.179.201;

        option dhcp-class-identifier "PXEClient";
        filename "/tftpboot/bpbatch";
        next-server 199.88.179.22;
        option dhcp-server-identifier 199.88.179.22;
        option vendor-encapsulated-options 01:04:00:00:00:00:ff;
    }


    host node2 {
        hardware ethernet 00:d0:a8:00:05:f5;
        fixed-address 199.88.179.202;

        option dhcp-class-identifier "PXEClient";
        filename "/tftpboot/bpbatch";
        next-server 199.88.179.22;
        option dhcp-server-identifier 199.88.179.22;
        option vendor-encapsulated-options 01:04:00:00:00:00:ff;
    }
}

Field definitions:

subnet and netmask
    The subnet and netmask statements must match your current settings (check with ifconfig).
range
    range is commented out (#).  As stated earlier it is very undesirable to have dynamically assigned addresses.  I added the range line for documentation purposes only (and for the very lazy :-).
default-lease-time
    -1 denotes no expiration date.  Good for servers.
option routers
    This is your default gateway.
option subnet-mask
    No explanation needed.
option nis-domain
    There is a good chance that you will want to set up NIS to manage your cluster/farm.  (optional.)  NIS domains are similar to NT domains (or is it the other way around :-).
option domain-name
    TCP/IP domain name.
option domain-name-servers
    DNS Servers.
option time-offset
    Offset from UTC in seconds:  -28800 for PST, -25200 for MST, -21600 for CST, and -18000 for EST.
host hostname {}
    Single host definition.
hardware ethernet
    The MAC address of the node (4000R).
fixed-address
    The fixed IP address of the node (4000R).
option dhcp-class-identifier
    Must be "PXEClient".
filename
    Must be "/tftpboot/bpbatch".
next-server
    IP address of TFTP Server.  Usually the same as your installation server.
option dhcp-server-identifier
    IP address of TFTP Server.  Usually the same as your installation server.
option vendor-encapsulated-options
    Must be 01:04:00:00:00:00:ff
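
DHCP option 2 (time-offset) is defined in RFC 2132 as a number of seconds from UTC, so a whole-hour offset needs converting before it goes into dhcpd.conf.  A minimal sketch, with a hypothetical function name:

```shell
# Sketch: convert a whole-hour UTC offset into the seconds value
# that the DHCP time-offset option carries.
tz_offset_seconds() {
    echo $(( $1 * 3600 ))
}

# Usage:
#   tz_offset_seconds -7     # MST
```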

Within your subnet range create an entry for each node (4000R).  The DHCP parameters listed as part of the host definition (i.e. dhcp-class-identifier, filename, next-server, etc...) should not be global.  Other non-4000R devices (desktops, other PXE devices, X-terminals, IBM Network Stations, etc...) may also use some or all of those options.
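
With many nodes, the host entries can be generated from the MAC/IP list collected earlier.  The following is only a sketch:  it reads "name MAC IP" lines on stdin and prints host blocks in the same shape as the sample above.  The function name and the TFTP_SERVER variable are hypothetical; the default address is the one used in the sample.

```shell
# Sketch: turn "name MAC IP" lines on stdin into dhcpd.conf host blocks.
# TFTP_SERVER is a hypothetical variable; the default matches the sample.
gen_dhcp_hosts() {
    server=${TFTP_SERVER:-199.88.179.22}
    while read -r name mac ip; do
        [ -n "$name" ] || continue
        cat <<EOF
    host $name {
        hardware ethernet $mac;
        fixed-address $ip;

        option dhcp-class-identifier "PXEClient";
        filename "/tftpboot/bpbatch";
        next-server $server;
        option dhcp-server-identifier $server;
        option vendor-encapsulated-options 01:04:00:00:00:00:ff;
    }

EOF
    done
}

# Usage: paste the output inside the subnet { } block of /etc/dhcpd.conf:
#   gen_dhcp_hosts <nodes.txt
```

Here nodes.txt is a hypothetical file with one node per line.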

After you have created your /etc/dhcpd.conf file type:

touch /var/state/dhcp/dhcpd.leases
/etc/rc.d/init.d/dhcpd start

Set DHCP to start on boot:

chkconfig --level 0123456 dhcpd off
chkconfig --level 345 dhcpd on

To apply future changes to /etc/dhcpd.conf type:

/etc/rc.d/init.d/dhcpd restart

Test:  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should attempt to TFTP download a file and eventually get a timeout or file not found error.  Verify that the other fields are correct.

Setup PXE

Preboot eXecution Environment (PXE) is an Intel Wired-for-Management (WFM) specification for installing over the network.  If it sounds like LCCM--that's because it is.  So, why don't we use LCCM?  Because LCCM is coded to work only with IBM desktops.

A PXE server is only required if you were unable to set up a DHCP server as described in the Setup DHCP section of this document.

If for any reason you require DHCP and PXE services on the same server (not recommended) then you must have the /etc/dhcpd.conf option dhcp-class-identifier set to PXEClient and change the UseDHCPPort setting in /etc/pxe.conf to 0 (zero).

Most DHCP servers are configured to pump out IP addresses and a few other relevant pieces of IP-based information.  Most organizations are not willing to or do not have the authority to change the DHCP server to add the additional information PXE-enabled devices require.  Some DHCP services (e.g. NetWare) are just inadequate.

When a PXE-enabled server/desktop receives an IP address but does not receive DHCP vendor encapsulated options it will broadcast for a PXE server to get that additional information.  This section discusses how to set up Linux-based PXE services.

First verify that you installed the Linux PXE service:

rpm -qa | grep pxe

If pxe-0.1-9 did not display, then you didn't install everything or you upgraded from an earlier version of Red Hat Linux.   This document assumes a full fresh install of Red Hat Linux 6.1.

Edit /etc/pxe.conf and make the following changes:

  1. From:

    10,Press F8 to view menu ...

    To:

    0,Press F8 to view menu ...

  2. From:

    [X86PC/UNDI/MENU]
    0, Local Boot
    13, Remote Install Linux

    To:

    [X86PC/UNDI/MENU]
    13, Remote Install Linux
    0, Local Boot

  3. From:

    [X86PC/UNDI/linux-install/ImageFile_Name]
    0
    2
    linux

    To:

    [X86PC/UNDI/linux-install/ImageFile_Name]
    0
    0
    bpbatch

Edit /etc/rc.d/rc.local and append:

route add -host 255.255.255.255 eth0
route add -net 224.0.0.0 netmask 224.0.0.0 eth0

Set PXE to start on boot:

chkconfig --level 0123456 pxe off
chkconfig --level 345 pxe on

Test:  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should attempt to TFTP download a file and eventually get a timeout or file not found error.  Verify that the other fields are correct.

Setup TFTP/MTFTP

Edit /etc/inetd.conf and uncomment (remove the #) from the following line and append /tftpboot

#tftp    dgram    udp     wait    root    /usr/sbin/tcpd     in.tftpd    /tftpboot

MTFTP is only required for PXE.  If you did not setup PXE, then do not setup MTFTP.  Append after the previous line:

mtftp    dgram    udp     wait    root    /usr/sbin/tcpd     in.mtftpd   /tftpboot

The resulting changes should be:

tftp     dgram    udp     wait    root    /usr/sbin/tcpd     in.tftpd    /tftpboot
mtftp    dgram    udp     wait    root    /usr/sbin/tcpd     in.mtftpd   /tftpboot

Append to the end of /etc/services:

mtftp    1759/udp
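
If you rerun these steps, appending blindly will duplicate the line.  The sketch below appends a line only when that exact line is not already present.  The function name is hypothetical.

```shell
# Sketch: append a line to a file only if that exact line is not
# already there, so the step can be repeated safely.
append_once() {
    line=$1 file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >>"$file"
}

# Usage:
#   append_once 'mtftp    1759/udp' /etc/services
```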

Edit /etc/mtftpd.conf and make the following change:

    From:

    X86PC/UNDI/linux-install/linux.0

    To:

    X86PC/UNDI/linux-install/bpbatch.0

Then type:

/etc/rc.d/init.d/inet restart

Test:  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should attempt to TFTP download a file and eventually get a file not found error.  Verify that the other fields are correct.

Setup Bpbatch

Bpbatch is a very powerful network boot loader.  For more information go to http://www.bpbatch.org/.

Download http://www.bpbatch.org/downloads/bpb-exe.tar.gz and save to /tmp, then type:

cd /tftpboot
tar zxvf /tmp/bpb-exe.tar.gz "bpbatch.*"

mv -f bpbatch.P bpbatch
chown root.root *
chmod 644 bpbatch*

If PXE services are enabled then type:

cd /tftpboot/X86PC/UNDI/linux-install
tar zxvf /tmp/bpb-exe.tar.gz "bpbatch.*"

mv -f bpbatch.P bpbatch.0
chown root.root *
chmod 644 bpbatch*

The script bpbatch.bpb defines the behavior of bpbatch, the network boot loader.  Ignore the CacheNever variable for now.  The first test checks for a new disk, the second checks for an incomplete install, and the third and fourth check for a reinstall triggered by the file /boot/install or by DHCP option-131 set to "install".  If an installation is to be performed, the linuxboot command is called to start a network kickstart installation, and kickstart then reboots the computer.  If the installation is skipped, the local O/S boots.

The files /boot/install and /boot/install_complete are empty files that function as flags.  They may reside on each individual node, but not on the installation server.  /boot/install_complete is created after the kickstart installation is successful.  /boot/install can be created at any time while the node is up to instruct it to reinstall on next boot.   It is very important that /boot be your first filesystem.  If / is your first filesystem, then /boot/install_complete will need to be /install_complete and you will have to edit ks.cfg to reflect that change as well.
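
The order of the tests can be sketched in shell as follows.  This is not bpbatch syntax and not the contents of bpbatch.bpb; it only mirrors the decision order described above.  The function name is hypothetical, and treating a missing directory as a "new disk" is an assumption made for illustration.

```shell
# Sketch of the decision order described above, in shell rather than
# bpbatch syntax.  $1 is the directory holding the flag files (/boot on
# a node), $2 is the value of DHCP option-131 (may be empty).
# "New disk" is approximated by a missing directory; that is an assumption.
install_reason() {
    bootdir=$1 opt131=$2
    if [ ! -d "$bootdir" ]; then
        echo "new disk"
    elif [ ! -e "$bootdir/install_complete" ]; then
        echo "incomplete install"
    elif [ -e "$bootdir/install" ]; then
        echo "reinstall flag"
    elif [ "$opt131" = "install" ]; then
        echo "reinstall requested by DHCP"
    else
        echo "boot local"
    fi
}

# Usage on a running node, to request a reinstall on next boot:
#   touch /boot/install
```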

Changes to bpbatch.bpb should be made in /install/scripts, then invoke /install/scripts/mkks to install.

Test:  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should TFTP download and launch bpbatch.  bpbatch should return a file not found error.

Setup Kickstart

Kickstart is Red Hat's automated installation process.  For kickstart to work properly on the 4000R we have to apply a patch to make a few changes.

Patched files and patch descriptions:

misc/src/anaconda/loader/loader.c
    Kickstart hardcodes the network installation NIC to eth0.  Unfortunately the 4000R will not network boot on eth0, but uses eth1 instead.  This patch hardcodes eth1.
misc/src/anaconda/loader/init.c
    The serial ks option no longer works.  This patch forces serial output.
RedHat/instimage/usr/lib/python1.5/site-packages/mouse.py
    Mouse probing conflicts with the serial output.  This patch cancels probing for the mouse.

Type:

cd /install/rh61
patch -p0 </install/scripts/ks.patch

cd misc/src/anaconda
make
cd loader
make loader init
strip loader init
cp loader init /install/scripts
cd /install/rh61/misc/src/trees
cp initrd-network.img /install/scripts
cp boot/vmlinuz /tftpboot

cd /install/scripts

ks.cfg is the kickstart configuration file.  You will need to edit it.

  1. In the nfs section change --server to equal the IP address or name of your installation server.
  2. You may want to change the partitioning.
  3. Select the correct timezone.
  4. Select a root password.
  5. Set the MIP variable in the %post section to the address of the installation server.
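
The first edit in the list above can be scripted.  The following is a sketch only:  it assumes the nfs line in ks.cfg has the form "nfs --server VALUE --dir PATH", and the function name is hypothetical.  Check the result by eye before relying on it.

```shell
# Sketch: replace the value after --server in a kickstart file.
# Assumes the nfs line has the form "nfs --server VALUE --dir PATH".
set_ks_server() {
    file=$1 server=$2
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy back over the original
    # so the original file's permissions are kept.
    sed "s/--server  *[^ ]*/--server $server/" "$file" >"$tmp" && cat "$tmp" >"$file"
    rm -f "$tmp"
}

# Usage:
#   set_ks_server /install/scripts/ks.cfg 172.16.1.1
#   /install/scripts/mkks
```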

Any time you make changes to /install/scripts/ks.cfg you will need to re-invoke /install/scripts/mkks to install the changes.

Kickstart is a great tool with lots of features.  For more information read the kickstart how-to at http://wwwcache.ja.net/dev/kickstart.

Test:  Type:  rm /tftpboot/serial.  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should TFTP download and launch bpbatch.  bpbatch should linux boot and start kickstart.  Kickstart should stall because NFS is not setup.

Setup NFS

Kickstart uses NFS for network installations.

Edit /etc/exports and add:

/install/rh61 *(ro,no_root_squash)
/install/shared *(ro,no_root_squash)


Save & exit.  Then type:

/etc/rc.d/init.d/nfs start

Set NFS to start on boot:

chkconfig --level 0123456 nfs off
chkconfig --level 345 nfs on

Any future changes to /etc/exports will require that you invoke: exportfs -a.

Test:  Type:  rm /tftpboot/serial.  Boot a 4000R with a console attached.  You should receive an IP address from your installation server (validate with DHCP IP), and the 4000R should TFTP download and launch bpbatch.  bpbatch should linux boot, start kickstart, and complete an installation.   At this point the node clock may still be out of sync with the installation server because NTP is not yet set up.

Setup NTP

NTP will sync your node clocks with the installation server.

Type:

/etc/rc.d/init.d/xntpd start

Set NTP to start on boot:

chkconfig --level 0123456 xntpd off
chkconfig --level 345 xntpd on

Test:  Verify after installation that your clocks are in sync.

Install Nodes

Type:

/install/scripts/mkks

To monitor the installation through the serial port type:

echo "on" >/tftpboot/serial

To monitor the installation on the console press "C" at the console when prompted or type:

rm -f /tftpboot/serial
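
The two commands above can be wrapped in one helper so the choice is explicit.  This is a sketch; the function name and the optional directory argument are assumptions added for illustration.

```shell
# Sketch: choose where the installer output goes by toggling the flag file.
# $1 is "serial" or "console"; $2 is the TFTP root (defaults to /tftpboot).
set_monitor() {
    mode=$1 root=${2:-/tftpboot}
    case $mode in
        serial)  echo "on" >"$root/serial" ;;
        console) rm -f "$root/serial" ;;
        *)       echo "usage: set_monitor serial|console [dir]" >&2; return 1 ;;
    esac
}

# Usage:
#   set_monitor serial
#   set_monitor console
```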

Power on all the nodes.

egan@us.ibm.com