xCAT Mini HOWTO (WIP)
This document is for xCAT 1.2.0.
x86 (i386, i486, i586, i686) supported distributions:
- Red Hat 7.2
- Red Hat 7.3
- Red Hat 8.0
- Red Hat 9
- Red Hat Enterprise Linux AS 2.1
- Red Hat Enterprise Linux AS 2.1 U2
- Red Hat Enterprise Linux AS 2.1 U3
- Red Hat Enterprise Linux ES 2.1*
- Red Hat Enterprise Linux WS 2.1*
- Red Hat Enterprise Linux AS 3
- Red Hat Enterprise Linux ES 3*
- Red Hat Enterprise Linux WS 3*
- Red Hat Enterprise Linux AS 3 U1
- Red Hat Enterprise Linux ES 3 U1*
- Red Hat Enterprise Linux WS 3 U1*
- Red Hat Enterprise Linux AS 3 U2
- Red Hat Enterprise Linux ES 3 U2*
- Red Hat Enterprise Linux WS 3 U2*
- Red Hat Enterprise Linux AS 3 U3
- Red Hat Enterprise Linux ES 3 U3*
- Red Hat Enterprise Linux WS 3 U3*
- Red Hat Enterprise Linux AS 3 U4*
- Red Hat Enterprise Linux ES 3 U4*
- Red Hat Enterprise Linux WS 3 U4*
- Red Hat Enterprise Linux AS 4*
- Red Hat Enterprise Linux ES 4*
- Red Hat Enterprise Linux WS 4*
- Red Hat Fedora Core 1*
- Red Hat Fedora Core 2*
- Red Hat Fedora Core 3*
- CentOS 3.3 (Treat as RHAS3U3)
- CentOS 3.4 (Treat as RHAS3U4) (CD and DVD)
- SuSE 8.1*
- SuSE 8.2*
- SuSE 9.0*
- SuSE 9.1*
- SuSE 9.2* (DVD Version only; non-DVD media is missing KSH; 32-bit EM64T & Opteron Tested)
- SuSE SLES8
- SuSE SLES8 SP1
- SuSE SLES8 SP2a
- SuSE SLES8 SP3
- SuSE SLES9
- SuSE SLES9 SP1
- SystemImager
- Partimage
x86_64 (Opteron and EM64T) supported distributions:
- Red Hat Enterprise Linux AS 3*
- Red Hat Enterprise Linux ES 3*
- Red Hat Enterprise Linux WS 3*
- Red Hat Enterprise Linux AS 3 U1*
- Red Hat Enterprise Linux WS 3 U1*
- Red Hat Enterprise Linux AS 3 U2*
- Red Hat Enterprise Linux ES 3 U2*
- Red Hat Enterprise Linux WS 3 U2*
- Red Hat Enterprise Linux AS 3 U3* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux ES 3 U3* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux WS 3 U3* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux AS 3 U4* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux ES 3 U4* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux WS 3 U4* (64-bit EM64T & Opteron Tested)
- Red Hat Enterprise Linux AS 4*
- Red Hat Enterprise Linux ES 4*
- Red Hat Enterprise Linux WS 4*
- Red Hat Fedora Core 1*
- Red Hat Fedora Core 2*
- Red Hat Fedora Core 3* (64-bit EM64T & Opteron Tested)
- CentOS 3.3 (Treat as RHAS3U3) (64-bit EM64T & Opteron Tested)
- CentOS 3.4 (Treat as RHAS3U4) (64-bit EM64T & Opteron Tested) (CD and DVD)
- SuSE 9.0*
- SuSE 9.1*
- SuSE 9.2* (DVD Version only, 64-bit EM64T & Opteron Tested)
- SuSE SLES8
- SuSE SLES8 SP2
- SuSE SLES8 SP3
- SuSE SLES9 (64-bit EM64T & Opteron Tested)
- SuSE SLES9 SP1 (64-bit EM64T & Opteron Tested)
- SystemImager
- Partimage
IA64 (Itanium 1 and 2) supported distributions:
- Red Hat 7.2
- Red Hat Enterprise Linux AS 2.1 U2*
- Red Hat Enterprise Linux AS 3*
- Red Hat Enterprise Linux ES 3*
- Red Hat Enterprise Linux WS 3*
- Red Hat Enterprise Linux AS 3 U1*
- Red Hat Enterprise Linux WS 3 U1*
- Red Hat Enterprise Linux AS 3 U2*
- Red Hat Enterprise Linux ES 3 U2*
- Red Hat Enterprise Linux WS 3 U2*
- Red Hat Enterprise Linux AS 3 U3*
- Red Hat Enterprise Linux ES 3 U3*
- Red Hat Enterprise Linux WS 3 U3*
- Red Hat Enterprise Linux AS 3 U4*
- Red Hat Enterprise Linux ES 3 U4*
- Red Hat Enterprise Linux WS 3 U4*
- Red Hat Enterprise Linux AS 4*
- Red Hat Enterprise Linux ES 4*
- Red Hat Enterprise Linux WS 4*
- SuSE SLES8
- SuSE SLES8 SP2
- SuSE SLES8 SP3
- SuSE SLES9*
- SuSE SLES9 SP1*
PPC64 (IBM JS20 only) supported distributions:
- Red Hat Enterprise Linux AS 3 U2*
- Red Hat Enterprise Linux AS 3 U3*
- Red Hat Enterprise Linux AS 3 U4*
- Red Hat Enterprise Linux AS 4*
- Red Hat Enterprise Linux ES 4*
- Red Hat Enterprise Linux WS 4*
- SuSE SLES8 SP3aa*
- SuSE SLES9*
- SuSE SLES9 SP1*
* Node install tested only; however, it should also work as a management node.
This HOWTO is for xCAT experts. Please be very familiar with the
xCAT 1.1.0 Redbook.
NOTE: The term noderange refers to xCAT's internal facility to perform
an operation on a range of nodes. Please read the noderange.1 man page
for details.
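For illustration, a few common noderange forms (node and group names
here are hypothetical; the group and exclusion syntax also appears in
examples later in this HOWTO):
rpower node001-node064 stat (a contiguous range of nodes)
rpower compute stat (a node group)
rpower compute,-node003 stat (a group, excluding one node)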
- Install management node.
Install the complete OS (all packages), i.e. ALL PACKAGES. Repeat:
ALL PACKAGES. Life is too short and disk space too cheap not to install
all the packages. The top issue with xCAT is missing packages.
xCAT is written in scripts, so it is difficult to list all the
dependencies, and they change frequently. I have been asked many times
for a list of dependencies. So, here it is:
ALL PACKAGES
HINT: With RH package selection, select "Custom", then scroll down to
the bottom and select "Everything".
HINT: SuSE does not have an "Everything" option, so you must manually
select all package groups. Even after selecting all package groups, you
will not get all the packages. The most common missing packages are
expect and pdksh; you can select them from the top right window pane
during the install. HOWEVER, do not manually select all packages in the
right window pane. Confusing, huh?
Please read the susemgtnode-HOWTO for more info on SuSE installs, and
the xCAT 1.1.0 Redbook and the xCAT HOWTO for Red Hat installs.
IANS: Install all packages and updates, and have large /install and
/var file systems.
NOTE: A few words about Java: for some ASMA, RSA, RSA2, and Bladecenter
functions xCAT uses IBM's mpcli and mpcli2 utilities (included in the
xcat-dist-ibm tarball). Both utilities require Java. The Java included
with both tools only works with older RH x86 distributions. SuSE
includes a functioning Java for all four xCAT-supported architectures
(x86, x86_64, IA64, and PPC64) and has been tested; RH does not provide
a functioning Java.
If you wish to install or use a different Java, just install it and
create a link to it at $XCATROOT/java/$ARCH, where $ARCH is x86,
x86_64, ia64, or ppc64. E.g., after installing IBM Java in
/usr/ibm/java:
cd /opt/xcat/java
ln -s /usr/ibm/java x86
ls -l
total 1
lrwxrwxrwx 1 root root 13 Jul 21 18:19 x86 -> /usr/ibm/java
Some good Java for x86, x86_64, and PPC64:
https://www6.software.ibm.com/dl/lxdk/lxdk-p
Java for IA64? If you find a good one, let me know. SuSE includes it.
BTW, the x86 versions will run on IA64 natively--slow--but they work OK
for systems management.
- Extract xCAT tarballs in /opt.
cd /opt
tar zxvpf /tmp/xcat-dist-core-1.2.0.tgz
tar zxvpf /tmp/xcat-dist-oss-1.2.0.tgz
tar zxvpf /tmp/xcat-dist-ibm-1.2.0.tgz
- Setup xCAT.
export XCATROOT=/opt/xcat
cd $XCATROOT/sbin
./setupxcat
- Log out and log back in as root.
- Enable time services (xntpd) on the management node.
mv -f /etc/ntp.conf /etc/ntp.conf.ORIG
Create a new /etc/ntp.conf:
server 127.127.1.0
fudge 127.127.1.0 stratum 10
driftfile /etc/ntp/drift
Red Hat: Set time, date, and time zone with setup:
setup
OR:
date
setclock OR hwclock -w
chkconfig --level 345 ntpd on
service ntpd restart
SuSE: Set time, date, and time zone with yast:
yast
OR:
date
clock -w
chkconfig -a xntpd
rcxntpd restart
Test (NOTE: it can take a few minutes before xntpd is working), type:
ntpdate -q localhost
If working, you should receive the following output:
server 127.0.0.1, stratum 2, offset -0.000002, delay 0.02570
22 Jan 08:04:24 ntpdate[14540]: adjust time server 127.0.0.1 offset -0.000002 sec
If not working, you will receive the following output (try again later
or fix):
no server suitable for synchronization found
- Define the cluster. Define /opt/xcat/etc/* (use /opt/xcat/samples/etc/*
as a starting point). Read the xCAT 1.1.0 Redbook for more information.
HINT: Everything is a node. Every node, switch, terminal server, node
NIC, EVERYTHING is a node in xCAT.
Required tables:
site.tab
nodehm.tab
nodelist.tab
nodepos.tab
noderes.tab
nodetype.tab
passwd.tab
postscripts.tab
postdeps.tab
snmptrapd.conf
networks.tab
mac.tab (loaded with non-collectable MACs, e.g. terminal servers,
switches, RSAs, etc.)
Required tables for clusters with terminal servers or SOL (Serial Over LAN):
conserver.tab
conserver.cf
Required tables for clusters using Ethernet switches to collect MAC addresses
(use the correct table for your switch):
cisco.tab
summit48i.tab
blackdiamond.tab
Required tables for IBM xSeries Management Processor:
mpa.tab
mp.tab
Required table for APC Master Switch:
apc.tab
Required table for APC Master Switch Plus:
apcp.tab
Required table for xCAT flash support:
nodemodel.tab
Required table for EMP support:
emp.tab
Required table for Baytech support:
baytech.tab
Required table for xCAT GPFS support:
gpfs.tab
Table for IPMI support (required for systems that have a different IPMI
IP address than the node address, e.g. e325):
ipmi.tab
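To give a flavor of the table formats, here is a hypothetical
nodelist.tab fragment (node and group names are made up; verify the
exact layout against the samples in /opt/xcat/samples/etc):
node001 compute,rack1,all
node002 compute,rack1,all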
- Rerun setupxcat. NOTE: site.tab must be properly set up for xcatd.
Please review samples/etc/site.tab.
export XCATROOT=/opt/xcat
cd $XCATROOT/sbin
./setupxcat
- Configure all nodes. Please read the
stage1-HOWTO for more information.
IANS:
Update all firmware to latest levels.
Configure firmware/BIOS/CMOS to NEVER prompt or pause for anything.
Configure network to boot before HD.
Enable management processor.
Redirect POST/BIOS out serial if possible.
NOTE: For Bladecenter use rbootseq, e.g.:
rbootseq noderange c,f,n,hd0
- Define /etc/hosts. For each node define an IP and a name; for each
interface other than the primary interface (e.g. eth0), define an IP
and a name with a -interface suffix.
E.g., with eth0 as the primary interface:
192.168.1.1 node001
172.20.1.1 node001-eth1
10.10.1.1 node001-eth2
172.30.1.1 node001-myri0
E.g., with eth1 as the primary interface:
192.168.1.1 node001-eth0
172.20.1.1 node001
10.10.1.1 node001-eth2
172.30.1.1 node001-myri0
NOTE: Do NOT zero-pad IP addresses (e.g. 172.020.001.001)--that's
insane.
NOTE: This naming convention for multiple NICs must be strictly
adhered to. Your node names can follow any naming convention, but the
interface suffix must be as illustrated in the above examples.
NOTE: There is little value in adding additional entries for the fully
qualified domain name. More often than not it creates confusion and
problems. Let DNS do it for you. EXCEPTION: each node's /etc/hosts
(the master's too) should have the FQDN on that node's own entry.
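E.g., in node001's own /etc/hosts (assuming the hypothetical domain
foobar.org that appears later in this HOWTO):
192.168.1.1 node001.foobar.org node001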
- Build a DNS server (this is not optional):
makedns
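To sanity-check the new DNS server with standard tools (node name and
IP are the hypothetical ones from the /etc/hosts examples above):
host node001 localhost
host 192.168.1.1 localhost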
- Enter non-collectable MACs in $XCATROOT/etc/mac.tab (e.g. terminal
servers, switches, RSAs, etc.).
NOTE: Some network devices (e.g. the APC Master Switch) do not have the
MAC address affixed to the unit; some have the MAC printed on a piece
of receipt paper and stuffed in the manual. Hopefully you didn't
install all the APCs and chuck the manuals in a pile somewhere. The
moral of this story: before you rack anything, verify that the MAC
address is visible and will remain visible when racked. Very cool
network devices (e.g. APC Master Switch and RSA) have a serial port;
you can use it to get the MAC.
NOTE: Manual non-collectable MAC entries in mac.tab do not require an
-eth0 suffix--it's optional.
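A hypothetical mac.tab fragment for such devices (names and MAC values
are made up; the format is the same whitespace-separated name/MAC pair
that getmacs writes, shown later in this HOWTO):
ts01 00:40:9D:00:12:34
rsa01 00:09:6B:00:56:78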
- Build a DHCP server:
makedhcp --new --allmac
NOTE: The dhcpver field in $XCATROOT/etc/site.tab must be set to match
the version of dhcpd installed--generally 2 for older Red Hat, and 3
for SuSE and newer Red Hat--before you run makedhcp. If incorrect,
correct it and rerun makedhcp --new --allmac.
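E.g., for SuSE or newer Red Hat the site.tab entry would read (a
sketch; compare the key/value layout in samples/etc/site.tab):
dhcpver 3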
NOTE: $XCATROOT/etc/networks.tab must define each network that dhcpd is
to support. Let makedhcp build it for you the first time, then edit it
and rerun makedhcp --new --allmac.
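For orientation only: makedhcp writes standard ISC dhcpd configuration,
so a host entry in the generated /etc/dhcpd.conf should look roughly
like this sketch (MAC and IP borrowed from this HOWTO's examples; let
makedhcp maintain the file rather than hand-editing it):
host node001 {
    hardware ethernet 00:07:E9:93:F8:DD;
    fixed-address 192.168.1.1;
}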
- Configure all Ethernet switches. Please block DHCP, inbound and
outbound, on ports that uplink the cluster to the real world (a sketch
follows). Please read the xCAT 1.1.0 Redbook, the cisco2950-HOWTO, and
the force10-HOWTO for more information.
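A minimal sketch, assuming a Cisco IOS switch (the ACL number and
interface name are hypothetical, and the exact commands vary by
model--see the cisco2950-HOWTO):
access-list 101 deny udp any any eq bootps
access-list 101 deny udp any any eq bootpc
access-list 101 permit ip any any
interface FastEthernet0/48
ip access-group 101 in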
- Configure all Terminal Servers. Please read the
terminalserver-HOWTO.
- Restart conserver (only if using terminal servers or SOL; Bladecenter
without SOL does not use conserver):
RH:
service conserver restart
SuSE:
rcconserver restart
- Setup stage boot image:
For x86 and x86_64 type:
cd /opt/xcat/stage
./mkstage
For ia64 type:
cd /opt/xcat/stage
./mkstage-ia64
- Manually reboot each node. Collect MAC addresses:
getmacs noderange
E.g.:
getmacs compute
node1-eth0 00:07:E9:93:F8:DD
node1-eth1 00:00:5A:9A:DB:7C
node2-eth0 00:07:E9:93:F8:DE
node2-eth1 00:00:5A:9A:DB:7D
...
Auto merge mac.lst with /opt/xcat/etc/mac.tab(y/n)? y
Each node will be suffixed with the interface of the collected MAC.
Please do not alter.
NOTE: Do not alter the mac.tab entries for collected MACs. It is
critical that the stored node names remain untouched. If necessary,
changing the MAC is OK.
NOTE: Multiple getmacs commands will corrupt mac.tab. Only run one
instance at a time.
NOTE: Some OSes report eth0 and eth1 differently than xCAT's getmacs
collects them. You may need to reverse them manually in mac.tab. E.g.
(this may hose other good non-switched entries, think before you do :-):
perl -pi -e 's/(nodeprefix.*)-eth0/$1-ethfoo/' mac.tab
perl -pi -e 's/(nodeprefix.*)-eth1/$1-eth0/' mac.tab
perl -pi -e 's/(nodeprefix.*)-ethfoo/$1-eth1/' mac.tab
NOTE: Currently only the serial-based (rcons) method of collecting MACs
will collect multiple MACs per node. A future version of xCAT will
address this limitation. EXCEPTION: the Bladecenter mpcli2 and bcmm
getmacs methods can collect both MAC addresses.
NOTE: For Bladecenter please use the bcmm method in nodehm.tab.
- Build /etc/dhcpd.conf:
makedhcp --allmac
- For all IBM xSeries nodes with IBM management processors and the IBM
e325/e326 (read the managementprocessor-HOWTO for more info).
EXCEPTION: Bladecenter (just use mpname noderange).
nodeset noderange stage3
- Reboot each node manually after all MACs are collected and the DHCP
server is restarted.
- Read the managementprocessor-HOWTO and bladecenter-NOTES for
information on testing and troubleshooting all nodes' management
processors, if applicable.
- Test systems management:
rpower noderange stat
and/or
rbeacon noderange on
(if blinking lights entertain you -- NOTE: not all servers have a
blinking light.)
- Copy CDs:
copycds (follow prompts)
- Copy xCAT post installation files:
cd /opt/xcat
find post -print | cpio -dump /install
- Generate root SSH keys:
gensshkeys root
- Update /etc/exports with /install, then restart NFS:
echo "/install *(ro,async,no_root_squash)" >>/etc/exports
Red Hat:
chkconfig --add nfs
service nfs restart
SuSE:
chkconfig -a nfsserver
rcnfsserver restart
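A quick check that the export took effect (you should see /install in
the export list):
showmount -e localhost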
- Create Myrinet RPM. (You may need to install a node first.) Read myrinet-HOWTO.
- Edit the install templates (*.tmpl) to taste. Read the
nodeinstall-HOWTO and systemimager-HOWTO for details.
- Got disk? Install nodes. Use rinstall or winstall. Only install 32
at a time or use staging. Read the man pages on rinstall and winstall,
e.g.:
winstall -t 8 node001-node032
NOTE: For the diskless option, read the diskless-HOWTO.
- Collect SSH host keys after install:
makesshgkh noderange
- Test psh and verify that the dates match:
psh noderange date;date
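If time is in sync, the output should look roughly like this
hypothetical sample (psh prefixes each node's output with its name; the
last line is the local date):
node001: Thu Jan 20 12:00:01 MST 2005
node002: Thu Jan 20 12:00:01 MST 2005
Thu Jan 20 12:00:01 MST 2005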
- Build GM routes (GM version < 2.0 only):
makegmroutes noderange
If you have multiple different Myrinet networks, consider using xCAT's
post-install directory sync facility and place them in /opt/gm/routes.
- Install Torque and Maui on a user node. Place torque-1.1.0p0.tar.gz
and maui-3.2.6p9.tar.gz in /tmp:
cd /tmp
/opt/xcat/build/torque/torquemaker torque-1.1.0p0.tar.gz
/opt/xcat/build/maui/mauimaker maui-3.2.6p9.tar.gz
genpbs noderange
. /etc/profile.d/pbs.sh
showq (you should see all your nodes)
pbstop (you should see all your nodes)
- Add cluster users on the usermaster as defined in
$XCATROOT/etc/site.tab, then push out to the rest of the cluster:
addclusteruser
Enter username: bob
Enter group: users
Enter UID (return for next): 501
Enter absolute home directory root: /home
Enter password (blank for random): B0vHw0bL
cd /etc
cp passwd passwd.CYA
cp group group.CYA
prsync -craz passwd group noderange,-$(hostname -s):/etc
NOTE: prsyncing the passwd and group files may be unsafe. Use pushuser
as an alternative. E.g.:
pushuser noderange bob
- Test Torque/Maui. Log in as a user and type:
bob@head01:~> qsub -l nodes=2,walltime=1:00:00 -I
qsub: waiting for job 0.head01.foobar.org to start
qsub: job 0.head01.foobar.org ready
----------------------------------------
Begin PBS Prologue Thu Dec 19 14:17:53 MST 2002
Job ID: 0.head01.foobar.org
Username: bob
Group: users
Nodes: node10 node9
End PBS Prologue Thu Dec 19 14:17:54 MST 2002
----------------------------------------
Note the Nodes: line. Try to ssh from node to node and back to the
user node that started qsub:
bob@node10:~> ssh node9
bob@node9:~> exit
logout
Connection to node9 closed.
bob@node10:~> ssh head01
bob@head01:~> exit
logout
Connection to head01 closed.
bob@node10:~> exit
logout
qsub: job 0.head01.foobar.org completed
Now try to ssh back to the nodes that were assigned; you should be
denied:
bob@head01:~> ssh node9
14653: Connection closed by 199.88.179.209
- Install compilers and libraries. Read the xCAT HPC Benchmark HOWTO
for details.
- Install MPICH-GM on user nodes for application development. Read myrinet-HOWTO.
- Get HPL benchmark results and submit them to IBM and top500.org.
Read the xCAT HPC Benchmark HOWTO for details.
- Enjoy your cluster. Do some work.
Support
http://xcat.org
Egan Ford
egan@us.ibm.com
January 2005