The documents to follow when installing xCAT are:

  • xCAT-mini-HOWTO.html;
  • nodeinstall-HOWTO.html.

These documents are in the doc directory of xCAT, which is archived in the xcat-dist-doc-[version].tgz package.

It is important to check both documents when installing a new version of xCAT, because each new version may differ in its details from previous versions.

xCAT installation

What follows are the notes specific to the installation of xCAT-1.1.7 on the GIREF cluster. This installation was carried out in December 2002. Since then, there has been an update to 1.2.0-pre2b, which required small changes to the xCAT configuration files specific to the local installation.

1) Partitioning:
/		5G
/opt		5G
/boot		100M
/tmp		2G
/var		2G
/install	1.5G
swap		1G
/home		54G (remainder)

2) configure root: (passwd deleted)
   configure galileo user (passwd deleted)

3) configure NIS:

4) activate NIS server (isn't activated by default)
	NOTE: may be replaced with an LDAP server later

5) put galileo name in GIREF's DNS.

At this point, while trying to copy the rh73 CDROMs onto the disk, it
became obvious that the partitioning layout was wrong.

28Nov2002 - 10Dec2002:
0) package selection is important:
	- anaconda
	- dhcp
	- expect
	- http
	- lynx
	- pdksh
	- php (for ganglia)
	- tftpd, tftp
	- ucd-snmp
	- uucp

1) Partitioning:
/		15G
/tmp		 3G
/var		 3G
/boot		200M
swap		 2G
/home		48G (remainder)

2) name: n01.galileo.lan

3) configure root: (passwd ...deleted...)
   configure galileo user: (passwd ...deleted...)

4) configure NIS:
	domain galileo.lan
   NOTE: apparently, it doesn't work at all.
	And I don't have the right communication
	with the ether switch

5) transfer rh73 CDROMs from merlin to n01

6) transfer rh73 updates from to n01.
	Also cleaned up the updates.
	A script helps select what to update.
	Installed the new kernel and made it boot.

7) install xcat (transfer from gandalf)

8) configure xcat:
	- cp /opt/xcat/samples/xcat.{csh,sh} /etc/profile.d/
	- echo "XCATROOT=/opt/xcat" >/etc/sysconfig/xcat
	- create /opt/xcat/etc/
	- create /etc/hosts
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/ using the cluster database (from merlin)
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/
	- create /opt/xcat/etc/ (whatever ??)
	- create /opt/xcat/etc/ (whatever ??)
	- create /opt/xcat/etc/

   NOTE: I got lost on the MPN configuration and other resources.

9) on n01, deactivate services:
	list="kudzu apmd autofs iptables ipchains rawdevices lpd rhnsd"
	for s in $list;do /sbin/chkconfig --level 0123456 $s off;done

10) on n01, configure syslog to accept remote logging:
	- in /etc/sysconfig/syslog set SYSLOG_OPTIONS="-m 0 -r"
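The edit in step 10 can be scripted. A minimal sketch, assuming the file is a plain KEY="value" sysconfig file; the function name set_syslog_options is hypothetical and not part of xCAT:

```shell
#!/bin/sh
# Hypothetical helper: set SYSLOG_OPTIONS in a sysconfig-style file.
# Usage: set_syslog_options /etc/sysconfig/syslog "-m 0 -r"
set_syslog_options() {
    file=$1
    opts=$2
    tmp=$(mktemp) || return 1
    # Drop any existing SYSLOG_OPTIONS line, then append the new one.
    # grep -v exits non-zero when nothing is left, which is not an error here.
    grep -v '^SYSLOG_OPTIONS=' "$file" > "$tmp" || true
    printf 'SYSLOG_OPTIONS="%s"\n' "$opts" >> "$tmp"
    # Overwrite in place so the owner and mode of the original file are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

The syslog service has to be restarted afterwards for the new options to take effect.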

11) activate and configure snmpd:
	/sbin/chkconfig --level 345 snmptrapd on
	cp /opt/xcat/samples/etc/snmptrapd.conf /etc/snmp

12) add some needed aliases:
	echo "root: galileo" >> /etc/aliases
	echo "alerts: galileo," >> /etc/aliases
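Because step 12 appends with >>, running it twice duplicates the entries. A minimal sketch of an idempotent variant; add_alias is a hypothetical helper name, and the note about newaliases assumes sendmail is the mail transfer agent:

```shell
#!/bin/sh
# Hypothetical helper: append an alias line only if it is not already present,
# so re-running the step does not duplicate entries.
# Usage: add_alias /etc/aliases "root: galileo" && newaliases
add_alias() {
    file=$1
    line=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

With sendmail, the alias database is rebuilt by running newaliases after the file changes.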

13) activate, configure and test tftpd:
	- edit /etc/xinetd.d/tftp file to:
		- remove the root jail (replace -s /tftpboot with -s /)
		- add logging (-v -v)
	NOTE: the above is needed for the way xcat prepares the pxe boot
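The two edits to /etc/xinetd.d/tftp can be sketched as a single substitution. This assumes the file has a server_args line containing "-s /tftpboot", as the step implies; fix_tftp_args is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: on the server_args line, replace the /tftpboot jail
# with "/" and add two -v flags for more verbose logging.
# Usage: fix_tftp_args /etc/xinetd.d/tftp
fix_tftp_args() {
    file=$1
    tmp=$(mktemp) || return 1
    # Only the server_args line is touched; other arguments on it are kept.
    sed '/server_args/ s|-s /tftpboot|-v -v -s /|' "$file" > "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

xinetd has to reload its configuration before the change is picked up.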

	/sbin/chkconfig tftp on
	mkdir /tftpboot
	echo "Hi hi" >/tftpboot/test
	tftp n01
		get /tftpboot/test
	rm /tftpboot/test

14) configure NFS:
	echo "/install *(ro,no_root_squash)" > /etc/exports
	echo "/opt/xcat,no_root_squash)" >> /etc/exports
	echo "/usr/local,no_root_squash)" >> /etc/exports
	echo "/home,no_root_squash)" >> /etc/exports
	/sbin/chkconfig --level 345 nfs on
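An exports line needs a path, at least one client field, and a balanced option list in parentheses. A rough sanity check for that shape is sketched below; check_exports is a hypothetical helper, and it is not a full parser for exports(5):

```shell
#!/bin/sh
# Hypothetical helper: print the lines of an exports file that look malformed.
# A line is accepted if its first field starts with "/", it has at least one
# client field, and its parentheses are balanced.
# Usage: check_exports /etc/exports
check_exports() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            line = $0
            opens  = gsub(/\(/, "(", line)   # number of "(" on the line
            closes = gsub(/\)/, ")", line)   # number of ")" on the line
            if ($1 !~ /^\// || NF < 2 || opens != closes)
                printf "%d: %s\n", NR, $0
        }
    ' "$1"
}
```

A line such as "/opt/xcat,no_root_squash)" is reported because it has no client field and an unmatched parenthesis. Once the file is clean, exportfs -r re-exports it.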

15) activate and configure NTP:
	echo "restrict mask nomodify notrap noquery"> /etc/ntp.conf
	echo "server">> /etc/ntp.conf
	echo "">> /etc/ntp.conf
	echo "restrict mask notrust nomodify notrap">> /etc/ntp.conf
	echo "restrict mask notrust nomodify notrap">> /etc/ntp.conf
	echo "restrict mask notrust nomodify notrap">> /etc/ntp.conf
	/sbin/chkconfig --level 345 ntpd on
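The file built in step 15 has a fixed shape: one restrict line and one server line for the upstream time server, a blank line, then one restrict line per client network. A minimal sketch that prints a file of that shape; make_ntp_conf is a hypothetical helper, every address is supplied by the caller, and the host mask 255.255.255.255 on the first line is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: print an ntp.conf in the same shape as step 15.
# None of the addresses come from the original notes.
# Usage: make_ntp_conf SERVER NET/MASK [NET/MASK ...] > /etc/ntp.conf
make_ntp_conf() {
    server=$1
    shift
    # Accept time from the upstream server, but no queries or changes.
    echo "restrict $server mask 255.255.255.255 nomodify notrap noquery"
    echo "server $server"
    echo ""
    # Each remaining argument is NETWORK/NETMASK for one client subnet.
    for spec in "$@"; do
        net=${spec%/*}
        mask=${spec#*/}
        echo "restrict $net mask $mask notrust nomodify notrap"
    done
}
```

For example, make_ntp_conf 192.0.2.1 192.0.2.0/255.255.255.0 prints four lines using documentation-range addresses. The meaning of notrust changed in later ntpd releases, so the flag should be checked against the installed version.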

16) checked the ssh config.
	gensshkeys root

17) config DNS:
	/sbin/chkconfig --level 345 named on

18) configure DHCP:
	NOTE: dhcp package has to be installed
	- in /etc/sysconfig/dhcpd, add DHCPDARGS="eth0", which makes sure that
		only requests from the cluster get answered
	/sbin/chkconfig --level 345 dhcpd on
	/opt/xcat/sbin/makedhcp --new
	/opt/xcat/sbin/makedhcp --allmac

18b) configure NIS server:
	/sbin/chkconfig --level 345 ypserv on
	/sbin/service ypserv start

	/sbin/chkconfig --level 345 ypbind on
	/sbin/service ypbind start

	cd /var/yp; make

19) reboot n01!!
	NOTE: the ether switch finally recognizes the node as 1000Mb

20) configured ether switch:
	- new IP
	- new alert1: to
	- new name: ether

21) make first stage:
	copy all CDs into /install/rh73
	cd /opt/xcat/build/rh73
	./applypatch; ./e1000patch; ./nofibre
	cd /opt/xcat/netboot
	./mknb --update
	cd /opt/xcat/stage
	NOTE: this creates some files in /tftpboot

22) fix the kickstart templates:
	cd /opt/xcat/ks73 (check that the files you need are here and their content)

23) prepare the postinstall dir:
	mkdir /opt/xcat/post
	mkdir /opt/xcat/post/updates
	mkdir /opt/xcat/post/updates/rh73
	get all updates from or
	cd /opt/xcat/post
	cp -vr gm-routes kernel rc.d rpm73 sync /install/post
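The three mkdir calls in step 23 can be collapsed into one with -p, which also makes the step safe to repeat. A minimal sketch; make_post_tree and its root argument are hypothetical, added only so the sketch can be tried outside /opt/xcat:

```shell
#!/bin/sh
# Create the update directory chain in one step. -p creates missing parents
# and does not fail if a directory already exists.
# Usage: make_post_tree            (uses /opt/xcat)
#        make_post_tree /tmp/try   (hypothetical alternate root)
make_post_tree() {
    root=${1:-/opt/xcat}
    mkdir -p "$root/post/updates/rh73"
}
```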

24) prepare kernel:
	cd /usr/src/linux-2.4
	make mrproper
	cp configs/*i686-smp* .config
	vi Makefile (replace -10custom with -10smp)
	make menuconfig
	make dep
	make modules &
	sleep 40; kill %
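The manual Makefile edit in step 24 can be replaced by a substitution. A minimal sketch, assuming the string -10custom appears in the Makefile exactly as the step says; set_kernel_flavor is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: replace the -10custom suffix with -10smp in a kernel
# Makefile, as done by hand with vi in step 24.
# Usage: set_kernel_flavor /usr/src/linux-2.4/Makefile
set_kernel_flavor() {
    makefile=$1
    tmp=$(mktemp) || return 1
    sed 's/-10custom/-10smp/' "$makefile" > "$tmp"
    # Overwrite in place so the owner and mode of the Makefile are kept.
    cat "$tmp" > "$makefile"
    rm -f "$tmp"
}
```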

25) compile asm driver (so that mpcli and sp tools work)
	look for updates on
	rpm -Uvh ibmasm-src-redhat
	rpm -Uvh /usr/local/ibmasm/ibmasm-1.06*
	sp ReadLog (for test)

26) prepare the mpa

27) configure the ASMs:
	nodeset compute stage3
	- then reboot all nodes, with about 20s pause between each
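The staggered reboot can be sketched as a loop that pauses between nodes. The helper staggered is hypothetical; the command it runs is left to the caller, since the notes do not say how the nodes were rebooted:

```shell
#!/bin/sh
# Hypothetical helper: run a command once per node, pausing between nodes.
# Usage: staggered PAUSE_SECONDS COMMAND NODE...
# COMMAND is called with the node name as its only argument.
staggered() {
    pause=$1
    cmd=$2
    shift 2
    first=1
    for node in "$@"; do
        # Sleep before every node except the first.
        [ "$first" -eq 1 ] || sleep "$pause"
        first=0
        "$cmd" "$node"
    done
}
```

For example, staggered 20 reboot_node n02 n03 n04 would call a site-specific reboot_node wrapper (an assumed name) for each node with 20 seconds in between. Using echo as the command gives a dry run.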

28) Install myrinet:
	# - download gm-1.6.3_Linux from
	# - create rpm for cluster using /opt/xcat/build/gm/gmmaker - failed
	- download gm- from (according to advice on xcat-user ml)
	- create rpm for cluster using /opt/xcat/build/gm/gmmaker
	- install gm rpm on n01
	- copy gm rpm to /install/post/kernel

29) Install ganglia:
	- install gmond*.rpm, gmetad*.rpm, webfront*.rpm on n01
	- copy gmond*.rpm in /install/post/rpm73/
	- change /etc/gmond.conf to put the name and the owner of the cluster
	- change /etc/gmetad.conf to indicate source "localhost"
	/sbin/chkconfig --level 345 httpd on
	/sbin/service httpd start

SUPPLEM: In order to use distcc, I had to modify
	 the kickstart file to add the "@ Software Development" package group.

SUPPLEM: Need to install: 
	- ICC
	- mpich-gm
	- petsc
	- Atlas

Second install.
	NOTE: new MAC for RSA: 00:09:6B:0A:24:D2