The question you will ask straight away is: what is a master server? Most Beowulf systems have only one server, which also acts as the gateway to the world outside the cluster, but some have multiple servers for performance and reliability reasons. In a large disk-less client cluster you might want to use multiple NFS servers to serve system files to the client nodes. In a more distributed environment all nodes can act as both clients and servers. If you are going to use only one server node you can simply drop the word 'master', and think of the master server simply as the server.
The master server will be the most important node in your Beowulf system. It will serve file systems to the client nodes via NFS, it will be used for compiling source code and starting parallel jobs, and it will be your access point from the outside world. The following are the steps for installing and configuring the master server.
A very important part of the installation process is choosing the partition sizes. It is important to choose partition sizes which are correct for your needs, because they might be very difficult to change at a later stage, when your cluster is running production code. I have changed the partition sizes listed below with just about every update of this document. You will most probably have to experiment, but the following sizes should be OK for a 4 GB HDD, Red Hat Linux 5.2, and a 16 node, disk-less client cluster (a possible layout of the whole disk is sketched after the list). The following list does not include the /home partition, where you will store your user files.
/ - 500 MB. This / partition will contain the /bin, /boot, /dev, /etc, /lib, /root, /sbin, /var and /tftpboot directories and their contents. In most cases you can include /tmp in / as well. It is very important for the disk-less client configuration that /tftpboot is on the same partition as /. If these two directories were mounted on separate partitions, we would not be able to create some of the hard links which are needed for the NFS root configuration described here to work.
/usr - 1.5 GB. This might seem like overkill, but remember that most additional RPMs will install in /usr and not in /usr/local. If you are planning to install large packages, you should make the /usr partition even larger. There is nothing worse than running out of disk space on a production system.
/usr/local - from 500 MB to 2 GB. The exact size will really depend on how much additional software (not included in the distribution) you have to install.
swap - Swapping is really bad for the performance of your system. Unfortunately there might be a time when the server is computing a very large job and you just don't have enough memory. You should probably make the swap space no more than twice the size of physical RAM. For example, we have 384 MB of RAM and four 128 MB swap partitions on node1 in our topcat system.
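Putting the sizes above together, one possible layout for a single 4 GB disk might look like this (the device names and the exact split are only illustrative; on topcat the swap space is actually spread over four 128 MB partitions):

/dev/hda1    /             500 MB
/dev/hda2    swap          512 MB
/dev/hda3    /usr          1.5 GB
/dev/hda4    /usr/local    1.5 GB   (the remainder of the disk)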
I will not go into the details of the Red Hat Linux 5.2 installation, as these are well described in the Red Hat Linux Installation Manual http://www.redhat.com/support/docs/rhl/. I recommend installing the full Red Hat 5.2 distribution to save time now and later, when you look for a package you need but did not install. If you don't have enough disk space, and don't mind spending time selecting individual packages, then you can leave out packages you don't think you'll use, like the translations of the Linux HOWTO documents.
If you haven't already done so, you should now configure both of your Ethernet cards. One of your cards should have a "real" IP address allocated to you by your network administrator (most probably you :), and the other a private IP address (e.g. 10.0.0.1) visible only to the nodes within the cluster. You can configure your network interfaces either by using the GUI tools shipped with Red Hat Linux, or by simply creating or editing the /etc/sysconfig/network-scripts/ifcfg-eth* files. A simple Beowulf system might use the 10.0.0.0/8 private IP address range, with 10.0.0.1 being the server and 10.0.0.2 up to 10.0.0.254 being the IP addresses of the client nodes. If you decide to use this IP range you will probably want to use the 255.255.255.0 netmask and the 10.0.0.255 broadcast address.
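For example, a minimal ifcfg-eth1 file for the internal interface using the addressing above might look like this (your device names may of course differ):

DEVICE=eth1
IPADDR=10.0.0.1
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes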
On Topcat eth0 is the interface connecting the cluster to the outside world, and eth1 connects to the internal cluster network. The routing table looks like this:
[jacek@topcat jacek]$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        *               255.255.255.0   U     0      0        9 eth1
139.x.x.0       *               255.255.248.0   U     0      0        7 eth0
127.0.0.0       *               255.0.0.0       U     0      0        2 lo
default         139.x.x.1       0.0.0.0         UG    0      0       18 eth0
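If you ever need to bring the internal interface up by hand rather than through the boot scripts, something like the following should reproduce the eth1 entry shown above:

# configure the internal interface and add the route to the cluster network
/sbin/ifconfig eth1 10.0.0.1 netmask 255.255.255.0 broadcast 10.0.0.255 up
/sbin/route add -net 10.0.0.0 netmask 255.255.255.0 dev eth1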
I no longer run DNS on Topcat (our Beowulf cluster). Originally I thought that having a dedicated DNS domain and server for your Beowulf cluster simplified administration, but I have since configured Topcat without DNS, and it seems to work well. It is up to you to choose your configuration. I left this section on DNS for reference purposes, but will no longer maintain it. I believe that my DNS configuration files will not work with the latest version of named.
Setting up DNS is very straightforward. Your server (node1) will be the DNS server. It will resolve the names and IP addresses for the whole Beowulf cluster. DNS configuration files can be downloaded from ftp://ftp.sci.usq.edu.au/pub/jacek/beowulf-utils. The configuration files listed are the ones I used on our topcat system, but you can use them on your system if you don't mind using the same names for your nodes as I do. As you can see, I use the private IP address range 10.0.0.0/8, with the local subnet mask set to 255.255.255.0. Our domain will not be visible from outside (unless someone uses our node1 as their name server) so we can call it whatever we want. I chose beowulf.usq.edu.au for my domain name.
There are a few configuration files which you will have to modify for your DNS to work, and you can find them at ftp://ftp.sci.usq.edu.au/pub/jacek/beowulf-utils. After installing the configuration files, restart the named daemon by executing /etc/rc.d/init.d/named restart.
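The zone files themselves hold ordinary BIND zone data; an illustrative fragment (not the actual files from the FTP site) would contain address and pointer records like these:

; forward zone (beowulf.usq.edu.au): name -> address
node1    IN    A      10.0.0.1
node2    IN    A      10.0.0.2

; reverse zone (0.0.10.in-addr.arpa): address -> name
1        IN    PTR    node1.beowulf.usq.edu.au.
2        IN    PTR    node2.beowulf.usq.edu.au.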
Test your DNS server:
[root@node1 /root]# nslookup node2
Server:  node1.beowulf.usq.edu.au
Address:  10.0.0.1

Name:    node2.beowulf.usq.edu.au
Address:  10.0.0.2

[root@node1 /root]# nslookup 10.0.0.5
Server:  node1.beowulf.usq.edu.au
Address:  10.0.0.1

Name:    node5.beowulf.usq.edu.au
Address:  10.0.0.5
/etc/hosts
If you decide not to use a DNS server, then you will have to enter all of the nodes and their corresponding IP addresses in the /etc/hosts file. If you use the disk-less client configuration, the sdct and adcn scripts will create hard links to this file, so it will be used by all nodes. In addition, the adcn script will add an entry to /etc/hosts for the client it creates the root file system for. An example /etc/hosts file from Topcat is shown below.
127.0.0.1       localhost localhost.localdomain
139.x.x.x       topcat.x.x.x. topcat
10.0.0.1        node1.beowulf.usq.edu.au node1
10.0.0.2        node2.beowulf.usq.edu.au node2
10.0.0.3        node3.beowulf.usq.edu.au node3
10.0.0.4        node4.beowulf.usq.edu.au node4
10.0.0.5        node5.beowulf.usq.edu.au node5
10.0.0.6        node6.beowulf.usq.edu.au node6
10.0.0.7        node7.beowulf.usq.edu.au node7
10.0.0.8        node8.beowulf.usq.edu.au node8
10.0.0.9        node9.beowulf.usq.edu.au node9
10.0.0.10       node10.beowulf.usq.edu.au node10
10.0.0.11       node11.beowulf.usq.edu.au node11
10.0.0.12       node12.beowulf.usq.edu.au node12
10.0.0.13       node13.beowulf.usq.edu.au node13
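If you have many nodes, you can generate the node entries with a small shell loop rather than typing them in by hand; a sketch assuming 16 nodes and the naming scheme above:

#!/bin/sh
# append entries for node2 .. node16 to /etc/hosts
i=2
while [ $i -le 16 ]; do
    echo "10.0.0.$i   node$i.beowulf.usq.edu.au node$i" >> /etc/hosts
    i=`expr $i + 1`
done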
/etc/resolv.conf
If you have a DNS server running on the master server, then your resolv.conf file should point to the local name server first. This is the /etc/resolv.conf I had when I ran DNS on Topcat:
search beowulf.usq.edu.au eng.usq.edu.au sci.usq.edu.au usq.edu.au
nameserver 127.0.0.1
nameserver 139.x.x.2
nameserver 139.x.x.3
Now that I no longer run DNS, this is my current /etc/resolv.conf file:
search eng.usq.edu.au sci.usq.edu.au usq.edu.au
nameserver 139.x.x.2
nameserver 139.x.x.3
/etc/hosts.equiv
In order to allow remote shells (rsh) from any node to any other node in the cluster, for all users, you should relax the security and list all hosts in /etc/hosts.equiv. Please see the section on Security.
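A minimal /etc/hosts.equiv is just a list of trusted host names, one per line, covering every node in the cluster, for example:

node1
node2
node3
node4

and so on, up to your last node.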
If you have parallel programming packages installed, your users' shell environment (for csh/tcsh users, e.g. in /etc/csh.cshrc) should set the appropriate variables and paths, for example:

#Assumes LAM-MPI, PVM and MPICH are installed
setenv LAMHOME /usr/local/lam61
setenv PVM_ROOT /usr/local/pvm3
setenv PVM_ARCH LINUX
setenv MPIR_HOME /usr/local/mpich

set path = (. $path)
set path = (/usr/local/bin $path)   # use egcs compilers first
set path = ($path /usr/local/pvm3/lib/LINUX)
set path = ($path /usr/local/lam61/bin)
set path = ($path /usr/local/mpich/lib/LINUX/ch_p4)
There are some problems with the 2.0.x SMP kernels and clock drift, due to an interrupt problem. The best solution is to use xntp and synchronize to an external source. In any case, it is best to have your cluster's clocks synchronized. Here is how to use xntp.
1. Make sure the system time is correct, then save it to the hardware clock (RTC):

clock -w
2. Install the xntp package:

rpm -i xntp3-5.93-2.i386.rpm
3. ON ALL SYSTEMS comment out the following lines in /etc/ntp.conf (as shown):

#multicastclient                # listen on default 224.0.1.1
#broadcastdelay 0.008
4. ON NON-HOST SYSTEMS (every one but the host) edit the lines:

server HOSTNODE                 # local clock
#fudge 127.127.1.0 stratum 0

where HOSTNODE is the name of the host node.
5. Close the /etc/ntp.conf file on each node.
6. Start xntpd on all systems: "/sbin/xntpd"
You can start xntpd at each boot by adding this command to your /etc/rc.d/rc.local file.
It will take some time for synchronization, but you can see messages from xntpd in /var/log/messages.
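You can also query the daemon directly with ntpq, which is shipped with the xntp3 package; on a client node the host should appear in the peer list:

/usr/sbin/ntpq -p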
What you have just done is tell the host node to run xntp and use the local system clock as its reference. In addition, all the other nodes in the cluster will get their time from the host.
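For reference, on the host node itself the local clock lines in /etc/ntp.conf stay active, so it contains something like the following (the stratum value may differ on your system):

server 127.127.1.0              # local clock
fudge 127.127.1.0 stratum 0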
Although xntpd is supposed to keep the system clock and the RTC in sync, it is probably a good idea to sync them explicitly once a day. You can do this by (as root) going to /etc/cron.daily and creating a file called "sync_clocks" that contains the following:
# Assumes ntp is running, so sync the CMOS RTC to the OS system clock
/sbin/clock -w
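Scripts in /etc/cron.daily are only run if they are executable, so remember to set the permissions:

chmod 755 /etc/cron.daily/sync_clocks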
Now all the clocks in your cluster should be synchronized, using the host as a reference. If you wish to use an external reference, consult the xntpd documentation.