bWatch is a GUI Beowulf Cluster Monitor. It displays load averages, memory, swap, number of processes and users for all nodes in a single window. bWatch is available from http://www.sci.usq.edu.au/staff/jacek/bWatch.
NOTE: The bwatch.rpm shipped with the S.u.S.E Linux distribution installs bWatch in /usr/X11R6/bin and assumes that the wish interpreter is also installed under /usr/X11R6/bin. Red Hat Linux installs wish under /usr/bin, so bWatch will not run there. To fix this, edit the first line of /usr/X11R6/bin/bWatch and change it from #!/usr/X11R6/bin/wish to #!/usr/bin/wish.
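If you would rather not edit the file by hand, the same change can be scripted. The following is only a sketch: fix_wish_shebang is a made-up helper name, and it assumes the script's first line is exactly the S.u.S.E path quoted in the note. It keeps a backup copy with an .orig suffix and leaves every line other than the first untouched.

```shell
#!/bin/sh
# Sketch: point a wish script's first line at /usr/bin/wish.
# fix_wish_shebang is a hypothetical helper name, not part of bWatch.
fix_wish_shebang() {
    script="$1"
    # Keep a backup copy, then rewrite line 1 only, and only if it
    # still holds the old interpreter path. Writing back into the
    # existing file keeps its permissions.
    cp -p "$script" "$script.orig" &&
    sed '1s|^#!/usr/X11R6/bin/wish|#!/usr/bin/wish|' "$script.orig" > "$script"
}

# Example (run as root on the affected machine):
#   fix_wish_shebang /usr/X11R6/bin/bWatch
```

If the first line already points somewhere else, the file contents come out unchanged.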
One way of obtaining statistics from your Beowulf cluster is via httpd
running on your server node, and a CGI script. The idea is that the
CGI script executes remote shells on the node you are querying, and
formats the retrieved information into an HTML page which the httpd
server sends to your browser. This is a very easy way of checking
system performance from anywhere in the world as long as there is a
browser and an Internet connection. There is an example index.html
file at ftp://ftp.sci.usq.edu.au/pub/jacek/beowulf-utils which calls
the CGI script getinfo.cgi.
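To make the idea concrete, here is a minimal sketch of such a CGI script. It is not the getinfo.cgi from the FTP site. The function names, the node= query parameter, and the choice of uptime and free as the remote commands are all assumptions made for illustration. It also assumes that rsh (or ssh, via the REMOTE_SHELL variable) can reach the nodes without a password as the user the web server runs under.

```shell
#!/bin/sh
# Sketch of a CGI script that reports the state of one node.
# Not the getinfo.cgi from the FTP site; names below are illustrative.

# Assumption: rsh (or ssh) to the nodes works without a password
# for the user the web server runs as.
REMOTE_SHELL="${REMOTE_SHELL:-rsh}"

# Escape characters that are special in HTML.
html_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print a complete CGI response for the node named in $1.
node_report() {
    node="$1"
    printf 'Content-type: text/html\r\n\r\n'
    # Refuse empty names, names starting with "-", and anything that
    # is not a plain host name, so the query string cannot inject
    # shell commands or remote-shell options.
    case "$node" in
        ''|-*|*[!A-Za-z0-9.-]*)
            printf '<html><body><p>Invalid node name.</p></body></html>\n'
            return 1 ;;
    esac
    printf '<html><head><title>%s</title></head><body>\n' "$node"
    printf '<h1>%s</h1>\n<pre>\n' "$node"
    $REMOTE_SHELL "$node" 'uptime; free' 2>&1 | html_escape
    printf '</pre>\n</body></html>\n'
}

# When run by a web server (CGI sets GATEWAY_INTERFACE), expect a
# request such as getinfo.cgi?node=node1 (hypothetical parameter name).
if [ -n "${GATEWAY_INTERFACE:-}" ]; then
    node_report "${QUERY_STRING#node=}"
fi
```

Checking the node name before passing it to the remote shell matters here, because the value comes straight from the browser.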
NetPIPE is a very good network performance testing tool that lets you check the throughput of TCP, MPI, and PVM for packets of different sizes. You can use gnuplot or a spreadsheet to plot the results produced by NetPIPE. You can find NetPIPE at http://www.scl.ameslab.gov/Projects/ClusterCookbook/nprun.html
Netperf source: http://www.netperf.org/netperf/NetperfPage.html
Run Script:
./netperf -t UDP_STREAM -p 12865 -n 2 -l 60 -H NODE -- -s 65535 -m 1472
./netperf -t TCP_STREAM -p 12865 -n 2 -l 60 -H NODE
NODE is the remote node name.
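To test several nodes in turn, the two commands can be wrapped in a small function. This is a sketch only: bench_node, the NETPERF variable, and the netperf-<node>.log file names are made up for illustration, and the options are copied unchanged from the run script. It assumes netserver is already listening on port 12865 on every node you test.

```shell
#!/bin/sh
# Sketch: run the two netperf tests from the run script against a node.
# NETPERF, bench_node and the log file names are assumptions.
NETPERF="${NETPERF:-./netperf}"

# Run the UDP and TCP stream tests against node $1, writing both
# results to netperf-<node>.log in the current directory.
bench_node() {
    node="$1"
    {
        echo "== UDP_STREAM to $node =="
        $NETPERF -t UDP_STREAM -p 12865 -n 2 -l 60 -H "$node" -- -s 65535 -m 1472
        echo "== TCP_STREAM to $node =="
        $NETPERF -t TCP_STREAM -p 12865 -n 2 -l 60 -H "$node"
    } > "netperf-$node.log" 2>&1
}

# Example: for n in node1 node2 node3; do bench_node "$n"; done
```

Each test runs for 60 seconds (-l 60), so allow about two minutes per node.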
NAS Parallel Benchmarks source: http://www.nas.nasa.gov/NAS/NPB/
There is a package called CMS (Cluster Management System). It is available from http://smile.cpe.ku.ac.th/software/scms/index.html. This version is new, and we have not had time to test it. The previous version worked well except for the remote (real time) monitoring. It does include a system reboot and shutdown feature.