As I made changes to configuration files like mosix.map and distributed them about the cluster during my initial configurations of ralphzilla it occurred to me that this is precisely the kind of thing that is far better handled with a script. (Some may wonder why I'm talking about scripts when I've been discussing programming. This is generally a matter of convention. Little programs that are run as part of system administration efforts are generally referred to as scripts. There is no real difference at the most basic level between what you do when you write scripts and what you do when you write programs. Large programs are structured aggregates of small chunks of code that are of the same scale as the scripts here. Generally the primary difference is that those chunks contain some provision for values passed into them and to return values to the piece of code that called them. While some might argue that there is a bigger difference than this, they are largely splitting hairs at a level that is functionally meaningless. Most larger entities that are called programs have chunks within them that would be called scripts if executed independently. What difference exists is largely a matter of the manner in which the code is used, and is not a distinction that you should worry about. As will become clearer when we get into the database application, whatever distinction does exist pretty well evaporates as you begin to do real stuff.)
I initially approached the problem by attempting to copy through the mfs interface, but ran into some behavior at the time of the copying operation that I considered unpredictable. This was conceptually the simplest of the three scripts I've written to handle this chore, but it did leave unaddressed the activation of the changes contained in those files by restarting mosix. In effect, the script simply iterated through an array of the numeric directories under the mfs mount point. (An array is a list holding a series of values. You might think of an array as a gear whose teeth hold values you've stored there, or as a series of slots in a message board into which slips of paper holding certain values are placed. Metaphors like this can have real value as you envision what a program does, especially if you are writing one. Much of my own initial work with computers and programming involved databases and data analysis, and I found the gear metaphor particularly useful, because I would use similar constructs to link groups of databases to a particular purpose. If you do find metaphors like these useful, however, always keep in mind that they are no more than that, and that you can reach a point at which a metaphor can cloud your perception of an alternative formulation of whatever you are trying to do.)
The numbers under the mfs mount point of course represent the mosix numbers of the machines to which they link. The script originally went through the elements of the array, pattern-matching for a digit in the first position, and used that integer as the basis for the construction of a scalar (memory variable) that held the name of the destination file along with its full path relative to the mfs mount point (e.g., /mfs/3/etc/mosix/mosix.map). When the script executed, however, I received permission errors that I have subsequently come to suspect are associated with the problems that developed with the authentication modules. While that could be fixed readily enough, as I watched the script's behavior I came to realize that there was a problem with the strategy of simply reading the mosix numbers from the directory tree.
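The iteration and path construction described above can be sketched in a few lines. This is a minimal illustration, not the original first script: the names in `@entries` are made up, and `dest_path` is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: build the destination path for one entry under the
# mfs mount point, or return nothing if the entry is not a mosix number.
sub dest_path {
    my ($mfs_mount, $entry, $relative) = @_;
    # only names whose first character is a digit are node directories
    return unless $entry =~ /^\d/;
    return "$mfs_mount/$entry/$relative";
}

# Made-up directory listing; a real one would come from readdir.
my @entries = ('.', '..', '1', '3', 'home', 'here');

foreach my $entry (@entries) {
    my $path = dest_path('/mfs', $entry, 'etc/mosix/mosix.map');
    print "$path\n" if defined $path;
}
```

Keeping the test and the path construction in one small sub makes it easy to swap in a different strategy later without touching the loop.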
A few weeks ago I started having trouble with one of the machines in the cluster (ralphzilla-faxa). I reconfigured it once, only to have it develop certain problems again. Rather than spending time with a third configuration, I decided to regard this as an opportunity to deal with the cluster in a situation in which one machine is not present. My thought was that this would tend to make the document more robust and dynamic, which is an aspect of the rationale for clusters in the first place. This script itself could be viewed in that context. As I watched the values stored in scalars change as the code executed in a debugging session, I realized that the script was reading the value "2" even though no "2" was displayed in the directory. ("2" is the mosix number for the absent ralphzilla-faxa. I've come to recognize that the mosix number represents the position of the ip address of the pertinent machine in the range of ip addresses defined as representing the cluster in the mosix.map file, at least in this fairly simple range. I'm not sure what model the number assignment might follow if the cluster were composed of machines with ip addresses in several discontiguous ranges.) Clearly, the mfs code must maintain some sort of placeholder for the missing machine that is not displayed when the machine is not present, but is visible to perl's readdir command. While it would be possible to trap this error, I decided that the desirability of restarting the processes on the machines to which the files were being copied was sufficient to shift the operative framework for the script's execution to remote execution.
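The relationship between a mosix number and a position in an ip range can be sketched as follows. This assumes a single contiguous range described by a starting mosix number, a starting ip address and a count; the addresses are made up, and `ip_for_node` is a hypothetical helper, not part of mosix.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: given the first mosix number in a range, the first ip
# address in that range and the size of the range, return the ip address for
# a mosix number, or nothing if the number falls outside the range.
sub ip_for_node {
    my ($first_node, $first_ip, $count, $node) = @_;
    my $offset = $node - $first_node;
    return if $offset < 0 || $offset >= $count;
    # treat the packed 4-byte address as one 32-bit integer and add the offset
    my $base = unpack('N', inet_aton($first_ip));
    return inet_ntoa(pack('N', $base + $offset));
}

# Made-up range: four machines starting at mosix number 1.
for my $node (1 .. 5) {
    my $ip = ip_for_node(1, '192.168.1.10', 4, $node);
    print "$node => ", (defined $ip ? $ip : 'not in range'), "\n";
}
```

Under this reading, a number such as 2 stays reserved for its address whether or not the machine is running, which would be consistent with the placeholder described above.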
I've written two scripts using remote execution to move the files and restart the services, one to run when using ssh and one that is meant to be used when using rsh. While the ssh version is complete, I'm not going to put it here until I've had a chance to run it. Although very similar to the following script in its general structure, it relies almost entirely on perl modules for functionality I've implemented with system calls in this version. That difference is intentional. One of perl's strengths is reflected in one of the perl community's favorite mottos, TMTOWTDI ... There's More Than One Way To Do It. As I build the scripts in this section it is my intent to vary the types of strategies the scripts employ, so you can get a notion of that variety. We'll see how effective I am. (grin)
rsh_cluster_update.pl
This script, in its entirety here, builds on what was developed in the version that attempted to simply copy the files through the mfs mount point. If you are new to programming, or extending what you know of programming into a new domain, extending from a known base is not a bad practice to follow. Start off with a simple version that you can verify works, and add the additional elements you wish to make functional. If you follow this model and have difficulty making a certain element work using a given strategy, you can generally take a different strategy on that component and plug it into the structure without rewriting the entire script. The dominant theme in programming is structures. These structures are reflected in data, in the different ways that information about items can be represented, and by patterns of actions that are expressed in a language. Language patterns are called algorithms, a word that shares its origin with the word algebra. Algorithm comes from the name of al-Khwarizmi, the ninth-century scholar who is credited with creating the framework of what we call algebra, the essential core of mathematics, as opposed to arithmetic; algebra comes from the title of his book on the subject. Algebra is dynamic; it describes relationships in a pattern, as opposed to simply adding up a series of static numbers. When you write a program you are taking a series of very simple steps, frequently no more complex than x+1 (add 1 to the value x), and assembling them into larger structures that themselves execute in accordance with some pattern. Both the little chunks and the large structures of little chunks are called algorithms when they execute a number of times until a desired state is reached. When an area on a screen is filled with a picture, an algorithm is generally executing to color the pixels on the screen one by one according to the input pattern that describes the picture. Algorithms are executing all around you as you exist in the world.
Watching a pattern of ripples in a flowing stream is watching the sort of dynamic processes that are described by algorithms, and if you could speed it up you would see algorithms in action as a plant grows. (There is a special set of algorithms called L-systems that, when iterated, exhibit exactly the kind of branching growth that characterizes many types of plant growth. The parallel is so striking as to suggest that this type of algorithm represents the way DNA expresses itself to generate certain types of growth. Not surprisingly, similar algorithms form the basis of the manner in which computer animations draw plant growth.)
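To make the idea of an iterated rewriting rule concrete, here is a minimal sketch of the simplest textbook L-system, Lindenmayer's "algae" rules. It produces strings rather than pictures, and `l_system_step` is a name chosen only for this illustration.

```perl
use strict;
use warnings;

# Rewrite every symbol of the string once, according to the rule table.
# Symbols without a rule are copied through unchanged.
sub l_system_step {
    my ($string, $rules) = @_;
    return join '', map { exists $rules->{$_} ? $rules->{$_} : $_ } split //, $string;
}

# Lindenmayer's "algae" rules: A becomes AB, B becomes A.
my %rules = (A => 'AB', B => 'A');

my $state = 'A';
for my $generation (0 .. 4) {
    print "$generation: $state\n";
    $state = l_system_step($state, \%rules);
}
```

Each pass applies the same tiny rule to every symbol, and the structure that emerges comes entirely from the repetition.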
The script itself is highly documented, so the manner in which the details of its implementation are executed should be relatively easy to follow. The overall structure of the script relies on the presence of the mfs interface to control its operation. The location of the repository for the configuration files that you want to distribute and the mount point for the mfs file system are both stored as scalars (memory variables) so if you configured your cluster with different locations for one or the other you can modify those to reflect your configuration. Note that the script requires that the Socket module be installed.
The following diagram displays the structure of the script, pared to the basic elements of its execution.

Since the script is extensively documented internally I'm not going to go into the details of the execution of any given command here. You can get that simply by reading the script. Instead I'll simply pull out chunks of the code and talk about what that piece is doing.
This section is really doing housekeeping, opening modules required for the script's operation and setting the values of scalars that hold global configuration information.
    use strict;
    use Socket;

    ##mfs mount point scalar
    my $mfs_mount = '/mfs';
    ##configuration file repository path
    my $source_dir = '/home/mosix/';
In this section I'm showing the script who's boss, grabbing the directory and reading what's in it.
    ##open a directory handle to the mfs mount point, stopping if it cannot be opened
    opendir(MFS_HANDLE, $mfs_mount) or die "cannot open $mfs_mount: $!";
    ##array to hold the directories under the mount point and a scalar to hold
    ##the current array member as the script iterates over the array.
    my @mfs_dirs;
    my $mfs_dirs;
    ##read the contents of the directory handle into the array
    @mfs_dirs = readdir MFS_HANDLE;
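One alternative way to read the mount point is to filter the listing as it is read. The following is a sketch under assumptions: `node_dirs` is a hypothetical helper, and whether the `-d` test is enough to skip the hidden placeholder described earlier has not been tried on an mfs mount.

```perl
use strict;
use warnings;

# Hypothetical helper: return the numerically sorted names under $mount that
# consist only of digits and that really are directories.
sub node_dirs {
    my ($mount) = @_;
    opendir(my $handle, $mount) or die "cannot open $mount: $!";
    my @nodes = grep { /^\d+$/ && -d "$mount/$_" } readdir $handle;
    closedir $handle;
    return sort { $a <=> $b } @nodes;
}

# Usage, assuming the mount point exists on the machine running the sketch:
# print "$_\n" for node_dirs('/mfs');
```

A lexical directory handle is used here instead of a bareword so the handle is confined to the sub.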
In this section the environment that will be needed in the loop is being created.
    ##the following scalar will hold the results of mosctl commands after they are written to files by the executed command
    my $response;
    ##this scalar holds the full path version of the restart command, relative to the root of whatever machine is pertinent
    my $restart = '/etc/init.d/mosix restart';
    ##scalar to hold the results of the test conducted during the array iteration
    my $testvar;
    ##the following 3 scalars hold the fully qualified names of the configuration files being distributed
    my $map_file = $source_dir.'mosix.map';
    my $hosts_file = $source_dir.'hosts';
    my $default_mosix_file = $source_dir.'mosix';
    ##the following three scalars will hold the destination file name when they are constructed within the loop
    my $dest_map_file;
    my $dest_hosts_file;
    my $dest_default_mosix_file;
    ##this scalar will hold the binary address of the relevant host
    my $addr;
    ##array to hold the information returned by the gethostbyaddr() function
    my @host;
    ##scalar to hold the hostname, first element of the array returned by the gethostbyaddr() function
    my $host;
This is the loop itself ... this stuff executes for each directory under the mfs mount point.
    foreach $mfs_dirs (@mfs_dirs) {
        ##$testvar is true if the directory name starts with a digit
        $testvar = ($mfs_dirs =~ /^\d/);
        ##if $testvar is true ...
        if ($testvar) {
            ##...stuff below here...
        }
    }
If the first character of the directory name is a number, this stuff executes, making sure that the machine is up.
If the machine is not up and in the cluster, the script continues on and deletes the file in which the status information is stored.
    ##it is possible that since the cluster controller (re)started mosix a machine that was in the cluster is no longer there,
    ##for whatever reason. to prevent that from breaking the script execution, we'll check to make sure that machine is up.
    ##execute the mosctl command with the isup argument and the mosix number of the machine, which is what the directory
    ##number represents, and write the result to a file
    system("mosctl isup $mfs_dirs > mos_stat");
    ##open a read file handle on the result of the operation above
    open(STATUS, '<mos_stat');
    ##read the top (and only) line from the file handle into the $response scalar
    $response = <STATUS>;
    ##if the machine is up ...
    if ($response =~ /yes/) {
        ##...section below here ...
    }
    close(STATUS);
    system("rm mos_stat");
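In the spirit of TMTOWTDI, the same check can be made without a temporary file by capturing the command's output directly with backticks. This is an alternative sketch, not what the script above does; `first_line_of` is a hypothetical helper, and `echo yes` stands in for the mosctl call so the example runs on a machine without mosix.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return the first line of its output
# with the trailing newline removed, or an empty string if there was none.
sub first_line_of {
    my ($command) = @_;
    my @lines = `$command`;
    return '' unless @lines;
    chomp(my $line = $lines[0]);
    return $line;
}

# With mosix installed this could read: first_line_of("mosctl isup $node")
# 'echo yes' stands in for it here so the sketch runs anywhere.
my $response = first_line_of('echo yes');
print "machine is up\n" if $response =~ /yes/;
```

This version leaves nothing to clean up afterward, at the cost of hiding the intermediate result that the file-based version lets you inspect while debugging.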
This is the section in which the script actually does what needs to be done. This part executes only if the directory name we are currently looking at has a digit in the first position, and the machine with that number as a mosix number is operating in the cluster. If you are a novice programmer, it includes what is probably the most technically challenging portion of the script, converting between forms of the internet address to determine the hostname of the target machine. The short version of what's going on here is that your machine uses a binary representation of the dot notation internet addressing we're all familiar with, and the script needs to convert the dot notation to that binary representation so the cluster controller can retrieve that hostname. I'd suggest that you consult the documentation for the Socket module if your interest in this area is sufficient that you desire a deeper explanation.
    ##determine the hostname of the machine by executing the mosctl command with the whois option, and write the
    ##response to a file
    system("mosctl whois $mfs_dirs > mos_host");
    ##as above, open a file handle on the resultant file and read in the only line there
    open(HOST, 'mos_host');
    $response = <HOST>;
    ##strip the trailing newline so only the address itself is converted
    chomp($response);
    ##convert the returned address to packed binary representation
    $addr = inet_aton($response);
    ##use that representation to get the host information
    @host = gethostbyaddr($addr, AF_INET);
    ##store the first element of that array into the $host scalar
    $host = $host[0];
    ##construct three scalars holding the destination files
    $dest_map_file = "root@".$host.":/etc/mosix/mosix.map";
    $dest_hosts_file = "root@".$host.":/etc/hosts";
    $dest_default_mosix_file = "root@".$host.":/etc/default/mosix";
    ##execute the copying operations
    system("rcp $map_file $dest_map_file");
    system("rcp $hosts_file $dest_hosts_file");
    system("rcp $default_mosix_file $dest_default_mosix_file");
    ##execute the restart command
    system("rsh $host $restart");
    close(HOST);
    system("rm mos_host");
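The address conversion is easier to follow in isolation. The following sketch uses the loopback address as a stand-in for whatever mosctl would return; the name the reverse lookup yields depends on the resolver configuration of the machine it runs on.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa AF_INET);

my $dotted = '127.0.0.1';

# dot notation -> packed binary: four bytes, one per number in the address
my $packed = inet_aton($dotted);
print "packed length: ", length($packed), " bytes\n";
print "bytes: ", join('.', unpack('C4', $packed)), "\n";

# packed binary -> dot notation again
print "round trip: ", inet_ntoa($packed), "\n";

# Reverse lookup; what comes back depends on the resolver configuration of
# the machine, so the name may well be undefined.
my ($name) = gethostbyaddr($packed, AF_INET);
print "hostname: ", (defined $name ? $name : '(no name found)'), "\n";
```

Checking whether the name is defined before using it is worth the extra line, since a failed lookup would otherwise produce a destination such as `root@:/etc/hosts`.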