MOSIX Configuration


Installation and Mosix.map
At the base level, installing Mosix is relatively straightforward: simply use apt-get to install it. Slightly more efficiently, the command "apt-get install mosixview" will install the mosixview monitoring utility and all of the packages on which it depends, which include Mosix. Once Mosix is installed, the key file defining the extent of the cluster is /etc/mosix.map. As the manpage for Mosix relates, the mosix.map file defines the cluster in terms of ranges of IP addresses. The manpage has a couple of examples that are worth looking at, but since you're probably more familiar with the structure of Ralphzilla by now, I'll show you how the map file reflects the structure of this cluster.


The /etc/hosts file from Ralphzilla includes the following records for the machines that comprise Ralphzilla:

10.1.93.47 Ralphzilla
210.1.1.2 Ralphzilla
210.1.1.3 Ralphzilla-faxa
210.1.1.4 Ralphzilla-raider
210.1.1.5 Ralphzilla-free
210.1.1.6 Ralphzilla-b-free

The 10.1.93.x network is the local subnet of the public network on which Ralphzilla exists. (Some would argue that using "public network" in this sense is a misnomer, since it is separated from the Internet by a firewall and the first section of the address clearly indicates a network designated as private. In the context of this discussion that distinction is largely semantic: the 10.1.93.x subnet connects to a network of tens of thousands of users over a geographically dispersed area. Compared to five machines connected to the same hub, I'd say that qualifies as a public network.) The 210.1.1.x network represents that one hub. I assigned that address arbitrarily when I started to set up the cluster. While it is not one of the address ranges defined as reserved for private networks, the fact that Ralphzilla is not configured to route between the two networks renders that point almost moot. The only context in which this would cause a modest problem would lie in name resolution requests from Ralphzilla itself.


I can see only two areas in which that could conceivably represent a conflict: 1) apt-get operations and 2) potential confirmatory e-mails from Ralphzilla to fax senders, or the routing of incoming fax transmissions to e-mail recipients. Neither case represents a problem in this instance, because neither the Debian mirrors I use nor the e-mail gateway used by the State of Indiana is in this range. While I would assign these addresses differently if I were starting over, it is not important enough for me to go back and change them. Let this represent an object lesson to you: to be absolutely free from the possibility of conflict, you should use addresses from the ranges reserved for private networks (10.x.x.x, 172.16.x.x through 172.31.x.x, and 192.168.x.x). For example, given that the larger network is defined as 10.x.x.x, the cluster addresses could have been assigned from the 192.168.x.x network.
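
If you want to check whether an address you are about to assign falls in a reserved private range, a few lines of Python will do it. This is only a sketch using the standard ipaddress module; the function name is_rfc1918 is my own invention for illustration, and 192.168.1.2 is a hypothetical address, not one used anywhere on Ralphzilla.

```python
import ipaddress

# The three IPv4 blocks reserved for private networks by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the dotted-quad address lies in a reserved private block."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    # 192.168.1.2 is a hypothetical example address.
    for addr in ("10.1.93.47", "210.1.1.2", "192.168.1.2"):
        print(addr, "private" if is_rfc1918(addr) else "NOT private")
```

Run against the hosts file above, this flags 10.1.93.47 as private and 210.1.1.2 as not private, which is exactly the mistake described in this section.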


In any event, once the cluster was in place my only method of contact with the machines in the cluster was through Ralphzilla, by telnetting to Ralphzilla and from there telnetting to cluster machines. For that reason I wanted the addresses of each of Ralphzilla's network cards in its hosts file. However, since Mosix also uses the addresses in the hosts file, that complicated matters slightly for the map file. It would, at first glance, seem that the mosix.map file for a simple cluster such as this could be as basic as:

1 210.1.1.2 5

representing a Mosix number of 1 for the first node in the range (Ralphzilla), the IP address of that node, and the number of machines in that range, in this case the entire cluster. If you were to start Mosix on Ralphzilla with that mosix.map file, however, you would get a message saying that "my address (10.1.93.47) is not in the range". This complicates the construction of the map file only slightly, in a way that you will likely find illustrative should you desire to build a cluster with a more complex topology. In effect, the Ralphzilla cluster includes two subnets, with only Ralphzilla on one and the other four machines on the other; the second card in Ralphzilla is defined in an ALIAS line and represents the bridge between the two. The mosix.map for the Ralphzilla cluster thus looks like this:
1 10.1.93.47 1
1 210.1.1.2 ALIAS
2 210.1.1.3 4
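
To make the meaning of those three columns concrete, here is a rough Python sketch that expands map lines into the individual nodes they describe. It is not part of Mosix and is not how Mosix itself reads the file; it only follows the format as described above (node number, first address, count or ALIAS). The function name expand_map is hypothetical.

```python
import ipaddress


def expand_map(text):
    """Expand mosix.map-style lines into (nodes, aliases).

    nodes   maps Mosix node number -> IP address string
    aliases maps alias IP address  -> Mosix node number
    """
    nodes, aliases = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        number, address, count = line.split()
        first = int(number)
        if count.upper() == "ALIAS":
            # An alias line gives a second address for an existing node.
            aliases[address] = first
            continue
        base = ipaddress.ip_address(address)
        # A range line covers `count` consecutive addresses and node numbers.
        for offset in range(int(count)):
            nodes[first + offset] = str(base + offset)
    return nodes, aliases


RALPHZILLA_MAP = """\
1 10.1.93.47 1
1 210.1.1.2 ALIAS
2 210.1.1.3 4
"""

if __name__ == "__main__":
    nodes, aliases = expand_map(RALPHZILLA_MAP)
    for number in sorted(nodes):
        print(number, nodes[number])
    print("aliases:", aliases)
```

Fed the three-line map, the sketch yields node 1 at 10.1.93.47, nodes 2 through 5 at 210.1.1.3 through 210.1.1.6, and 210.1.1.2 recorded as an alias for node 1. Fed the single line "1 210.1.1.2 5", it yields nodes 1 through 5 at 210.1.1.2 through 210.1.1.6.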


HOWEVER
As I was going about writing the following section, and thinking about the fact that I was going to have to firewall the cluster from the external network, I realized that having the 10.1.93.47 address within the cluster just wouldn't do, because the most likely location for the firewall is between the two cards. While I'm not yet at the point of building the firewall, it was clear that as it existed my configuration was going to create problems. I started thinking about what my options would be for keeping things on the appropriate sides of the fence, and it seemed to me that there was a good chance that the only reason Mosix regarded 10.1.93.47 as the address for Ralphzilla was its position as the first address for Ralphzilla in the hosts file. I switched the order in which the two cards are listed in the file, and the single-line version described above worked just fine. Mosix doesn't complain at all. Just goes to show, huh? I love it when a plan comes together. Can't wait to see when this one is going to bite me.
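
The effect of the ordering can be illustrated with a small Python sketch of a "first match wins" lookup. To be clear, this only mimics my guess about how the name was being resolved; it is not Mosix code, and the function name first_address is made up for the example.

```python
def first_address(hosts_text, hostname):
    """Return the address on the first hosts line that names `hostname`."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop any trailing comment
        fields = line.split()
        # fields[0] is the address; the remaining fields are names for it.
        if len(fields) >= 2 and hostname in fields[1:]:
            return fields[0]
    return None


ORIGINAL_ORDER = """\
10.1.93.47 Ralphzilla
210.1.1.2 Ralphzilla
"""

SWAPPED_ORDER = """\
210.1.1.2 Ralphzilla
10.1.93.47 Ralphzilla
"""

if __name__ == "__main__":
    print(first_address(ORIGINAL_ORDER, "Ralphzilla"))
    print(first_address(SWAPPED_ORDER, "Ralphzilla"))
```

With the original ordering the lookup lands on 10.1.93.47, which is outside the single-line map's range; with the two lines swapped it lands on 210.1.1.2, which is inside it.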

LATER: In the final analysis, I did change the IP addresses for the Mosix cluster machines. They now occupy the range from 192.1.1.2 to 192.1.1.6, and the mosix.map file reflects the change. (Strictly speaking, only 192.168.x.x is reserved for private use, so 192.1.1.x is not a private range either.) As a little quiz for those who are in the process of learning, just what would that file now be? This is an easy one.

Mosix Defaults File
The man page for Mosix states that the method for controlling the behavior of the Mosix environment on any given machine is to write values of either 0 or 1, for off and on, to a series of files in the /proc/mosix/admin directory. While this still offers the most finely-grained control of individual machines, control of the basic characteristics of how Mosix is configured to run on that machine can be achieved by editing the file /etc/default/mosix. As you can see, these flags control whether the machine is a mosix node, whether processes are allowed to migrate to other machines, whether processes are allowed to migrate from other machines to the subject machine, and whether the MFS file system is enabled. The remaining adjustable parameters configurable within the /proc/mosix/admin directory are primarily involved with controlling the environment in relatively more unusual circumstances, such as when tuning the kernel or adjusting the number of hops the Mosix cluster will span. Note that the parameters configurable in the defaults file must be set there. The startup will overwrite values in the /proc/mosix/admin directory in accordance with what is in this file. Note that the only value that has been changed in this file is that which controls whether the MFS file system is to be used.
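
For the /proc/mosix/admin approach, writing a flag amounts to writing "0" or "1" into a file. The Python sketch below shows the idea against a scratch directory so it can be run anywhere; on a real node the directory would be /proc/mosix/admin and you would need root. The helper names set_flag and get_flag and the file name example_flag are hypothetical placeholders, so check the Mosix man page for the actual file names.

```python
import tempfile
from pathlib import Path


def set_flag(admin_dir, name, enabled):
    """Write 1 (on) or 0 (off) to the named flag file in admin_dir."""
    Path(admin_dir, name).write_text("1\n" if enabled else "0\n")


def get_flag(admin_dir, name):
    """Return True if the named flag file currently holds 1."""
    return Path(admin_dir, name).read_text().strip() == "1"


if __name__ == "__main__":
    # A temporary directory stands in for /proc/mosix/admin here.
    with tempfile.TemporaryDirectory() as scratch:
        set_flag(scratch, "example_flag", True)   # hypothetical flag name
        print(get_flag(scratch, "example_flag"))
        set_flag(scratch, "example_flag", False)
        print(get_flag(scratch, "example_flag"))
```

Keep in mind that for the parameters covered by /etc/default/mosix, anything written this way is overwritten at startup, so it is only useful for temporary changes.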


Mosrun
The mosrun utility is used during configuration primarily to specify which daemons should not be allowed to migrate from the machine on which they originate. The general advice regarding which processes should be restricted to their home node seems to be that daemons that establish the operation and environment of the node should probably be kept local. (The /etc/inittab file I've put together for a generic cluster box is here.) That makes sense, if you think about it. While the cluster might be able to function if the nmbd daemon from ralphzilla-free had migrated to ralphzilla-raider, for example, it would take a context so extreme and unusual for that to represent efficient operation that the context would probably never occur, and even if it did it would be fleeting. Similarly, the postmaster daemon on the database server in the cluster, ralphzilla-raider, should almost certainly be locked down to that machine. The only context I can envision in which it would make sense to allow it to migrate would be when database access is expected to be very infrequent, in which case one might allow it to migrate so the more powerful processor in that machine could be used by another process. In such a circumstance, however, it would seem more rational to put the database server on a less powerful node and lock it down there. Regardless, these are the considerations that should be brought to bear when assessing when to lock a process to a given machine with mosrun.


The discussion above does bring up a point that I hope to address as I experiment with the cluster. That is, I'm not at this point sure whether processes like the samba daemons are required, optional, or irrelevant on the cluster nodes beyond the controller. Since the shared directory is mounted under the /mfs mount point, and visible to all machines in the cluster, my suspicion is that they are not required. Recall, however, that my strategy is to install a working system, and then pull stuff out until it breaks or slows down significantly. We'll find out the answer to some of these questions as we go along.


Once the mosix.map, /etc/default/mosix, and /etc/inittab files are configured to support your configuration, you should be able to start Mosix on the machine in question by issuing the command "/etc/init.d/mosix start". The Debian installation will have configured your machine so that this script is also run at system startup.

