Document Home


Previous: Rest of the Record, and System Logging and Fail-Overs

System Logging and Fail-Overs


Most people would say that a reasonable expectation for the availability of an important computer application would be continuous availability at a basic level without regard to available support, and almost continuous availability with only a basic level of support. The data entry script, requiring only that the web server be running and the script appropriately installed, fits that model. As long as the network hardware is functional, connecting the entry stations to the cluster, the primary suspect in any system failure would be a lock up of the cluster controller / web server, or an abend of the server software. While each of these are very unlikely events, each could be resolved by restarting the machine, something that most people could handle. (Realistically, if one of the cluster machines begins to exhibit periodic lock-ups, the machine should be examined for hardware failures and probably replaced. This is a problem attendant to using surplus hardware, but hey, if you only paid 25 USD for the machine, that's not a big deal, right? That's one of the reasons I've been so involved with assuring redundancy within the system...the most critical problems would involve failure of the cluster controller or database controller. Interior cluster machines could easily be configured with the software required to allow them to be plugged into those roles with only minimal modification of system configuration files. (I'll get into procedures for doing that at a later point. The focus of this section is somewhat different.)



Removing the susceptibility of the entry script to difficulties attendant to interaction with the database server was, of course, one of the rationales for implementing that system function as a separate script. In effect, I've de-coupled that portion of the application to run on its own, and as we've seen that works quite well. However, the fact that the script is running in the background does create one problem ... it can fail in the background and no feedback other than the voluminous amount of detail recorded in the log created by the DBI trace facility will be provided to allow the context of that failure to be reconstructed. The DBI trace log is fine for detailed analysis, but it would be nice to have a log that would record the time and date when a key problem surfaces and a brief description of the nature of the problem. Such a log could be used by an educated user to resolve many if not most problems without requiring the presence of higher-level support personnel. That's the type of thing I'm going to start creating here, starting small and adding to the logging facility as we proceed. At some point it is likely that logging facilities will be added to a custom module of application object that I roll up out of what we're doing here. But that's down the road a bit, for now let's just take the next step in our journey of a thousand miles. (A soft gong can be heard in the distance as we gaze serenely at the rock-strewn path upon which we tread. Almost unconsciously, we hear the soft murmur of the steam that parallels our path.) <grin>.


First, let's look down through the script at what could go wrong. I've done a pretty good job of making sure that the script allows for graceful recovery from a failed database connection. The repository I've created both allows the script to process working files from the directory in which they are stored and store them in a format that allows for rapid insertion into the database. There is, however, no fail-over procure for the actual insert statement, nor is there any sort of procedure for appropriately dealing with problems associated with creating the repository file. Come to think of it, I don't have an error condition in the main script that could gracefully deal with errors associated with writing the recs files ... another enchancement for the next iteration.



On to the new version of the script. When I said that this incarnation of the scripts make the biggest step in complexity to this point it was this script that I had in mind. The number of lines in the script have almost tripled, and while a large number of those lines represent the internal comments in the script the size of the increase represents a fair measure of the increase in the script's complexity. This version is intended to take full advantage of the distributed file system within the cluster, with options specified for the locations of the system log and of the repository to which records will be written if the database connection fails. It could be argued that this represents overkill, that simply making a log entry to the effect that a database connection could not be established and exiting the script would be sufficient. My response would be to concede that this is a valid point, but I have two reasons for going forward. First, it has been my general experience that if you leave a potentially system-threatening hole in place there will come a time when it threatens the system. In this context that potential hole is the fact that writing the output files to their temporary storage location will eventually fill that drive space, and that would lead to the loss of data. Therefore, these steps are appropriate. Second, one of my rationales for the development of this system in the first place is to develop techniques and explain their implementation. Ultimately, it is my call, and I decided to do it this way <grin>.



A high level overview of this script would have at its heart the same elements as the previous version ... the script reads a directory of text files and stores the records contained therein into a database. In this version, however, I have added a couple of layers of top of the the base logic. In recognition of the fact that this script will be running in the background, unattended, I'm starting to add the capability for the script to respond to the state of the environment. The primary strategy underlying this approach involves specifying a set of locations for the system log and repository files, and then checking these locations for available space and the ability to write to the file before that location is accepted. The script takes a cautious, conservative approach in that if there are problems associated with either the log file or the repository file it simply shuts down, leaving the source files intact. In other words, the script is configured to not take any chances that might lead to the disappearence of entered data ... if something is interfering with the ability of the script to leave traces of its actions or to write records to a repository if the database connection fails, the script will simply quit. The underlying problems can then be addressed, and when it next runs the script will process the files in the source directory and continue on.



One aspect of the system that adds considerably to the complexity of this script is the use of the mfs file system to write the files to drives elsewhere in the cluster. This adds considerably to the overall flexibility of the system, but as standard system tools to determine the characteristics of those drive spaces do not currently function appropriately through the mfs mount point I have had to take an alternative route to that information. The resolution I have implemented involved running remote shell commands on the cluster machines in question. I will get to that in more detail shortly.



To execute these remote shell commands there are a few preparatory steps that must first be taken. The first is simply to specify the names of the machines that are used as repositories, in this case the database server, ralphzilla-raider, and the cluster controller, ralphzilla.

my $controller='ralphzilla';
my $db_server='ralphzilla-raider';
(The third location, the host on which the script is being run, need not be specified because that is implicit in the way the directory is constructed.) This particular step is not particularly necessary, as I'll explain further below, but I've left in it to this point because it made sense to me to explicitly specify the names of the machines. It could be dropped, but it is not a big deal to customize the script to the environment and in this circumstance I'm biased in favor of knowing where things are being placed. Any of the machines in the cluster could be used in this fashion: my rationale for choosing the database controller as the primary location lies in the fact that it is also the primary source of available storage in the cluster. The cluster controller and the script host are options because if the database controller has failed these two machines are almost certain to be available. (If the cluster controller isn't there the web interface will not be available and the source files not visible, and if the script host is not running we do not really have to worry about what the script is doing to do, right? The logic is kinda unassailable. <grin>)



A relevant aside at this point is to mention that this script is relatively more hard-coded than most of what I've done to this point. A good example of this is in the free_space subroutine, in which I open the file containing the output of the df() system command and explicitly step to the relevant line rather than using a pattern match to do the same thing.

		open(FREE,'/home/www/free.txt');
		$line=<FREE>
		$line=<FREE>
While I may change some of that in later versions, I doubt that I will change this specific bit of code. If you think about it a little my rationale may be obvious - the format of the output from the df() command is utterly predictable. Doing it this way offers the opportunity to readily construct an error trap triggered if the content of the line is somehow different from what it should be. While it is possible to express the same steps in a function that includes a pattern match, in my mind that would represent adopting a far more complex solution than is required to get the job done. Doing things like that adds points of failure to an application to no real end beyond gratuitous self-gratification, and that's what japhs are for. Again, this is only my opinion, but I think that elegance can be measured on levels that transcend the immediate. This particular snippet may be ugly, but it is fast, it works, and it offers ready incorporation of additional functionality. All in all, I regard such things as appropriate measures of elegance. But then, maybe you look at things differently than I do <smirk>.



In any event, as I indicated above the current script assumes that the cluster controller and database server represent appropriate targets for these files. It has been the operative strategy of the versions of this script to access filesystems not on the host running the script though the mfs mount point. Since mfs mounts the filesystems of the relevant machines in a tree keyed to the mosix numbers of the respective machines, the script has to be able to associate the machine name with its mosix number. In this version the script checks for a file named numbers in the /home/www directory, and if that file does not exist the number_config subroutine is executed to create it. The number_config subroutine executes the mosix mosctl utility with the whois parameter, which returns the mosix number of the relevant machine when it is fed the name of the host. The output fom the two mosctl commands is redirected to the numbers file. When the script reads in the number file it splits each line on whitespace and the number is pulled from the string of which it is a part, associated the result with either the $controller_num or the $server_num scalar.

my ($t11,$t21,$t31,$t41);
($t11,$t21,$t31,$t41)=split(/\s/,$line);
$line=<NUMBERS>
my ($t12,$t22,$t32,$t42);
($t12,$t22,$t32,$t42)=split(/\s/,$line);
my $controller_num=substr($t41,1,1);
my $server_num=substr($t42,1,1);
Obviously, this is one of the hard-coded sections to which I referred earlier, and the ambivalence I have expressed earlier about how much sense it makes to worry about generalizing it applies here as well. It's appropriate to note that this could be done the other way around, as it were. When the script starts dealing with the specified directory locations we could extract the mosix number, and use that with the mosctl utility to return the address of the relevant machine. If I were worried about constructing a more generalizable script, this is the route I would take.



Once I have set up a mechanism for associating potential repository and log hosts with their mosix numbers, I create two arrays that hold sets of directories; @repository_destination and @log_destination.
my @repository_destination=("/mfs/$server_num/data/bb_system/",
							'/home/www/',
							"/mfs/$controller_num/home/www/");
							
my @log_destination=("/mfs/$server_num/data/bb_system/",
					 '/home/www/');
At this point the script runs through two loops that recurse through the two arrays and check to make sure that a minimum amount of space exists on that filesystem and that the script can create a filehandle on that filesystem. (The second check would trap such circumstances as the mfs interface becoming disabled on the target machine.) As you look through these loops you will realize that the are virtually identical, and as such should be combined into one. When I initially worked through this script I intentionally left the loops independent to facilitate flexibility if I decided it appropriate to adopt different traps or have the script execute intermediate actions within the loop. As I've not found any reason to do something like that, I can safely integrate them in the next version of the script.



The loops are controlled by a scalar, $loop, that the script initializes with a non-zero value.

while ($loop)	{
	
	##grab the next element from the array
	$l_dir=shift @log_destination;
	
	##if that element is undef, it means that we've gone past the last element in the array.  since the script would have 
	##continued onward if a viable log file handle creation had been achieved (both open and with greater than 100k available
	##on the filesystem), that means the script can't create a log.  therefore, it quits.
	if (! $l_dir)	{exit;}
	
	##read the available space on the location specified in the current array element
	free_space($l_dir);
	
	$log_file=$l_dir.$log_name;
	
	##use the eval() function to either create the filehandle, or not
	$log_fh=new FileHandle($log_file,'+>');

		##if the open statement worked and there is at least ca. 98k available, store 1 to $exit, which will lead to loop exit
	if (($free>$min) and ($log_fh))	{
		
		$loop=0;
		
	}
	
}
On each iteration through the loop the next value is pulled from the array (the shift operation) into the scalar holding the directory being evaluated. If the end of the array is reached the shift operation returns undefined. In the next line the script checks the scalar holding the directory for an undef, and the script is exited if that is the case. The following file calls the free_space() subroutine, which returns the amount of disk space available on the target filesystem. (I'll get to the operation of that one in just a bit.) The script then creates a scalar holding the fully-specified target file name and creates a new FileHandle object with append rights. The use of the FileHandle module is integral to the operation of the script as it is currently written. As I discussed back in the section on cluster management scripts, an mfs filesystem mounted within the cluster has a degree of visibility to perl even when mfs connections are disabled on the target machine. During the development of this script I discovered that it is possible to successfully create standard perl filehandles and Filehandle::Deluxe objects on mfs-mounted filesystems for which mfs access is disabled, and even write to the object, as long as filehandle buffering is enabled. Obviously, the script has to know that it is actually possible to write to the target file before it is accepted as a viable destination. I was in the process of writing a subroutine to disable buffering and attempt writing to the object when I discovered that the creation of a FileHandle module object exhibits the appropriate behavior, and predicated this part of the script on its use.



The next line in the script is an if statement that checks to make sure that the amount of free space on the filesystem is greater than the defined minimum and that filehandle creation. If both of those conditions are true, the value 0 is assigned to the $loop scalar. In this manner the pessimistic prejudice of the script is enforced. There are only two ways out of the loop: either a viable filehandle is established on a filesystem with sufficient available space or the end of the array is reached and the script is terminated.



Note that the value assigned to the minimum file space requirement is arbitrary, but it should be set to a reasonable minimum. As available file space is checked on each execution of the script, all we really have to worry about is the space that is required to write repository or log records for one invocation of the script. We are currently storing a maximum of 82 bytes of entered data plus 30 bytes of delimiters per record, therefore the maximum size of any file stored (if an eight-form display screen were used) would be just short of one Kb. To put this in context, if five people were entering records of the maximum size and averaged one eight-form screen per minute (an impressive rate of entry), running the script at five minute intervals would limit the space requirement for a single invocation to less than 25 Kb. Therefore, the 500 Kb. used here as a minmum for the repository location is a generous minimum. At the same time, if the target application is on a key mount point on the target machine, such as /var, filling the system to its capacity could threaten the operation of system services, so it makes sense to allow slack. Space requirements for the log file are of course even lower; I have arbitrarily set the minimum space requirement for that location to 100 Kb.



An important component of this subroutine is provided by the free_space() subroutine, which is used to determine the amount of space free on the target filesystem. If you look to the beginning of the script you will see that I have brought the module Filesys::DiskFree into use. This was to be the lynchpin of the test for free space, and the method worked ... except it didn't return the correct result. Further investigation revealed that when used to read the characteristics of a location under the mfs mount point the module returns the amount of disk space available on the filesystem on which the root filesystem on the host is mounted. An mfs glitch. (At the risk of belaboring a point, this is the nature of the development process, and reflects the necessity of remaining flexible.) Given that I needed to determine the amount of space available, it became clear that I needed to write something to do that job. The free_space subroutine was the result.



In the first two lines of the subroutine I read the directory passed to the subroutine as a parameter and split it on the slash character ("/").
my $dir=shift @_; my ($t1,$t2,$t3,$t4)=split('/',$dir,4); As the mfs mount point is mounted under the mount point /mfs on this system, if the second element of that result is "mfs", then we have to deal with getting that value from a mosix machine. (Since this string begins with the delimiter specified for the split ("/"), split assumes that the first element is null, or empty, which is why I specify the second element. If I had specified that the split be on the substring of $dir starting with the second character, thus dropping that initial "/", the pertinent element would be the first element. Six of one, half-a-dozen of the other.)


Within that chunk of code that executes if this directory is under the mfs mount point, the first step is to assemble a scalar holding the fully-qualified file name relative to the root directory of the host on which the file resides.

	if ($t2 eq 'mfs')	{
		
			my $path='/'.$t4;

If you take a look at the way the split function is configured in the paragraph above you will see that I limited the output from the split to four elements. As a result, the fourth element in a filesystem mounted under the mfs mount point will represent the fully-specified path relative to the root directory of the filesystem on the target machine. Stick a slash ("/") on the front of it to avoid confusion and you are good to go. (Note that this would not work if you were to do something really strange and put the mfs mount point as a subdirectory of another directory. The split would not return what it is supposed to. If I wanted to allow for something like that I'd probably assemble a pattern match for the string held in the $mfs_mount scalar, anchor at that point, and split the rest of the string. But that's what you do when you do not have control over the environment in which you are operating. As you have the freedom to specify this element of that environment, you should limit the number of complexities that are introduced purely as a result of that specification. In other words, don't shoot yourself in the foot. Save the complicated solutions for the situations in which you need to use them, if for no other reason than that solutions require processing power that corresponds to their complexity, and you should save that power for the situations in which you really need it.)


Regardless, at this point I need to find out the name or address of the target machine. In this specific circumstance I use a simple if statement associating the directory numbers with the mfs numbers stored in the scalars populated from the numbers file, and assign either the name of the database server or the name of the cluster controller in the $rhost scalar.

		if ($t3==$controller_num)	{
			$rhost=$controller;
		}
		elsif ($t3==$server_num)	{
			$rhost=$db_server;
		}
As I said earlier, this could be done the other way around ... I could have traversed the mfs directory tree, executing the mosctl whois statement with the mosix numbers to which elements of the tree are tied, and getting back the tcpip addresses of those machines. Those addresses could be stored in an external file, and read in and parsed much as I have done here. I could then have associated mosix numbers and tcpip addresses in a hash to facilitate lookups. As time goes on and I continue to extend this system I may add something like that to a general utility library and apply it in this context, but for now I'll allow it to stand as it is.



Once the relevant remote host has been established, the script stores a string representing the actual command to be executed in the scalar $command.
		my $command="df $path > /home/www/free.txt";
My use of a temporary scalar in this fashion is primarily to facilitate script legibility, and in this particular situation I think that the gain in legibillity warrants the small performance penalty occassioned by the creation of the temporary scalar. The script then executes the remote shell on the specified host with the specified command,
		system("rsh $rhost $command");
which stores the output in the defined location on the script host. As described above, the script opens the file, explcitly steps to the appropriate line, and stores the relevant element to the scalar $free.



If the target filesystem is local to the script host, however, ("$t2 ne "mfs""), however, the $free scalar is assigned the output from the avail() method of the $df Filesys::DiskFree object.

		$free=$df->avail($dir);
Obviously, that is a lot easier and faster than executing a remore shell command and retrieving the results, but you gotta do what you gotta do.



Back in the main body of the script I create a new FileHandle object on the concantenation of the scalar holding the string representing the target directory being evaluated and the scalar holding the file name, the result representing the fully-qualified file name from the root directory of the script host.

	$repo_fh=new FileHandle($r_dir.$repo_name,'>>');
The instantiation of this object with create and append rights actually creates an empty target file if the target file does not already exist. (The term instantiation is object-speak for the process of creating a new incarnation of some kind of object; i.e., creating a new specific instance of that object class. Kinda makes sense, doesn't it? If you are looking at object stuff relating to whatever computer language and run into strange words you can generally get a good sense of what they mean just by thinking about it a little. One of my favorites is vivify and auto-vivify. To vivify is to "bring life to", to bring something into existence (IT'S ALIVE!). When used in object parlance, the term seems to represent the creation of the set of data structures that underlie a given object implementation ... things like hashes and arrays. Remember, the fact that you are not familiar with a term does not mean that you are not familiar with that to which the term refers. The name is not the thing named. This is especially true in growing and evolving fields in which words are invented as shorthand methods of describing bundles of actions, structures, or a combination of the two.)



After instantiating the new Filehandle object, an if statment is evaluated that will only return true if the amount of free space on the target filesystem exceeds the minimum and the instantiation of the filehandle was successful.

	if (($free>$r_min) and ($repo_fh))
In this manner the pessimistic bias of the script is enforced. The script will continue to execute only if viable locations for both the repository and log files can be accessed and sufficient space is available on those locations. Within each loop, a successful evaluation of this condition results in the $loop scalar being assigned the value 0, which results in the termination of the loop. An intermediate approoach might involved making the repository test conditional upon failure of the database connection ... I may implement that in a future version of the system.



Now one might think that this is pretty much it, that now that I know that the repository and log files can be created and written to, all I have to do is print to the filehandles, right? After all, I've written to the repository file in previous versions of the script, and adding the capability to output to a second file should be almost trivial. HA! As you will see shortly, that was not the case.



But first, let's take a quick look at the little subroutine that actually writes the error log.

sub err_print	{
	
	my $error=$_[0];	
	my @time=localtime;
	my $year=$time[5]+1900;
	my $month=$time[4]+1;
	print $log_fh $time[2].':'.$time[1].' on '.$month.'/'.$time[3].'/'.$year.':::'.$error."\n";
}
As you can see, the subroutine is pretty simple. The first line of the subroutine simply reads the string that was passed to the subroutine and stores it into a local copy of the $error scalar. 'll explain the reason I'm doing that shortly. In the next line I read the output of the localtime function into the @time array. localtime() returns a list of elements corresponding to the system time maintained on the machine, adjusted for the local time zone. In order, the elements in the array are:

seconds
minutes
hours
day of the month
month
year
day of the week
day of the year
daylight savings time flag

Two things are worthy of note here. First, the year as returned by this function uses the year 1900 as the base year 0. Therefore, representation of the year in the form most people expect requires the addition of the value "1900". Second, the month is represented as a value in the range from 0 through 11, thus requiring the addition of 1. (Do you think of January as the zeroth month?). The third and fourth lines of the subroutine create scalars that incorporate those adjustments. Following these conversions I use those elements to write a line to the log file that includes the date, time, and the error message.



Now the difficulty I ran into with this version of the insert script has to do with the fact that perl will flatten out data structures like filehandles when they are accessed from outside of the scope of the script in which they were created. This did not cause a problem with the slightly simpler version of the script that preceded this incarnation, but in this version once I descended into a subroutine my ability to reliably to reliably step through each line in a file referenced by a filehandle effectively disappeared. I spent chunks of time over a two or three day period trying workarounds like re-winding the input filehandle with the seek() command before finally conceding that revision of this portion of the script was required. If you want to get a good picture of what is happening here, start your local X-server and invoke the script under the ptkdb debugger. I've left the file recs_insert_3_orig.pl in the samples file, which uses the original filehandle method of stepping through the lines in the input files and which is set up to automatically invoke the ptkdb debugger. Set a breakpoint at line 229 and, if the database server is running, one at line 258. If the database server is not running the break point should be set somewhere around line 296, within the write_recs subroutine. Enter "<RECS>" (without the quotes, of course) in the expression box. Now click "[Run To]". When the script next pauses, you should be at line 229 and in the watch window on the right-hand side of the screen you should see the lines of the input file under <RECS>, as shown in the illustration to the left.

Now click [Run To] again, and the script should execute to the next breakpoint you set, displaying a screen such as that to the right. You will probably now see something like "ARRAY" with a number after it associated with <RECS> in the watch window, although on the first time through you may actually see the array displayed just as it was in the calling scope of the script. That's why this version of the script will not work ... at this point all that is visible is the identification of the array that holds the filehandle contents, not the array itself.



Now, as you might imagine, perl has a number of ways to go about resolving this, ranging from storing filehandles in scalars and accessing the filehandle through the scalar to passing the subroutine a reference to the filehandle. The two methods are different ways to doing the same thing, which you could think of as pointing at the remote namespace location of the filehandle array. I am sure that I'll be getting into techniques like this in other contexts, most specifically in setting up references to arrays. What I'm going to do here is adopt a somewhat different approach by explicitly reading the contents of the pertinent file into an array, and passing the array to the subroutine. This is the most basic way of passing information around within a perl application, and is well-suited to this particular circumstance. Indeed, this is the basic method of communicating between software constructs written in whatever language, whether those constructs are called subroutines, functions, or methods. If you think about it a little bit, however, you will probably be able to come up with the types of situations in which this strategy is not particularly well-suited. What I am doing here is creating a second data structure and using that in the subroutine. That's fine when the maximum size of any given input file is eight lines of modest length, but what if the input file is large? Clearly, you do not want to pass a 10 Mb. array to a subroutine. That is when you want to use a fancier form of referencing a data structure. But I'll get to that in due course.



The implementation of the new strategy begins on line 227, on which the contents of the file referenced by RECS is read into the array @recs.

my @recs=<RECS>;
The array is then passed to either the write_recs subroutine in line 232
write_recs(@recs);
or the insert_recs subroutine in line 236
elsif ($conn)	{insert_recs(@recs);}.
Within the subroutines I pull the array from the argument space with the statement
@recs=@_;.
Rather than processing each line in a file, of course, the script is now processing each element in an array. Therefore, the while (<RECS>) loops in the subroutine are replaced with foreach loops.



Really not all that big a deal, don't you think? It took me significantly longer to explain what is going on than it took me to actually make the change. There is, however, one other thing in the new version of the script that I want to point out. If you look at one of the lines that call the err_print() subroutine, such as line 279,

err_print($error);
you will note that I have started to explicitly pass the $error error message scalar to the subroutine. Note that I pull the passed message into a local (scoped to the subroutine) copy of the scalar with the command
my $error=@_[0];.
If you assign an array to a scalar the scalar will hold the number of elements in the array ... this is just the way that perl works. Despite the fact that the arguments array in this context has only one element, that element has to be explicitly referenced when assigning its value to a scalar. The command $error=@_ would simply assign the value 1 to $error.



That's about it for this incarnation of the system. In the next I will tie up a few odds and ends and tidy this script up a bit.



Document Home


Previous: Rest of the Record, and System Logging and Fail-Overs