I'm going to launch into our application development effort by beginning with a fairly simple example, to which we will add levels of detail and enhanced tools. This is a good model to follow when starting a new project in uncharted territory, and is a good approach even when you are on more familiar ground. If you can break your applications into chunks, and implement solutions to those chunks, you can aggregate groups of chunks that you know work under a larger-scale umbrella. If something about that structure doesn't work, you'll be able to isolate the problem to the construction of the umbrella or the communication between elements incorporated into the structure of the umbrella. A great secret in software, one that not many people outside the field realize, is that tools that are supposed to work in a given fashion frequently don't! If you find yourself running into a situation like this that causes difficulty for the structure you have in mind, and if you have done a good job constructing your chunks, you can reassemble them into an alternative structure in surprisingly short order. This is not to say that all of your efforts can or should be put together from the bottom up. Sometimes you are going to have to show people a mock application with dummy screens to elicit sufficient feedback to determine what they want. Even in contexts like this, however, you should attempt to envision any given element of the application as an aggregate of chunks. There are many options in the way you can assemble chunks into structures, and you should give yourself as much flexibility as you can.
The illustration below identifies the primary components of an HTML document, which produces the simple sample at right.
The first line, Doctype, is primarily of interest if your document has certain special elements embedded within it of which your browser must be made aware. The section enclosed within the <head> tags is used to characterize the content of the page. Of primary interest here is the <title> tag, which sets the string that is displayed in the top border of the browser window when the page is open, and on the button for the page on the toolbar. Also frequently included in this section, although not displayed here, are the keywords and description meta-tags. Most of the search engines on the web index the terms stored in the keywords section as the basis for the results they return, and the included description is frequently displayed as the summary of the site in the returned search. If you want to see the values stored in those tags for this document, return to the document home page and either right-click and select view source or select the view menu from the browser menubar and select source. That will display the html source for that page. The real meat of an html document, of course, is the annotated text that displays the page. You created a simple example of that in the last section when you created your "Hi, world" page. (You did do that, didn't you?) As you can see, the html included here is not much more complicated than that. What we're going to do next is create a simple page that enters records into the team table. It will have the simplest of browser-based database application models, a static HTML document linked to a cgi script.
In sample_1a.html we're going to take what we displayed on the screen above and wrap that in a form, which allows us to define input fields and a default action for the form. The simple html document produces the screen you see to the right.
As you can see from the illustrated code, this is not a big deal. You simply define the form by including it within <form> and </form> tags, specify a few attributes for the elements of the form, and associate an action with it. Note the target specification in the form tag. That will be relevant later. Also notice the input button definition in the input tag, just after the League ID entry. Pretty easy stuff, eh?
As I develop these examples, I'll roll up the source files into zip and gzipped tar archives and link to them from the links page, as well as from the page on which they are discussed. The three files from this example, sample_1.html (illustrated above), sample_1a.html (illustrated to the left), and sample_1.cgi, are wrapped up in this tar file.
sample_1.cgi
As I've written the material that follows, I've found myself engaging in some extended discussion on perl and the general environment of browser-based programming. My rationale for this is that a document that presents material in context provides a different perspective than the more language-oriented resources you can find elsewhere.
Using the two together is a very powerful way to learn. Besides, I'm the one writing this, so I get to make choices like that. {grin} (Recognize that there is a difference between browser-based programming and web programming. I think that most would agree that if you've worked through everything that I will be discussing over the coming months, you will have a solid foundation in web programming, probably a good bit more solid than many who are in the field today. But there are considerations in web programming that I won't be specifically addressing, or at least won't be specifically addressing for some time. There is a difference between setting up an application for managing the data for a church's baseball or basketball league that runs on a network within the church, and turning the same thing around and making it available over the web. But don't be intimidated; if you feel like creating some static web pages and putting them on a site somewhere, go ahead. What you are reading is really no more than that. What I'm launching into involves the first steps in doing some rather significant processing with the information submitted, and generating dynamic web pages with the results.)
    #!/usr/bin/perl -wT
The first line in the cgi script is very important. I've specified the action associated with the form in the html document, and the shebang line is necessary to invoke the perl interpreter. The two options specified for perl in this example, w and T, represent warning and taint mode, respectively. Warning mode functions much as you might expect, letting you know when you've committed some infraction against good taste in perl programming. Many of the warnings issued under this mode are associated with things that are not fatal to script execution, but it is generally a good idea to address them. This is especially pertinent if you are intending to add the script into a larger structure. Many of these warnings relate to variable scoping, which is of prime importance in having components interact with each other appropriately. I'll discuss this issue in more detail as I discuss this script and as I develop a more sophisticated framework. The T option turns on taint mode, which implements stronger security by treating data that arrives from outside the script (form parameters, environment variables, file input) as tainted. Perl refuses to use tainted data in operations that affect the outside world, such as running a shell command or opening a file for writing, until the data has been checked, usually by extracting the acceptable part with a regular expression. Taint mode also refuses to run scripts or binaries from the script unless the PATH environment variable has been set to a known-safe value inside the script. Even if the full path to the file is specified, it will not be executed while PATH is still tainted. Such precautions are not of paramount importance when running on a local network, but have serious implications when you turn the application loose on the internet. If you start off doing things the right way, operating within the constraints imposed by things like taint mode will become second nature, and you will have a much easier time adapting to the requirements of writing web applications that have good security foundations.
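Taint mode also treats form input as untrusted until it has passed through a pattern match. What follows is a minimal sketch of that check, not part of sample_1.cgi: the helper name clean_team_id is hypothetical, and the assumption that a team ID is one to six digits is mine rather than something the team table requires.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured digits if the value looks like
# a team ID (assumed here to be 1-6 digits), or undef otherwise.
# Under -T, a value captured by a regex group is considered untainted.
sub clean_team_id {
    my ($raw) = @_;
    return undef unless defined $raw;
    return $raw =~ /\A(\d{1,6})\z/ ? $1 : undef;
}

my $good = clean_team_id('42');
my $bad  = clean_team_id('42; rm x');

print "first value:  ", (defined $good ? "accepted" : "rejected"), "\n";
print "second value: ", (defined $bad  ? "accepted" : "rejected"), "\n";
```

The point of returning the captured group rather than the original string is that only the capture is trusted; anything that fails the pattern never reaches the database or the shell.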
    ##team_1.cgi
    ##this is a very simple perl cgi script to illustrate the basic elements of communicating between a client browser and a cgi script
    ##that interacts with a backend database server.

    ##set up requisite modules
    use CGI qw/:standard/;
    use DBI;
    use DBD::Pg;
    use strict;
The next few lines, beginning with #, represent comments that describe the script. Such comments are generally used to identify the script and then explain what is going on throughout the script. This is considered very good form for a programmer, if only as an aid to personal memory. If someone else has to help to maintain the system, comments like this are generally considered mandatory. The scripts in this document are more extensively commented than most, but in general, as long as comments are not incoherent, it is almost impossible to have too many comments. The first active lines of the script make sure that the modules required for the script's functionality are present and their methods visible to the perl interpreter. Perl uses what is called a namespace to keep track of the various elements of its environment. When you "use" a module in perl, you are importing the references to the pieces of functionality incorporated in that module into the namespace. You will note that the line incorporating the CGI module includes "qw/:standard/". It is possible to import only certain elements of a module's resources into the namespace. In some cases, modules will define sets of resources that represent specialized functionality; in other cases you might limit the importation of elements from the module to control the population of the namespace; and in still other circumstances you may be forced to specify certain module resources, although this is relatively rare. Generally, the module's documentation covers these considerations explicitly. Perl's namespace is a relatively advanced topic; it is certainly possible to write very sophisticated perl programs and take no more notice of the namespace than is required to use the modules. It is also possible, however, to write a piece of code that will simply not work unless you have expressly imported some element of a module's resources into the namespace. Just be aware that this is an element of the perl environment.
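To make the idea of selective importing concrete, here is a small sketch using List::Util, a module that ships with perl. It is not part of sample_1.cgi, and the run totals are made-up illustration data.

```perl
use strict;
use warnings;

# Import only two names from List::Util into the current namespace.
use List::Util qw(sum max);

# Made-up data for illustration.
my @runs = (3, 0, 5, 2);

my $total = sum(@runs);
my $best  = max(@runs);
print "total=$total best=$best\n";

# Anything not imported is still reachable by its fully qualified name.
my $fewest = List::Util::min(@runs);
print "fewest=$fewest\n";
```

Only sum and max are visible by their short names here; min was deliberately left out of the import list, which is why it has to be called with the module name in front of it.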
Following the CGI module, the script loads the perl database interface and the database driver for postgresql. The DBI module defines a set of standard hooks to which modules for communicating with individual databases can be written. The DBD::Pg module is an example of such an implementation. The database driver and the DBI interface can be "used" in any order, but both must be present for the database functionality to work. By convention, the DBI interface is incorporated first, followed by whatever database drivers are being used. In fact, this convention is so widely adopted that I had thought it a requirement until I wrote this section, reflected a minute, and realized that it was not necessarily so, which led me to run a test on this script.1 After the database interface is set up, the next statement is "use strict". This references a special module that is used to instruct the perl interpreter how to compile your program. "use strict" thus becomes what is called a pragma. The strict pragma is one of the most frequently used pragmas, telling the interpreter to force specific scoping of scalars, arrays, and procedures. It turns into errors some of the conditions that the w switch only warns about, and represents an effective tool for protecting the integrity of the namespace. It also forces the author to keep the elements of the overall system nicely partitioned in the namespace, keeping one element of the system from stomping on another element.
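The effect of the strict pragma on scoping can be seen directly. The following is a minimal sketch, separate from sample_1.cgi, that compiles two one-line snippets with eval so that the failure can be reported instead of stopping the script; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Compile a snippet that assigns to a variable that was never declared.
my $ok_unscoped  = eval q{ use strict; $team_name = 'Cubs'; 1 };
my $strict_error = $@;
print "without my: ", ($ok_unscoped ? "compiled" : "rejected"), "\n";

# The same snippet with "my" in front of the variable compiles cleanly.
my $ok_scoped = eval q{ use strict; my $team_name = 'Cubs'; 1 };
print "with my:    ", ($ok_scoped ? "compiled" : "rejected"), "\n";
```

The text of the complaint ends up in $strict_error, which is the same kind of message that would land in the web server's error log if the undeclared variable appeared in a cgi script.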
    unlink '/home/www/dbitrace.log' if -e '/home/www/dbitrace.log';
    DBI->trace(2,'/home/www/dbitrace.log');
The next two lines in the script create a session-specific dbi trace log. Although a slight drain on the execution of the script, setting up a trace log can be key when you are writing and debugging, and once the system is operational can be a quick source of diagnostic information if a functioning system simply stops working. Relative to most application development contexts, CGI is difficult to debug. This is because the various elements of the system are very loosely coupled, functioning virtually independently of each other. Where an application under the umbrella of a single environment might return unified diagnostic information, in the basic CGI model diagnostic information has to be tracked down. Effective debugging therefore becomes a matter of becoming familiar with the sources of diagnostic information for system components, such as the apache error log, and creating an environment in which the application leaves state information behind that can be used to debug script problems.
    my $form;
    $form=new CGI;
In the next two lines I initialize a new CGI object, which provides communication between the CGI script, a process on its own, and the form that spawned it. As I reviewed CGI documentation and books before launching into this section I realized again that I've always had a tendency to mix procedural and object-oriented expressions in the scripts I've written, but in this script I find that I've moved almost entirely to the object-oriented. Note that I've not followed the convention of naming the cgi object $q or $cgi. That's just me being me. I prefer to assign quasi-descriptive names to objects and other perl entities, largely because I believe that makes the code more readable. While that does reflect nothing more than my own personal preference, I do find the notion of conventions in perl rather odd.
    my $host = 'ralphzilla-raider';
    my $db = 'baseball';
    my $driver = 'Pg';
    my $user = 'baseball';
In these lines values are being assigned to the scalars that hold the parameters for the connection statement to the data source. ("Data source" is the standard name given to database devices, even though in this circumstance you might find it more appropriate to describe it as the data destination.) While it would not be a big deal to hard-code these connection parameters in this specific circumstance, it would make the script somewhat more difficult to modify to accommodate a change in the data source, and it is a bad habit to get into in any event. As you look down through the values assigned to the scalars you should have little difficulty recognizing what they mean. The use of "my" explicitly declares the scope of these variables to be local to this script. "Big deal", I can hear you saying. "Where else are they going to go?" Well, you have a point there, but I have two responses. 1) We said we were going to develop good habits, or at least I committed you to that. (See what you get when you let me drive?) 2) We have no choice in the matter. Since we said "use strict", if we tried to get by without scoping the variables we'd get one of those funny error messages back when we clicked on the submit button.
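One way to see the benefit of keeping the connection parameters out of the connect call is to gather them in one place and assemble the data source string from them. This is a sketch only: the hash and the helper name build_dsn are hypothetical, and the string format simply mirrors the one this script passes to DBI->connect.

```perl
use strict;
use warnings;

# Connection parameters gathered in one place (same values as the script).
my %conn = (
    driver => 'Pg',
    db     => 'baseball',
    host   => 'ralphzilla-raider',
    user   => 'baseball',
);

# Hypothetical helper that assembles the data source string.
sub build_dsn {
    my (%p) = @_;
    return "DBI:$p{driver}:dbname=$p{db};host=$p{host}";
}

my $dsn = build_dsn(%conn);
print "$dsn\n";
```

Pointing the script at a different database or host then means changing one value in the hash rather than hunting through the connect statement.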
    my $dbh;
    $dbh=DBI->connect("DBI:$driver:dbname=$db;host=$host",$user);
Once the scalars are populated, I create a database handle that associates a scalar with a connection to a database. Within the statement, the section |
    ##read the parameters from the cgi object into scalars
    my $team_id=$form->param('team_id');
    my $team_name=$form->param('team_name');
    my $fax=$form->param('fax');
    my $league_id=$form->param('league_id');
The next chunk of the script pulls the values that were entered into the form and stores them into a set of scalars. If you look back at the four lines bracketed by the <FORM> and </FORM> tags in sample_1a.html, you can see that the elements are named. It is by these names that the parameters are accessed. Once the values have been pulled into scalars, they can be used in the script in whatever fashion you might feel appropriate. Some find the object-oriented form of this statement a little confusing. A valid way to paraphrase what is going on here would be to say, for example, "execute the param method of the form object using the value 'team_id' and store the results in the $team_id scalar, which is local to this script." This could seem like gibberish, but it will make more sense as you go along. The function-oriented syntax that the ":standard" import makes available is probably more legible in this instance: "my $team_id = param("team_id");"

    ##prepare the statement handle
    my $team_insert=$dbh->prepare("insert into team (team_id,team_name,fax,league_id) values (?,?,?,?)");
    ##execute the statement
    $team_insert->execute($team_id,$team_name,$fax,$league_id);
Once the scalars hold the values that are intended to be inserted into the database, the script fires off a message to the database server to insert them. If you are following along and creating the context on your own, you are now doing thin client-server computing. Impressive, eh? Your market value just went up by about $20,000 US. See, I told you to work through the examples.
Therefore, the command to insert the values into a team record (or, in SQL parlance, row) would be: Executing this command via the do() method would be accomplished by the following statement: As a side note, losing track of the parentheses and quote mark at the end of that statement can lead to a fairly tedious error to track down, leading your eyes to get all glassy and start turning in their sockets. If you remember the basic structure of the command, i.e., "$dbh->do();", and visualize the position of the SQL command within quotes inside the parentheses, you'll find the structure easier to recognize.
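For comparison with the prepare and execute pair used in the script, here is a sketch of the same insert issued through a single do() call. Several things here are assumptions rather than part of sample_1.cgi: the helper name insert_team is hypothetical, the sample values are made up, and the sketch passes the values through placeholders rather than writing them into the SQL string. So that it can run without a database, it uses a stand-in object in place of a real handle; with a real connection, $dbh would come from DBI->connect.

```perl
use strict;
use warnings;

# Stand-in for a connected DBI handle so the sketch runs without a database.
# It only records what it was asked to do; it is not part of DBI.
package StandInHandle {
    sub new { return bless { calls => [] }, shift }
    sub do  { my ($self, @args) = @_; push @{ $self->{calls} }, [@args]; return 1 }
}

# Hypothetical helper: insert one team row with a single do() call.
# DBI's do() takes the SQL, an attribute hashref (or undef), then bind values.
sub insert_team {
    my ($dbh, @values) = @_;
    return $dbh->do(
        "insert into team (team_id,team_name,fax,league_id) values (?,?,?,?)",
        undef,
        @values,
    );
}

my $dbh  = StandInHandle->new;
my $rows = insert_team($dbh, 7, 'Cubs', '555-0100', 1);   # made-up values
print "do() returned $rows\n";
```

Keeping the SQL on its own line between the opening and closing parentheses makes the "quotes inside parentheses" structure easy to check by eye.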
    print $form->header("text/html"),
          $form->start_html("reply"),
          $form->p("$team_name added"),
          $form->end_html;
Once the script has inserted data, it's a good idea to give some indication that the operation actually occurred, at least if the script is being called from a browser page. In this example I've taken the strategy of simply opening a new window with an indication that the record has been entered. Though there are a number of ways to get that done, this type of response is simple to implement and something you've probably seen many times before. The basis for the action is set way back in the form definition for the html document. This little section demonstrates one of the most functional aspects of the CGI module, the ease with which appropriately formatted html code can be returned to the browser. As you can see in these few lines, the module has methods to return to the client just what it expects to receive in any given context. I will make extensive use of these capabilities at a later point, as we start generating web pages on the fly. (It is important to note that this actually represents one statement spread across four lines. Did you notice that the lines are separated with commas rather than semi-colons? This could have been expressed as four separate commands if "print" started each line.)
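The point about commas versus semi-colons can be checked with a short sketch. It is independent of the CGI module: the four strings below are plain stand-ins for what the header, start_html, p, and end_html methods would return, and the output is captured in memory rather than sent to a browser.

```perl
use strict;
use warnings;

# Capture output in strings instead of sending it to the browser.
my ($one_statement, $four_statements) = ('', '');

# One print statement, four comma-separated pieces.
open my $fh1, '>', \$one_statement or die "cannot open in-memory file: $!";
print {$fh1} "header\n",
             "start\n",
             "body\n",
             "end\n";
close $fh1;

# Four print statements, each ended with a semi-colon.
open my $fh2, '>', \$four_statements or die "cannot open in-memory file: $!";
print {$fh2} "header\n";
print {$fh2} "start\n";
print {$fh2} "body\n";
print {$fh2} "end\n";
close $fh2;

my $verdict = $one_statement eq $four_statements ? 'same' : 'different';
print "$verdict output\n";
```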
    $dbh->disconnect;
    exit 0;
Following the generation of the response window, I disconnect from the database and exit the script. The 0 following the exit statement simply sets a status value that can be used by calling structures, as appropriate, to control their operations.
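To show how a calling structure might read that status value, here is a small sketch in which one perl script runs another and inspects the result. The child is a one-line program that exits with status 3, a value picked only so that it is distinguishable from the usual 0.

```perl
use strict;
use warnings;

# Run a child perl process that exits with a chosen status.
# $^X holds the path of the perl interpreter running this script.
system($^X, '-e', 'exit 3');

# The child's exit status is stored in the high byte of $?.
my $status = $? >> 8;
print "child exited with status $status\n";
```

A caller would typically treat 0 as success and anything else as a signal to stop or take some corrective action.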
So there you have it. Nothing you can't handle. I'll start dressing up the browser page in the next section.
Next: HTML - Just a Little Fancier
1. There are a couple of other things I want to mention, although these are not particularly relevant to this script. First, the DBI interface in concert with DBD drivers is not the only way to access databases from within perl. DBI is a relatively recent development in the perl world, so there are custom interfaces that have been written for many, if not most, databases. Second, DBI is not used strictly for databases. The most interesting example I've seen of this flexibility is the DBD-Graph module, which uses the DBI interface to produce a series of charts and graphs. If you think about it a bit you'll realize that this makes a lot of sense, because the base data for graph generation could be structured just as records in a database. Maybe I'll put graphs into the framework we'll be developing down the road, so we can play with that. I wouldn't be at all surprised to see some statistical operations implemented in a similar manner.