Document Home

HTML Documents and Basic CGI Scripts


Initializing the Database

We're now ready to start getting serious with this database thing, but before we can actually start doing things with the database we've got to create it. If you haven't yet installed postgresql, do it now. You're not going to get any further until you do that. From this point we're going to assume that your database server has postgresql installed and running, and that the pg_hba.conf file is configured to allow trusted access to the database from any of the machines in the cluster. We'll start getting more strict with our authentication requirements soon. First, we're going to get the web server and the database server talking to each other.


While we do have the database server running, we haven't created any users and we don't actually have any databases yet. First, telnet to the machine with the database server, su to the root account, and then su to the postgres account, which you'll have to do to manage the server. First, we'll create just one user to deal with the database. Here, as with samba, it is not at all unusual to find a simplified authentication structure associated with web sites that connect to databases. There are a number of contexts in which you might attach to a site, and once you've logged in a variable is maintained that establishes your access privileges to the rest of the system. Obviously, you wouldn't handle things like that if you were accessing confidential bank records, but the requirements for most sites are substantially short of that. Regardless, I'm getting ahead of myself. I've called this general purpose user "baseball". To create the user, simply issue the command "createuser baseball". If you built postgresql from downloaded source, the location of the postgresql client might not be in your search path. The default installation prefix for postgresql is "/usr/local/pgsql" ... if you installed by compiling source code and didn't issue any special parameters when you typed "./configure" the best thing you can do at this point is to change directory to the "/usr/local/pgsql/bin" directory. You can obviously execute these utilities by fully specifying the path to them, but we'll be running a few of them in a row and if anything does go wrong it might be handy to actually be in the directory. While you could append the directory to your search path, these are utilities that you run only rarely, and I actually find it handy to go into the directory. There are a number of factors involved when you are firing up a database system, and going into the directory to run them helps to force me to think of them.


The createuser utility will ask you if the user will create databases and other users, you might just as well say no to both of those. Within the structure we've set up the baseball user can now access any of the databases on the system, because in the pg_hba.conf file we said that any of the machines in the cluster could be trusted, and could access any of the server's databases. But we don't have any databases there. We'll take care of that now. Type "createdb baseball" and the database will be created. (Remember, if you are in the directory in which the utilities are located you will have to preface the statement with a "./" to tell it to look in the present working directory.)


We've created the database, but all that is is a container that holds the tables and other elements associated with a database. You can now exit from the session on the database server and jump to the cluster controller. Save the table creation script, available here, to that machine and pipe it into the psql utility. For example, if I had saved the table creation script into the directory /home/ralphzilla, this statement would look like this:

psql -h ralphzilla-raider -U baseball baseball < /home/ralphzilla/bbdb_create


What's going on here is that we're starting the psql utility, telling it to connect to the server on host ralphzilla-raider as user baseball and access the baseball database. Since the bbdb_create script is just a series of sql statements, piping those statements into the utility is the same as typing them into the utility's command line. (As I've not yet discussed this explicitly, I'd best do so now. SQL is a set of statements for accessing relational databases. It is set-oriented, and has a specified standard that serves as a least common denominator for the implementations that vendors build into their products. Virtually all relational databases build in some sort of interface using what are generally supersets of the SQL specification. Here is a reference to a sample chapter from an O'Reilly nutshell book on SQL. I highly recommend this book ... it's very short, and covers several implementations of SQL, including that implemented in postgresql. The most challenging thing about SQL is dealing with the variations between vendors' implementations, and not getting lost in complex statements. This book will help you on both levels. I'll hunt up some further references and include them in the links section.)


This is really all you need to do to create the database system. If the statement fails you should note the level on which it is failing, i.e., whether the problem is associated with finding the psql utility or with connecting to the database server. If the former is the problem I would suggest making sure that the installation of the client software completed. Issue the command "find / -name "psql"" ... that will search from the root for all instances of the file name "psql". I've never had any trouble installing these tools, whether from compiled binaries or by compiling source, so I seriously doubt that you will. Far more likely is a problem connecting to the database server. If you have such a problem, the first thing I'd check is to make sure that the database server is running on the relevant machine. Just telnet to it, list the active processes ("ps ax"), and make sure that "postmaster" with the appropriate paramters is visible. The hostname from which you are connecting to the server must be visible to the database server machine. In the context of the cluster, that means that the client machine must be listed in the /etc/hosts file. This could be a potential problem on the cluster controller if there were some difficulty with the specification of the two network interfaces on that dual-homed box. However, it you can get to the machine from a workstation outside the cluster, and you can get to the database server from the cluster controller, this is probably not the problem. It is, however, worth attempting to ping the database server from the cluster controller by hostname and then turn around and perform the same operation from the database server machine back to the cluster controller. If this works, you have verified that the visibility of the hostname is not the problem. Finally, check the configuration of the pg_hba.conf file, and make sure that the one you've configured is in fact in the data directory the server is currently using. This is an easier slip than it sounds. Go through the copy in that directory and make sure that it holds the changes you think are there. It is very easy to store a copy somewhere and not realize that it is not active for the instance of the server that is currently active.


Once you have been able to run the script and create the tables within the database, however, you will have established that you can connect from one of the cluster workstations to the database server and have the database server execute the series of commands that we are sending to it. Things are starting to get pretty cool here, don't you think?



Next: HTML and Basic CGI