[Home] [Credit Search] [Category Browser] [Staff Roll Call] | The LINUX.COM Article Archive |
Originally Published: Monday, 12 March 2001 | Author: Marcelo Pham |
Published to: featured_articles/Featured Articles | Page: 1/1 - [Printable] |
Writing Database Oriented Web-based Applications with Perl - Part II
In part two of his continuing series, Marcelo walks us through the process of choosing and installing a database engine in Linux.
|
Page 1 of 1 | |
Part II: Choosing and installing the DB engine in your Linux box:6. Databases for Linux There are several databases for Linux. Right now the most used databases are PostgreSQL, mySQL and Oracle. Others not so often used but compatible are mSQL, Informix and DB2. In this installment we will talk a little about each one, how to get them and which one is best tailored to your needs, assuming that you have already designed your application and outlined your database. Usually a database is installed in your Linux box and runs as a daemon process to service requests such as inquiries and updates. Each database has its own daemons and they work differently from one flavor to another. For example, Oracle uses a daemon to monitor the database (called Server Manager) and another one for each database (called listener), PostgreSQL only has one daemon for all its databases (called postmaster). These services are just regular processes running on the server in a specific port, like the httpd daemon. Usually you must shut them down with utilities that come with the database; if you kill the running processes or shut down the server without stopping the services, you may damage the database files. A database can be running on the same server as the application, or on a dedicated box. If the application is complex and the database is often updated, it'd be a good idea to separate the database and the application server to improve performance. If the application is simple you can install the database in the same server. From your code you can connect to the database regardless its location. When you separate the application and the database servers, that's called a two-tier architecture. Other architectures are called n-tier where n denotes the quantity of independent layers that your system contains. Databases save data files in different ways. For example, PostgreSQL and mySQL save them in a directory with the database name. Oracle requires you to create dedicated partitions for parts the database: the data, the programs and the C libraries. There are also utilities (shell or command line tools) with which you can do administration tasks from the console window. You'll see them in detail in each database overview. Most databases require special groups and users to be created during the installation, in order to do all the administrative tasks, so you don't have to do them as root, for security purposes. Normally you will create a specific account for the database -database superuser- and login with that account whenever you want to maintain the database. Also, let me tell you that mySQL and PostgreSQL are open source databases (under GPL), while Oracle and miniSQL (or mSQL) are not. You can download a single-user non-commercial version of Oracle 8i at the Oracle Technology Network (OTN). You must create an account, but it's free, and you can download mSQL for free if you register in their website. And, before proceeding, the following structure is utilized for each product: i) A brief history of...: this section tells you a
little bit about how the database was born Then, I finish this second part with a recommendation of which database you should use, why, and some recommended books. Enjoy-! i) A brief history of PostgreSQL: PostgreSQL is a
powerful database engine that runs in almost all Unix and Linux
flavors. ii) Where to get it: You can download it for
free, from the PostgreSQL
website or from the official ftp site. The latest release is
7.0.3 and they're working on the relelase 7.1,
which promises to have a lot of enhancements. If you're getting
Postgres from a Linux distribution and it's older version than
7.x, I highly recommend you to upgrade it since it is much more
stable and reliable. iv) Starting the daemon: As I mentioned before, you must start the Postgres daemon, called postmaster. The program that launches the daemon is called pg_ctl and is located under the Postgres bin directory. If you followed the installation instructions, you should now have a Postgres superuser (su) account. After logging in as the Postgres su you must export the environment variables so the postmaster will know where to look for programs, libraries and data. After that, you can start the postmaster with the pg_ctl program by typing: $ cd /usr/local/pgsql/bin In order to automate the exports and the postmaster startup, I'd suggest you to modify your .bash_profile script for the Postgres superuser. The postmaster will start in the default port 5432. The -D option indicates where the database and configuration files reside. These files contain information about the default port, the current postmater pid, library locations, and more. Of course, start is the parameter to start the daemon. If you do a ps -ef, you will actually see two processes running: the pg_ctl and the postmaster. For security reasons, you should never start the daemon as root. This process will take care of all the database requests, even if you are connecting to different databases, but they all should be stored under /usr/local/pgsql/data. If you ever need to shutdown the server or the database server, don't ever do a ps -ef and kill the running processes. Instead, from the Postgres superuser session, do $ ./pg_ctl -D /usr/local/pgsql/data stop This will safely shut down the services. $ cd /usr/local/pgsql/bin then type $ cd /usr/local/pgsql/data/base You will notice a new directory, called testdb. If you look under this directory, you will see all the files (those beginning with pg) that Postgres needs in order to manage the database you just created. vi) Running the database shell: Now that you've
already created the database, let's create a table. $ cd /usr/local/pgsql/bin to connect to your database. The generic syntax would
be When the db shell prompt is ready, testdb=# create table contact (Code char(10) primary key,Last_Name char(50), First_Name char(50),Address char(100),Phone char(30)); (Don't forget the last semicolon to indicate that the command has ended. You can enter several lines -for easier reading- and execute the whole statement by typing \g. You can press the up arrow to recall old commands so you don't have to type them again, in version 7.0 and higher) This will create the table contact. To continue with
the example from the Part I, create the rest of the tables with
the specified layouts.You can get help for the shell commands
by typing \h, or by inquiring the Postgres manual (you can
download it from the postgres
website documentation section) # cd /usr/local/pgsql/data/base/testdb You will notice some new files. These files are the tables that you just created from the shell. Postgres stores the tables as files under the directory with the database's name, along with all the indexes, relationships, stored procedures, etc. For example, the table files are named with <table_name>, and the indexes and primary keys with <table_name>_pkey. vii) The good side and the bad side of Postgres: viii) Comments: In my opinion, Postgres is the best open source relational database engine for developing solid web-based applications. It's fast, reliable, easy to configure, administer and maintain, and, as with any other open source software, it has worldwide support. It handles transactions, rules, triggers and even memory cursors. I am currently developing all my web-based applications with
Postgres 7.0.3. I highly recommend it if you want to implement
an application that involves complex database operations such
as transactions and masive global updates. i) A brief history of mySQL: There is not much
information about how mySQL was born. All I know is that
Michael 'Monty' Wideneus was leaded a project more than 10
years ago and he needed an application to interact with a
database. He tried miniSQL (mSQL) but after playing with it for
a while he noticed that it wasn't fast and flexible enough for
his needs. He decided to make a new SQL interface with the same
API interface mSQL uses, but with source code that could be
modified by anyone. Other collaborators enhanced and improved
on his work. It hasn't been discovered why there is a 'my'
before the SQL. ii) Where to get it: You can download mySQL for
free, from the mySQL official
website. The latest stable version is 3.23. mySQL also
comes with some Linux distributions (Mandrake, RedHat, Stormix, SuSe and TurboLinux) but versions may
vary. $ ./configure (Do a ./configure --help for syntax help). You will
be able to configure the port, the socket and the path to the
mySQL libraries and configuration files. iv) Starting the daemon: mySQL has only one daemon that services all the requests. This daemon can be started in several ways. If you want to start it manually, you can do so by typing: $ ./safe_mysqld --user=<user> & or simply $ ./mysqld & (safe_mysqld will try to determine first the best options to run mysqld). This will start the server with the parameters specified by the configure utility. mySQL usually runs in port 3306, but you can change that if you want to. You can test if the server is running with the mysqladmin utility $ ./mysqladmin version If you want to start/stop the server automatically, you can use the mysql.server script $ ./mysql.server start and add it conveniently in your Linux startup files, or place it in your database superuser ./bash_profile script. To shutdown the server manually, type: $ ./mysqladmin -u root shutdown All these programs are usually in the mySQL installation /bin directory. v) Creating a database: To create a database, you must run the mysqladmin utility. This utility does not only create databases, it also does other administrative tasks: $ ./mysqladmin create testdb will create a database called testdb. To get help in mysqladmin commands $ ./mysqladmin --help mySQL comes with other many utilities to maintain the database, you can take a look at them in the manual, Chapter 14. They're all scripts under the /bin directory. mySQL usually stores its database files under /usr/local/mysql/data. You'll see there the database and log files. vi) Running the database shell: the mysql command line tool is called -of course- mysql. You can run it by typing $ mysql <database name> When the shell prompt is ready, testdb=# create table contact (Code char(10) primary key,Last_Name char(50), First_Name char(50),Address char(100),Phone char(30))\g ('\g' denotes the end of the command. It will execute
the sentence that you just typed. You can press the up/down
arrow keys to recall old commands). vii) The good side and the bad side of mySQL: viii) Comments: mySQL is a simple and fast database
engine, supports large databases, is very popular in
web-hosting companies and easy to maintain. It's also open
source and has a lot of user groups. However, it loses some
standard SQL features. For example, it does not have
transaction capabilities (commit/rollback), the table/record
locking is quite different (it uses global variables as flags.
As a hacker, I don't like the idea of global
variables much) and is not 100% with SQL ANSI standards.
I had to review and modify all of my stored procedures for
table creation, when I moved one of my applications from mySQL
to Oracle and PostgreSQL. It was a quick job, but still I had
to go script by script and check them line by line, the column
types, the index and primary key definition syntax, etc. i) A brief history of Oracle: Back in 1977,
Larry Ellison and two partners founded Software Development
Laboratories, and decided to build a new relational
database engine, inspired by an IBM white paper. This project
was called Oracle (meaning a source of wisdom) and they
enhanced it with SQL capabilities. Soon they changed the
company name to what it is today, Oracle Corp. ii) Where to get it: You can download a single
user-single server Oracle 8i for free, from the Oracle Technology Network
(OTN). You must register but the registration is free, they
just send you invitations to seminars and news about products
from time to time. You can also buy a CD that comes with an
Administration book and additional documentation, from MacMillan Software.
Also, Oracle needs at least two mount points in your box: one for the programs and libraries, and the other for the database files. * Attention RedHat users! There is a script that you have to run during the installation called runIns.sh, located under the /install directory. For RedHat users, this script won't work, you must execute another version of runIns.sh script which is located under a different directory (it varies in every version of 8.x). Please refer to your installation documentation for further information, or go to the OTN website, but be aware of this compatibility issue. This is only a brief description of how to install Oracle. The documentation is quite extensive so I'd suggest you to read it from the book or print it from the web prior to the installation. iv) Starting the daemon: You must start two daemons
for Oracle. One of them is called the Server Manager, which
administers the main services; the other one is called listener
and satisfies all the database requests. Then, you can start the Server Manager by typing: $ dbstart and you can start the 'listener' (or Net8 listener) by
typing: $ lsnrctl start The listener will start in the default port 1521. To see all the listener parameters $ lsnrctl services and su - oracle8i -c "dbstart" (being oracle8i the Oracle superuser) There are a lot of configuration files involved in the database warm and start up, it'd be tedious to list them all in this article. I can tell you the most important ones:
You should keep an eye on them. $ lsnrctl stop v) Creating the database: To create a database,
you will need to use the GUI assistant called dbassist. First,
logout your X-Window session and login again as the Oracle
superuser. Second, from your character based Oracle superuser
session, type: and switch to the X-Window session (it may take a while to
start). You will see a GUI assistant that will guide you
through the database creation. It's quite intuitive, all you
have to do is follow the instructions. vi) Running the database shell: The database
shell for Oracle is called Sqlplus. $ sqlplus <username>/<password>@<service
name> When the db shell prompt is ready, testdb=# create table contact (Code char(10) primary key,Last_Name char(50), First_Name char(50),Address char(100),Phone char(30)) Sqlplus doesn't have a command history, so by pressing the arrow keys will only make garbage on the screen ... For further information about sqlplus, please go to the OTN website.
Oracle usually stores the database files under a specific
directory with the database name. What Oracle recommends is two
mount points, like: The database files would be located under: You will notice files with .dbf extension under both directories. The ones under /oradata are the tables you create, plus all the indexes, serializers, relationships, stored procedures, etc. vii) The good side and the bad side of Oracle: viii) Comments: Oracle is a powerful, reliable, excellent fault-tolerance database engine. eBay, Amazon and other high traffic websites use it for their database needs, that shows you how solid it is. It handles transactions, triggers, cursors, database replication and distributed databases. From my point of view, Oracle is the best database for Unix and Sun architectures out there. Unfortunately, it's quite expensive, is not open source and is very hard to configure, maintain and get free support. To be an Oracle dba you have courses that last a year or so and cost good money. I highly recommend this database engine only if you are working for a large company that can afford to buy it and hire an Oracle dba. If you're not, you should consider PostgreSQL. 10. Which database should you use?: Well, it all depends on what you want to do. If you are developing a complex application (for example, an accounting module or a distribution logistic system) I would recommend you to choose Postgres, or, if the company you are working for can afford it, Oracle. Postgres is a serious RDBMS (Relational DataBase Management System) and has almost all Oracle features, but is open source and free, and is always open to modifications if you want to customize the engine. The key for complex web-based applications are the transaction handler, the locking procedures and the triggers or rules. In a hostile high traffic environment like the Internet, connection problems, possible attacks and power outages are daily situations, and the database must be solid and robust enough to maintain integrity and avoid any kind of data corruption. Postgres' performance may go down with a considerable number of concurrent connections, but it'll always keep the database safe. Oracle, on the other hand, is so solid that the performance as well as the data integrity will remain the same no matter how many concurrent connections you have. On the down side, it is pretty tricky to configure (flexibility comes with complexity), expensive, and requires -for a serious installation- good hardware. But the wonderful thing is... if you develop an application for Postgres, you can easily migrate it to Oracle with no major modifications (the only problem would be transferring the data from Postgres to Oracle, but that's another chapter...). If you want to implement a system that doesn't require
transactions, but must have quick database response with many
concurrent connections in small or large databases, mySQL is
your best call. It's open source so you can also customize it
to fit your needs, easy to install-configure-maintain, and
stable and fast in Internet environment with large databases.
It's also safe since it supports database replication. And, you
won't have any problem getting a web-hosting or dedicated
server with mySQL support.
Another choice, if you can is to mix both databases (mySQL
and Postgres). You can design an application that uses the two
databases miniSQL is the original version of mySQL, I've never installed or used it, but it lost popularity and is not as enhanced as mySQL. O'Reilly networks just released a new website called LAMP (Linux-Apache-Mysql-Perl|PHP|Python). It has everything related to the mentioned open source technologies, you can go there to find out more about these architectures. To finish this second installment, I would suggest you to take a look at these interesting books to have a better idea when choosing your favorite database: "Oracle 8: The Complete Reference" (George B.
Koch,Kevin Loney) - Osborne/McGraw-Hill Or, you can go to phpbuilder.com and see an excellent and exhaustive comparison between mySQL and Postgres, by Tim Perdue. Summary: -Choosing the database engine carefully for your application
is a must-do before you start programming The next issue will be 'Interfacing Perl with a Database (I)'. Part three of this series will go over DBD-DBI architectures, how to get and install them, and the code to connect, retrieve and save data to the database. Marcelo "Taz" writes for the Linux.com Develop Section. He's an active member of the Florida Linux User Xchange (please visit www.flux.org), and, when not hacking, he likes to hang out with friends and travel with his soulmate Carina.
| |
Page 1 of 1 |