The LINUX.COM Article Archive
Originally Published: Monday, 18 September 2000
Published to: develop_articles/Development Articles
Unix Web Application Architectures - Part 3: Sessions, Authentication, and Databases
Session management is a significant and very fundamental issue with web applications, because HTTP is a completely stateless protocol. No HTTP request has any inherent relation to any other HTTP request (aside from possibly using the same TCP connection). Therefore, it's the job of the application to create this association.
By a session I mean the series of HTTP requests and replies that one user makes when visiting a site. In the example application, a session would begin when a customer logs on to the system, and end when the customer logs off or just closes his browser.
Session state is all relevant information about what the user has already done during that session. In our example application, the following session state data might be kept:
In traditional GUI programming, each screen element is represented by a GUI toolkit object that holds all the data and state of the screen element. In web applications, this data must be kept somewhere else.
Cookies are the mechanism that HTTP supports for keeping session state. Netscape's Cookie Spec gives the following general description of how cookies work:
A server, when returning an HTTP object to a client, may also send a piece of state information which the client will store. Included in that state object is a description of the range of URLs for which that state is valid. Any future HTTP requests made by the client which fall in that range will include a transmittal of the current value of the state object from the client back to the server. The state object is called a cookie, for no compelling reason.
For the official current specification, see RFC 2109. If session state is kept in cookies, then each time a new session data item is to be set, it is sent to the browser with a Set-Cookie: HTTP response header. This straightforward scheme has a number of problems that are discussed next.
Obviously, the sum of sizes of cookies set for an application can become quite large if many state variables are used. Even the size of a single cookie may need to be rather large. This may exceed the maximum cookie size limits specified in the RFC, or set by the browser implementation. More likely, it will slow things down to have a lot of session data sent in every HTTP request.
Luckily it's not necessary to keep the entire state in the client. Instead, it is sufficient to keep merely a session ID in a cookie. The server then uses the session ID to fetch the actual session data from, for example, a database running on the server. Especially without SSL, this can also improve security in some cases, because neither the session data nor any secrets in it ever leave the server.
Here is a concrete example of how the mechanism works when the session ID is kept in a cookie named 'our_app_session' and the session data is kept in an SQL database:
Most large applications use this kind of mechanism. From now on, I'll assume this kind of system is used.
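The mechanism can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the table layout, the function names create_session and load_session, and the use of an in-memory SQLite database as a stand-in for the real SQL database are all hypothetical. Only the cookie name 'our_app_session' comes from the text.

```python
import secrets
import sqlite3
from http.cookies import SimpleCookie

COOKIE_NAME = "our_app_session"  # cookie name used in the text

# Assumption: an in-memory SQLite database stands in for the real SQL database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)")


def create_session(initial_data: str = "") -> str:
    """Store a new session row and return the value for a Set-Cookie header."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess session ID
    db.execute("INSERT INTO sessions (id, data) VALUES (?, ?)",
               (session_id, initial_data))
    db.commit()
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = session_id
    return cookie[COOKIE_NAME].OutputString()


def load_session(cookie_header: str):
    """Return the session data for a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie:
        return None
    row = db.execute("SELECT data FROM sessions WHERE id = ?",
                     (cookie[COOKIE_NAME].value,)).fetchone()
    return row[0] if row else None
```

Only the ID travels to the browser and back; the data column never leaves the server.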
Cookies can be told to keep their value after a browser is closed and restarted. This benefit is unique to cookies, and can't be achieved in any other way.
There are also a number of problems with cookies:
8.3.1 DNS Name of the Application by a DNS Wildcard
These problems can be avoided by storing the session ID elsewhere. One mechanism, mentioned on Slashdot, keeps the ID in the DNS name that the browser uses to refer to the server. I have not tried to use this method, so the details may be a little off, but here goes. If your application is on domain foo.com, make a wildcard DNS entry *.foo.com that points to the web server IP. When creating a new session with ID, say, 987654321, redirect the browser to address 987654321.foo.com. Due to the DNS wildcard, this maps to the web server IP. The server application then looks at the domain name by which it was addressed, and finds out the session ID that way.
This method has the advantage that after the redirection, no extra effort is needed to have the browser remember the ID. On the other hand, care must be taken that the browser doesn't get redirected to the official name of the server, such as www.foo.com. The server itself may issue such a redirection, as Apache does for instance when one refers to a directory without a trailing slash. This method also binds the application tightly to the DNS of the domain, which complicates installation and adds one more thing that can be configured wrong. I can imagine that in many situations, an organization doesn't want to make such a change to its DNS setup for the sake of a single application. I also feel cautious about the effect of such a scheme on DNS and web caching mechanisms, because of the huge number of unique DNS names referred to in the process of using the application.
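On the server side, recovering the ID amounts to parsing the Host request header. The following is a minimal sketch, assuming numeric session IDs as in the example above; the function name session_id_from_host is hypothetical, and like the method itself this has not been tried in a real deployment.

```python
def session_id_from_host(host_header: str, base_domain: str = "foo.com"):
    """Extract a numeric session ID from a Host header such as
    '987654321.foo.com'. Returns None for names like 'www.foo.com'."""
    # Drop an optional port and normalise case and a trailing dot.
    host = host_header.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + base_domain
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    # Assumption: session IDs consist of ASCII digits only.
    if label.isascii() and label.isdigit():
        return label
    return None
```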
8.3.2 HTTP Request Parameter
Another mechanism is to have the client send the session ID in each HTTP request. For example, if the session ID is 987654321 and the HTTP method is GET, the request for "/cgi-bin/some_view?foo=bar" would be "GET /cgi-bin/some_view?session=987654321&foo=bar HTTP/1.1" (followed by other request headers), and analogously with other methods.
This ID must be included in every HTTP request made for the application. This is an added complication over using a cookie or the method described in the previous section. If this can be semi-automated, however, it's quite doable. All the links and ACTION attributes of FORM elements must be generated with wrappers that add the session ID field into them, and that's about it. Of course the session will be lost if the user leaves the site and comes back later, or if the user tries to navigate the application by entering URLs manually.
Having the request be consistently of the form /cgi-bin/some_view?parameters, or perhaps hierarchically /cgi-bin/view_group/some_view?parameters, has the additional advantage that web server log analyzers such as Analog can be used for creating usage statistics for different views, and in the case of hierarchical view names, for different view groups. The log analyzers are able to ignore the parameters, i.e. everything after and including the question mark.
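Such a wrapper can be quite small. Here is a sketch using Python's standard urllib.parse module; the function name with_session is hypothetical, and the parameter name "session" follows the example request above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_session(url: str, session_id: str) -> str:
    """Return url with a 'session' query parameter added (or replaced)."""
    parts = urlsplit(url)
    # Keep the existing parameters, but drop any stale session parameter.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "session"]
    query.insert(0, ("session", session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Every link and form ACTION in the generated HTML would then be passed through with_session() before being written out.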
8.3.3 Request URL Before View Name
The ID can also be put into a part of the URL that comes before the view name. Using this method, a request might be "GET /cgi-bin/987654321/some_view?foo=bar HTTP/1.1". The CGI program finds out the session ID by looking at the CGI PATH_INFO environment variable. If the application then only uses relative URLs for referring to other views, the session ID stays in the URL automatically. For example, if the view given above has a link "other_view?a=b", the browser will resolve it to "/cgi-bin/987654321/other_view?a=b". If views are hierarchically named, this requires some more care, but is still doable. For instance, if a view "/cgi-bin/987654321/admin/first_view" wants to refer to "/cgi-bin/987654321/customer/second_view", it must use a link "../customer/second_view". This means a link to a view must be different depending on where the link occurs, which isn't very nice at all. Therefore it may be best not to use hierarchical view names with this method of keeping track of the session ID. It also requires playing some strange games with the web server configuration to make this work. While I haven't actually tried this mechanism in practice, I believe it can be made to work fine.
Usually sessions end due to the user simply closing his browser, pointing it to a different site, or even his browser crashing. There's no way to know when this has happened. Even if the application has a "log out" button, and users are specifically told to use it, they often won't. Hence, it is necessary to simply assume that a session has ended when it has been inactive for long enough. This implies that the session must have a "last accessed" timestamp that is updated every time a request for that session is made, or at least often enough. This may be in conflict with attempts to execute the requests in parallel as explained in section 10 Using a SQL Database as a Backend.
The time after which an unused session is removed shall be called expiry time. How long this time is depends heavily on the application. For a banking application where security is the first priority and access control is bound into the session, this might be as low as 15 minutes. For a different application which is used only in a trusted intranet, and that is an essential part of a user's job, the user may want to leave his browser open when going away for a holiday, and resume his session when coming back. In this case the expiry time might be a week or even a month.
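The relative link behaviour described above can be checked with urllib.parse.urljoin, which resolves relative references the way a browser does. The sketch below also shows one way to split the session ID off PATH_INFO. It assumes the web server is configured so that PATH_INFO contains everything after /cgi-bin, which is an assumption and not something confirmed in practice; the function name split_path_info is hypothetical.

```python
from urllib.parse import urljoin


def split_path_info(path_info: str):
    """Split '/987654321/admin/first_view' into ('987654321', 'admin/first_view').

    Assumes PATH_INFO starts with the session ID followed by the view name.
    """
    session_id, _, view = path_info.lstrip("/").partition("/")
    return session_id, view


# How a browser resolves relative links found on a page:
flat = urljoin("http://foo.com/cgi-bin/987654321/some_view?foo=bar",
               "other_view?a=b")
nested = urljoin("http://foo.com/cgi-bin/987654321/admin/first_view",
                 "../customer/second_view")
print(flat)    # http://foo.com/cgi-bin/987654321/other_view?a=b
print(nested)  # http://foo.com/cgi-bin/987654321/customer/second_view
```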
If the expiry time is long and typical session length is short, there may be a large number of inactive sessions in the database, and each of them may be large. This is not a big problem with reasonable indexing. The expiry is best done relatively rarely, because it may take a while.
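A minimal sketch of the "last accessed" bookkeeping and a periodic expiry pass follows. The table layout, the function names touch and expire_sessions, and the 15 minute expiry time are illustrative assumptions, and an in-memory SQLite database again stands in for the real one.

```python
import sqlite3
import time

EXPIRY_SECONDS = 15 * 60  # assumed expiry time, as in the banking example

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions ("
           "id TEXT PRIMARY KEY, last_accessed REAL NOT NULL)")
# The index keeps the expiry pass cheap even with many inactive sessions.
db.execute("CREATE INDEX sessions_last_accessed ON sessions (last_accessed)")


def touch(session_id: str, now: float = None) -> None:
    """Record that a request for this session was just made."""
    now = time.time() if now is None else now
    db.execute("INSERT OR REPLACE INTO sessions (id, last_accessed) "
               "VALUES (?, ?)", (session_id, now))
    db.commit()


def expire_sessions(now: float = None) -> int:
    """Delete sessions idle for longer than EXPIRY_SECONDS; return the count."""
    now = time.time() if now is None else now
    cursor = db.execute("DELETE FROM sessions WHERE last_accessed < ?",
                        (now - EXPIRY_SECONDS,))
    db.commit()
    return cursor.rowcount
```

The expiry pass would be run from a periodic job rather than on every request.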
As much of the session state as possible should be kept in the session state record. In web applications, all too much of the state tends to be in the links and forms of the currently shown HTML page or pages. This is usually a bad thing, because it's difficult to figure out the state of the program when it's scattered all over a changing set of HTML pages.
A particularly ugly habit is to keep the state in parameters of a chain of HTTP requests. In that situation the HTTP requests get long and complex, and it's very easy to make a typo in them. This also obscures the interface and purpose of individual requests.
To make sure the programmer actually uses the session state mechanism, it should be as easy to use as possible.
In my experience, it's hard to plan what state variables will be needed. When changes are made and new features are added into the application, more and more state variables are needed. For this reason it should be easy to create new state variables. This is certainly not the case for example if the session record is kept in a database, and each state variable is stored in a column of its own in the session table.
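One way to avoid the column-per-variable problem is to store each state variable as a row of its own, keyed by session ID and variable name. The sketch below is one possible layout, not the only one; the table and function names are hypothetical, and values are serialized as JSON so that new variables need no schema change.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the real database
db.execute("""CREATE TABLE session_vars (
    session_id TEXT NOT NULL,
    name       TEXT NOT NULL,
    value      TEXT NOT NULL,
    PRIMARY KEY (session_id, name))""")


def set_var(session_id: str, name: str, value) -> None:
    """Create or overwrite one state variable of a session."""
    db.execute("INSERT OR REPLACE INTO session_vars VALUES (?, ?, ?)",
               (session_id, name, json.dumps(value)))
    db.commit()


def get_var(session_id: str, name: str, default=None):
    """Read one state variable, or return default if it was never set."""
    row = db.execute("SELECT value FROM session_vars "
                     "WHERE session_id = ? AND name = ?",
                     (session_id, name)).fetchone()
    return json.loads(row[0]) if row else default
```

Adding a new state variable is then just a matter of calling set_var() with a new name.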
Access control means limiting who or what is allowed to use the application. Authentication, also called identification, means figuring out who is using the application. Often these concepts are interconnected. For example, if access control is based on the information about who is using the application, then access control can't be performed without authentication. On the other hand, controlling access by client IP address can be considered access control without authentication.
The HTTP protocol includes optional authentication, currently specified by RFC 2617. This is the mechanism that makes the browser open a dialog window asking for a username and password. Web servers support this mechanism, for instance, by logging the authenticated username into web server logs and by allowing access control based on the authentication (for example: "this page can only be viewed by user johndoe").
Because web servers support it, HTTP authentication is a very convenient mechanism to use. In particular, it's vastly easier than any other method for controlling access for static data or several unrelated entities served by the web server.
HTTP authentication doesn't allow much control over the user interface: the dialog that prompts for username and password can't be significantly customized. It can take some effort to make sure the authentication is prompted for only in the right places, at the right time, and that the action taken when the user doesn't enter correct credentials (username and password) is intuitive. At least in Apache, the credentials are case sensitive, which may confuse users.
Once a user has authenticated himself as one user, it is difficult to allow him to log on again as a different user. If a web server is used for performing the access control, the only way to allow logging in as a new user is to make the web server deny access as the current user, which makes the browser prompt for the credentials again. This can look confusing to the user. It can also be a somewhat ugly operation to perform, since it requires fine-tuning the web server access control settings on the fly.
If and only if the requested resource (the URL) requires authentication, the CGI environment includes a variable REMOTE_USER whose value is the username the user has entered. If a user views a page that doesn't require authentication, REMOTE_USER isn't set. In this situation, the application has no way of knowing if a user has ever logged in as any user. Often it would be useful to know this. For example, it may be desirable that an administrator user can view all the pages in the system, but some of the pages have additional features for admins. In the example application, an unauthenticated page that shows the company's product list, one product per line, could include an "edit" button next to each product name if an admin user is viewing the page. This is not possible with HTTP authentication.
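From the application's side, the limitation looks like this. The sketch assumes a CGI program reading its environment; the function name current_user is hypothetical.

```python
import os


def current_user(environ=None):
    """Return the HTTP-authenticated username, or None.

    The web server sets REMOTE_USER only for resources that require
    authentication, so None does not mean the user never logged in.
    """
    environ = os.environ if environ is None else environ
    return environ.get("REMOTE_USER") or None
```

On an unprotected page, current_user() returns None even for an admin who authenticated a moment ago on another page, which is exactly why the "edit" button cannot be shown there.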
For the reasons mentioned above, many large applications don't use HTTP authentication, but instead implement authentication themselves. Implementing authentication has many similarities with implementing session functionality. The authentication record can be similarly split into ID and data portions, and those can be stored in similar ways as the session record ID and data. The major difference is that the authentication record can't be created automatically, but the user must enter his credentials first.
It is a natural idea to tie authentication to the session. This can be done simply by:
When designing such a scheme, it must be kept in mind that the application may include pages that don't require authentication, but do require having a session. If that is the case, it must be possible to establish a session without the user entering credentials. For maximum flexibility, the system should then:
It can be more work than it seems to get this right. Such a scheme replaces not only HTTP authentication, but also some access control system, such as Apache's mod_auth module. It might be worthwhile to check whether a suitable library already exists before starting to implement this on your own.
Many web applications serve as a front end for some data store. They allow fetching and/or changing the data in an intuitive manner. From this perspective, the data store can be called the backend of the application. When there is only a small amount of data, it is feasible to keep it in a simple ad hoc structure that is manually locked as needed, if needed.
As the requirements for data manipulation grow due to a larger amount of data, higher requirements for parallelism or reliability, or a need to manipulate the data in complex ways, it makes no sense to re-invent the wheel yourself. Instead, a "real" database should be used. Usually this is an SQL database. Other types of just as real and featureful databases exist, for instance exclusively object-based databases, but I know little of those and they are currently much less commonly used. An SQL database is always a relational database. I'll use these terms as synonyms for each other. I assume the reader knows the usual features of SQL databases, and will only consider the features most relevant when building web applications.
The error recovery strategy of an application can be built around an SQL transaction, if all the data of the application is stored in the database (and the database supports transactions with rollback). The nature of HTTP requests is to normally trigger an operation that takes a fairly short time to finish: usually changing or showing some data record. Therefore one can write an application framework that automatically starts a transaction upon reception of an HTTP request. All the operations that change the database are done inside the transaction. If anything fails, the code processing the request returns an error code, and the transaction is rolled back. Because all the persistent data of the application is in the database, the effects of all operations done while processing the request before the error occurred disappear. If all goes well, the transaction is automatically committed at the end of the request processing before returning the HTTP response.
This is a strong guard against programmer errors, and one can be fairly confident that the data store doesn't get corrupted. This can also simplify programming. The programmer doesn't need to concern himself with the order things are done in, as far as error recovery is concerned.
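A transaction-per-request wrapper of this kind might look as follows. This is a sketch only: the products table, the handler add_two_products, and the wrapper handle_request are hypothetical, and Python's sqlite3 module stands in for whatever database the application really uses.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products ("
           "name TEXT PRIMARY KEY, price INTEGER NOT NULL)")
db.commit()


def handle_request(handler, *args):
    """Run one request handler inside a single transaction."""
    try:
        with db:  # commits if the block succeeds, rolls back if it raises
            result = handler(db, *args)
        return 200, result
    except Exception as exc:
        return 500, str(exc)


def add_two_products(conn, first, second):
    """Example handler: both inserts succeed together or not at all."""
    conn.execute("INSERT INTO products VALUES (?, 10)", (first,))
    conn.execute("INSERT INTO products VALUES (?, 20)", (second,))
    return "ok"
```

If the second insert fails, for example because the product name already exists, the first insert is rolled back too, so the handler never has to undo its own partial work.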
Handling multiple simultaneous HTTP requests in parallel is tricky, because more than one of them may read or write the same data. Therefore some locking must be used. This can be a very involved problem, and affect the entire design of an application. Using a good database can ease this task considerably by offering well understood and full featured mechanisms for implementing the locking. Databases typically also take care of such things as deadlock detection, which is far from trivial to implement oneself.
10.2.1 Locking of Session Records
Let's take an example of the kind of problems faced when handling many requests simultaneously. Assume an application which keeps session state in a database as suggested earlier. Assume also that the application framework automatically starts a transaction when starting to handle an HTTP request. Finally, assume the application uses multiple frames. HTTP requests for all frames of a frameset are sent almost simultaneously by the browser, and all of the requests belong to the same session. Therefore the requests try to lock the session record at the same time, and a conflict occurs if more than one of the requests modifies the session record. In this case, perhaps the session data should be manipulated through a separate database connection, which doesn't start a transaction automatically. Or maybe it's not necessary to process requests of the same session simultaneously.
Using an SQL database requires creating a well defined database structure with a detailed definition of each data item type. Constraints, assertions etc. further help to describe the structure of the data. This can be a powerful means of communication between members of a development team, because many people can be expected to understand the descriptions of an SQL database structure. It can also be of much assistance to a team of one person only. The database structure is at all times a definitive, complete reference about all data in the application. This is a benefit that shouldn't be belittled. With an ad hoc data store, these things would probably be complicated to the extent of being useless.
Often the user of an application wants to get unexpected reports or summaries from the data in the database. It may not make sense to write a pretty interface for rare needs. In that situation it's very convenient to be able to write a suitable SQL query in a few minutes. Nearly any such report can be expressed as an SQL query.
When interfacing with the outside world, having data in an SQL database makes things easy. It's commonly necessary to import data from one system to another. If the source application keeps its data in an SQL database, doing this is easy. Almost any programmer can be expected to be able to perform a query that returns the data he wants, even if he's not very familiar with the application in question.
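As an illustration, here is the kind of one-off summary that takes minutes to write as a query. The orders table and its contents are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders ("
           "customer TEXT, product TEXT, quantity INTEGER)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("acme", "widget", 5),
    ("acme", "gadget", 2),
    ("zenith", "widget", 1),
])

# Ad hoc report: total quantity ordered per product, largest first.
report = db.execute(
    "SELECT product, SUM(quantity) AS total FROM orders "
    "GROUP BY product ORDER BY total DESC, product"
).fetchall()
print(report)  # [('widget', 6), ('gadget', 2)]
```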
Copyright (c) 2000 by Samuli Kärkkäinen <firstname.lastname@example.org>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).