Originally Published: Wednesday, 15 December 1999
Author: Luke Groeninger
Published to: featured_articles/Featured Articles

Coming Soon to a File System Near You!

As Linux gets older, parts of it age as well, receiving little or no modification or updates. One of these components is the ext2 file system. When ext2 was developed, nobody imagined that hard drives would quickly grow far beyond the sizes that were standard four or five years ago. The inherent limitations of the ext2 file system were not apparent at first, but as newer, faster hard drives with larger capacities appeared, some of the kludges that went into the ext2 file system have reared their ugly heads.

The fact that the ext2 file system has limitations is no secret. It is inefficient when dealing with large file systems, it limits files to 2GB, and its performance is relatively poor. While this does not matter much for the average user, it makes a big difference in the middle and high-end server markets. In such markets databases are often larger than 2GB, and multiple drive arrays can make for file systems in the terabyte range. Performance degrades as the drive size goes up, and large files are (of course) slower to access.
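To see where a 2GB ceiling can come from, here is a minimal sketch. It assumes file offsets are stored as signed 32-bit integers, which is the usual explanation for a limit of this size but is not stated in this article. The function name is made up for the illustration.

```python
def max_file_size(offset_bits):
    """Largest byte count addressable with a signed offset of the given width.

    Assumption: one bit is spent on the sign, so only offset_bits - 1 bits
    are left to count bytes.
    """
    return 2 ** (offset_bits - 1) - 1


GIB = 2 ** 30  # bytes in one gigabyte (binary)

limit_32 = max_file_size(32)
limit_64 = max_file_size(64)

print(limit_32)              # 2147483647 bytes, one byte short of 2GB
print(limit_32 // GIB + 1)   # 2
print(limit_64 // GIB)       # 8589934591 (about 8 billion GB)
```

Under that assumption, widening the offset to 64 bits, as a full 64-bit file system does, pushes the ceiling far beyond any drive array available today.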

One of the major limitations comes from the design of the file system driver and, as a result, of the file system itself. The ext2 file system was designed so that, in the case of a crash, it would be easy to determine which data was lost. The easiest way of doing this is to wait for the first writes to finish before the next ones are given to the device driver. This "synchronous metadata update" approach, as it is called, limits the amount of data that is affected by a crash and offers decent crash recovery. The disadvantage is lower average performance: because the driver has to wait until the previous request has finished before it can even buffer the next drive operation, throughput drops.

So how does one get around these limitations? A better method of accessing the disk is "journaling," a technique that allows the driver to buffer its writes to the disk. When the driver is asked to modify a block, it modifies a copy destined for the journal instead. Only after that copy has been written to the journal itself will the data be written to its final place on the disk. This allows changes to the file system to be undone (the new data in the journal can be overwritten) or the new data to be recreated if something goes wrong (by copying the journal data into place on the drive). This technique offers better performance (data can be written sequentially into the journal, then written to the disk later) and better crash recovery.
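The journaling idea described above can be sketched in a few lines. This is a toy in-memory model, not the design of any of the file systems discussed here: the class and method names are invented for the example, and a real journal lives on disk and has to deal with partial writes, ordering, and checksums.

```python
class JournaledDisk:
    """Toy block store with a write-ahead journal (all names are hypothetical)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks  # final on-disk locations
        self.journal = []                 # sequential log of committed changes
        self.pending = {}                 # buffered changes, not yet in the journal

    def write(self, block_no, data):
        # Modify a buffered copy instead of the block in place.
        if not 0 <= block_no < len(self.blocks):
            raise IndexError("no such block")
        self.pending[block_no] = data

    def abort(self):
        # Undo: the buffered changes never reached the journal or the disk.
        self.pending.clear()

    def commit(self):
        # Append the buffered changes to the journal in one sequential write.
        if self.pending:
            self.journal.append(dict(self.pending))
            self.pending.clear()

    def checkpoint(self):
        # Copy journaled blocks into place, then retire the journal entries.
        for transaction in self.journal:
            for block_no, data in transaction.items():
                self.blocks[block_no] = data
        self.journal.clear()

    def crash(self):
        # Simulated power loss: buffers vanish, journal and blocks survive.
        self.pending.clear()

    def recover(self):
        # Replaying the journal recreates every committed change.
        self.checkpoint()


disk = JournaledDisk(4)
disk.write(0, b"committed")
disk.commit()                  # now safely in the journal
disk.write(1, b"uncommitted")  # only buffered
disk.crash()
disk.recover()
print(disk.blocks[0])  # b'committed'
print(disk.blocks[1])  # b''
```

After the simulated crash, the committed change is replayed from the journal while the uncommitted one is simply gone, so the disk is left in a consistent state without scanning every block.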

Several projects are underway to develop a replacement for the ext2 file system with the advantages of a journaling file system. Three of them are especially notable, as they have been developing rapidly toward a usable state, and all use journaling. These three are the ReiserFS Balanced Tree File System, the SGI XFS File System and the Ext3 File System. Each of these file systems has unique features and its own development path, and any of them could be the successor to the ext2 file system.

ReiserFS is quite possibly the farthest along in development. Reported to be remarkably stable, it takes a different approach to holding data on the drive, using balanced trees to hold both the file data and the file names, and it also supports journaling. These features, combined with the fact that it was designed from the ground up to be fast and efficient on large drives and drive arrays, make it a top-notch file system for use on servers. The disadvantages that have been reported come from compatibility problems between releases, which often require you to reformat the entire partition just to use a new version. ReiserFS has been released under the GPL, with some commercial exceptions, and currently runs on the Intel architecture only, but may be ported later.

The open source release of SGI's XFS file system has caused a lot of speculation in the developer community. SGI, in probably one of the most interesting business moves I have ever heard of, decided to support the Linux community by releasing the source code to its file system, working to help improve the Linux kernel, and releasing a lot of its own proprietary code as open source. The XFS file system is a full 64-bit journaling file system that has been used on SGI's IRIX platform for many years, and has been tested and debugged quite thoroughly on IRIX. While still in beta release, the Linux port of this file system has been making progress rapidly.

The third file system is the ext3 file system, which is designed to be an updated version of the ext2 file system used on almost all Linux boxes today. Still under development, this file system adds journaling information to the metablock of the ext2 file system, and aims for both backwards and forwards compatibility. It is probably best described as an evolution of the ext2 file system, is reportedly fairly unstable right now, and is probably the farthest from completion of the three file systems mentioned here.

As these three file systems get closer to completion, they will all offer comparable features and all will provide better performance than the current ext2 file system. As drive sizes and transfer rates increase, the demand for a faster file system will increase. Any of these three should do quite nicely, each offering slightly different features and performance.

Author's note: I do not recommend using any of these file systems unless you can afford to lose the data on your drives and are either a developer or an advanced end user. These are still unstable file systems, and using them could result in data loss across an entire drive and possible damage to a hard drive. In other words, don't try this at home.

Luke Groeninger currently attends school as a student, where he is frequently asked to work for the tech department whenever he is not busy doing schoolwork. Feel free to send questions and comments to dghost@linux.com. All flames will be forwarded to /dev/null, and possibly also laughed at.




