Originally Published: Monday, 3 September 2001
Author: The Linux.com Staff

Benchmarking the Linux Kernel: An Interview with Professor Randy Appleton

If last week's LinuxWorld show told us anything, it was that the Linux kernel is here to stay. As any system matures and its code grows larger and more complex, reliable benchmarking and testing become more and more important, for a variety of reasons. Professor Randy Appleton and his team at Northern Michigan University have conducted the most in-depth Linux kernel benchmarking study to date. We asked him about the project and his results.


Linux.com: Can you tell us a little about yourself? How did you first get interested in Linux?

Randy Appleton: Ten years ago I was a graduate student at the University of Kentucky doing operating systems research by modifying SunOS (back before it became Solaris). Linux was at something like version 0.15, but we had an old PC and I was curious, so I tried it. I was able to make the same modifications, with the same functionality, using Linux. However, the box just sat there, since we had all these neat SunOS boxes around the lab. As far as I know, that was the first use of Linux at the University of Kentucky.

Now I'm a Professor of Computer Science at Northern Michigan University. I led a team of three students doing operating systems research (sort of a role reversal). I'm also the local Linux evangelist around here, and I'm the guy who brought Linux to our department.

Linux.com: Who else is involved in the benchmarking Linux team?

Randy Appleton: Three students: Carey Stortz <castortz@nmu.edu>, Kurt Payne <kpayne@nmu.edu>, and Joe Schmeltzer <joschmel@nmu.edu>. All neat kids.

Linux.com: Could you tell us a little about how and why you set out to benchmark the Linux kernels?

Randy Appleton: My goal was to provide an educational experience for my students. Also, I was curious how fast Linux really was. I always tell my students things like "a file open takes much longer than a file read," but I didn't have any numbers to back this up. However, finding out the numbers is a project of the right difficulty: not too hard and not too easy.

Interestingly, the obvious web searches turn up few previous benchmarking results. There seemed to be a real need for data, and that very much surprised me.

Finally, projects involving Linux are popular around here. The name "Linux" gets students more fired up than the name "Windows".

Linux.com: You indicate in your introduction that applications and the operating system must work together as complexity increases to maintain performance. In general terms, do you think Linux has done a good job of this so far?

Randy Appleton: I dunno. Probably not.

There are lots of things that an application can do to make things easy for an operating system. Here are two examples:

  1. If two applications make use of two different libraries to generate the same functionality, the OS must load and manage two sets of libraries when one would do. Right now I'm using both Gnome and KDE apps, so my RAM is filled with both Gnome and KDE libraries, even though Gnome and KDE are pretty much functionally identical. That's a waste.
  2. When an application wants to allocate RAM, the OS will try to merge that allocation with previous allocations. However, the way the standard C library works, the merging can be hard to do. Changing either the C library or the kernel would make it better, and the changes are known and relatively easy (see the sketch below). Linux Weekly News has a nice writeup about this.
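
Here is a minimal sketch of the merging described in point 2 (our own illustration, not code from the study): on Linux, adjacent anonymous mappings with identical protections typically collapse into a single region in /proc/self/maps, while an adjacent mapping with different protections stays separate.

    /* Reserve one large region up front, then re-map pieces of it with
     * MAP_FIXED so the placement is under our control. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 64 * 1024;
        char *base = mmap(NULL, 3 * len, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* Two adjacent pieces with identical protections: the kernel
         * can merge these into a single region. */
        mmap(base, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        mmap(base + len, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

        /* An adjacent piece with different protections cannot be merged
         * and shows up as its own region. */
        mmap(base + 2 * len, len, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

        /* Dump the memory map; expect one merged read-write region
         * followed by one separate read-only region. */
        char cmd[64];
        snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", (int)getpid());
        return system(cmd);
    }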

Linux.com: Did your study see any trends for the future of Linux?

Randy Appleton: Yes:

  1. More concern for the installed base, which means fewer innovations that require changes to the installed base (e.g., reformatting the hard drive to install a new file system).
  2. More device drivers (yea!)
  3. I predict that one of Gnome or KDE will fall away (I'm guessing Gnome lives, but that's just a guess).
  4. Linux is already very widespread in Computer Science departments America-wide. I predict that will continue.
  5. Despite Dell's announcement, I believe that Linux will come pre-installed on even more computers.
  6. I worry about the influence of .NET. I believe that Linux will never be *very* compatible with .NET, so I hope that .NET does not become too important. If it does, then I believe Linux will be hurt in the commercial world.

Linux.com: Do you anticipate a time when you would not recommend to somebody that they upgrade to the latest kernel?

Randy Appleton: Not really. We feel fine installing any kernel more than a week old that hasn't gotten bad press, and we depend on our servers a lot. We try to wait a week, just to let others catch any obvious bugs.

We run the development kernels, not the production ones. We never lose data, and never crash. Life is good.

Linux.com: Did your study draw any conclusions based on applications (desktop vs. server, etc.)? Would it be possible to say, for example, that Linux is a good performer as a server OS but other operating systems might be better for other uses?

Randy Appleton: We drew no conclusions, since we didn't test Windows.

It's a more complex question than one might realize at first. Here's an example:

Consider the fork() system call. We benchmarked it and found that, as everyone thought, it takes a long time and is expensive. That's bad, but probably unavoidable. The Apache people already knew this, and therefore go to the trouble and expense of avoiding forking when possible. They changed their algorithms and wrote code to avoid fork, when that programmer time could have been used for other interesting things. Of course, they cannot avoid fork entirely. Further, no one knows how much cost they pay by avoiding it where possible. So it's hard to know the true cost of fork for something as common as Apache.
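
A rough sketch of how one might measure that cost (our own illustration; the study used LMbench's harness rather than this code): time a tight loop of fork(), immediate child exit, and waitpid(), then divide by the iteration count.

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define ITERATIONS 1000

    int main(void)
    {
        struct timeval start, end;

        gettimeofday(&start, NULL);
        for (int i = 0; i < ITERATIONS; i++) {
            pid_t pid = fork();
            if (pid == 0)
                _exit(0);               /* child exits immediately */
            else if (pid > 0)
                waitpid(pid, NULL, 0);  /* parent reaps the child */
            else {
                perror("fork");
                return 1;
            }
        }
        gettimeofday(&end, NULL);

        double usec = (end.tv_sec - start.tv_sec) * 1e6
                    + (end.tv_usec - start.tv_usec);
        printf("average fork + exit + wait: %.1f microseconds\n",
               usec / ITERATIONS);
        return 0;
    }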

Now what about Windows? Windows doesn't have fork(), and it doesn't even have anything quite like fork(). So how do we compare the fork() time we measured on Linux to anything in Windows? We cannot.

Further, the difference between Linux and Windows has forced Apache and IIS to use different algorithms, so even they're hard to compare.

In general, when it is possible to compare some function that both have, Linux does well. It does not always win, but it wins often. Reading a file is an example of functionality that both operating systems have.

In real life, people don't pick operating systems because of performance. That's very important to realize. People pick operating systems because of familiarity and comfort. The second criterion is functionality. Performance is way down on the list. Sad... but true.

We geeks however, can measure and care about performance for the same reason car nuts use turbochargers in a world with speed limits... because it's cool.

Linux.com: Although cross-architecture comparison is hard, a relative comparison (i.e., the ratio of performance improvement of a given subsystem across architectures) would perhaps be a suitable benchmark, and a very interesting one. On a related note, are any of the benchmarks used architecture specific?

Randy Appleton: Yep. Almost all benchmarks are operating system specific. Almost none are architecture specific.

Linux.com: Do you think there is a current distribution that best meets the needs of individual workstation users in terms of performance?

Randy Appleton: I believe that after installation the main distributions are all pretty much the same. However, here at Northern Michigan University we have one answer...

REDHAT!!!!!!

Around here everyone uses RedHat because when I first showed up I told them to. Now that we have lots of Linux geeks, we all still use RedHat.

The best distribution to use is the one your friends use. Then, when something goes wrong, you have someone to ask. The ability to find friends who use the same distribution that you use is IMHO the most important criterion [when choosing a Linux distribution].

As a secondary effect, most programs you download from the net will have been tested against RedHat. That's a nice feeling.

I've got nothing against Debian (or the others). It's just nice to run what the guy next door runs.

Linux.com: You have some startling results with regards to signal handling. Could you explain a little what that is and what your study concluded?

Randy Appleton: Signal handling works like this:

  1. An application tells the OS it should be notified when a 'signal' occurs.
  2. Time passes ...
  3. An application (the same one or a different one) or some other event triggers the 'signal'.
  4. The OS notices, and wakes up the application in #1 with a message that the signal has occurred.

A common example is that most editors will save the current file when the user's modem hangs up. They do this by registering that they would like the 'save-file-and-quit' function of their program to be called when the modem hangs up. If the modem functions correctly, nothing happens; if the modem does hang up, the function is called and the file is saved.
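
A minimal sketch of that pattern (illustrative only; the save-and-quit behavior is just a stand-in for whatever a real editor would do), using sigaction() to register a handler for SIGHUP, the signal delivered when the terminal or modem line hangs up:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_hup = 0;

    /* Step 4 from the list above: the OS calls this when the signal fires. */
    static void on_hangup(int signo)
    {
        (void)signo;
        got_hup = 1;    /* just set a flag; do real work outside the handler */
    }

    int main(void)
    {
        /* Step 1: tell the OS we want to be notified of SIGHUP. */
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_hangup;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGHUP, &sa, NULL);

        /* Steps 2 and 3: time passes until something raises the signal
         * (e.g. "kill -HUP <pid>" from another shell). */
        while (!got_hup)
            pause();

        puts("hangup received; an editor would save the file here");
        return 0;
    }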

The speed of handling signals is really not that important. However, the speed at which the operating system can handle interrupts from devices is important. Interrupt speed is very hard to measure, and signal speed ought to be correlated with interrupt speed. That's why we measured signal speed.

We concluded that signal handling time has increased gradually over the last 3.5 years. This is the only performance measure we have that's gotten worse with time.

A note from Linus suggests that modern CPUs require more code to change from running one process to running another. If this is true, then some increase is unavoidable, and all operating systems must pay this cost. See http://euclid.nmu.edu/~benchmark/linus-note.html
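
A classic way to observe that process-switch cost directly (a sketch of our own, similar in spirit to LMbench's context-switch test) is to bounce one byte between two processes over a pair of pipes; each round trip forces at least two switches:

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define ROUNDS 10000

    int main(void)
    {
        int p2c[2], c2p[2];   /* parent-to-child and child-to-parent pipes */
        char byte = 'x';
        struct timeval start, end;

        if (pipe(p2c) < 0 || pipe(c2p) < 0) {
            perror("pipe");
            return 1;
        }

        pid_t pid = fork();
        if (pid == 0) {       /* child: echo every byte straight back */
            for (int i = 0; i < ROUNDS; i++) {
                read(p2c[0], &byte, 1);
                write(c2p[1], &byte, 1);
            }
            _exit(0);
        }

        gettimeofday(&start, NULL);
        for (int i = 0; i < ROUNDS; i++) {  /* parent: send, then wait */
            write(p2c[1], &byte, 1);
            read(c2p[0], &byte, 1);
        }
        gettimeofday(&end, NULL);
        waitpid(pid, NULL, 0);

        double usec = (end.tv_sec - start.tv_sec) * 1e6
                    + (end.tv_usec - start.tv_usec);
        printf("per round trip: %.2f microseconds (at least two switches)\n",
               usec / ROUNDS);
        return 0;
    }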

Linux.com: Why did you choose LMbench to do your testing? Does it have any limitations you would change?

Randy Appleton: LMbench produced some obviously buggy numbers. That's OK, since we could throw out anything obviously wrong. The numbers that looked correct were correct, and that's what's important. LMbench covers a lot of kernel functionality, which was great. That's why we picked it.

Since the source code was available, we could always look and see exactly what it was doing. The source is easily readable, so everything was nice and clear.

Overall, it worked well for us. I'd use it again.
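
To give a feel for what an LMbench-style microbenchmark looks like, here is a much cruder sketch of our own (not LMbench code): it estimates the latency of a trivial system call by timing a tight loop of getppid() calls.

    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define ITERATIONS 1000000

    int main(void)
    {
        struct timeval start, end;

        gettimeofday(&start, NULL);
        for (int i = 0; i < ITERATIONS; i++)
            getppid();          /* about the cheapest system call there is */
        gettimeofday(&end, NULL);

        double usec = (end.tv_sec - start.tv_sec) * 1e6
                    + (end.tv_usec - start.tv_usec);
        printf("null system call: %.3f microseconds\n", usec / ITERATIONS);
        return 0;
    }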

Linux.com: Have you sent your results to Linus Torvalds? Any other exciting interactions with individuals as a result of your study?

Randy Appleton: Yep. He even read and replied! More than once! We also got mail from Rik van Riel, Larry McVoy, and Andrea Arcangeli (whom I met when presenting these results at a conference).

We also got mail from big.bill@microsoft.com, but that might have been faked.

I got mail from my dean, which I thought was pretty cool.

I have 96 emails from this project, counting only the ones worth saving. Almost all the emails we received were either positive or offered constructive criticism. We received over 240 comments on the Slashdot comment system, most of which were negative and unhelpful. The contrast was really striking.

Editor's note: You can check out the results of this benchmark project for yourself. Complete documentation is available at the project's web site, http://euclid.nmu.edu/~benchmark/ .




