Originally Published: Monday, 10 September 2001 Author: S. A. Hayes
Published to: develop_articles/Development Articles

An Interview with Matthew O'Keefe of Sistina Software

Somebody forgot to tell Sistina the dot-com boom was over. This start-up walked away from LinuxWorld with two awards and plenty of interest in its technology. Intrigued, we went to find out more. Linux.com met with Matthew O'Keefe, founder of Sistina Software, on the show floor to ask him for a primer on network storage and how Sistina fits into the picture. At the same time, we asked him about the GPL and the Aladdin license and got some very interesting answers. Read on, dot-com busters, and see how it's done.

Introduction to Sistina Software

From 1990 until May 2000, Matthew O'Keefe taught and performed research in storage systems and parallel simulation software as an associate professor of electrical and computer engineering at the University of Minnesota. Unable to find a storage solution for the complex data his group was gathering, Matt turned entrepreneur and created Sistina Software. Founded in May 2000, Sistina Software creates storage clustering software for Linux, including the Global File System and the Linux Logical Volume Manager.

Committed to the development of storage management software and clustering technology, Matt has been actively involved in driving this industry forward. Matt's accomplishments include the development and demonstration of the industry's first cluster file system running over a Fibre Channel SAN at the 1997 NAB conference, the publication of over 60 technical papers, and his involvement in chairing, organizing and presenting at multiple workshops in the areas of cluster computing and storage management. He is a member of the IEEE Technical Committee on Mass Storage Systems, a Senior Member of IEEE, and he has served on a National Academy of Sciences panel, making recommendations on computational issues and simulations. Matt received his M.S. and Ph.D. degrees in Electrical Engineering from Purdue University in 1986 and 1990, respectively.

Sistina's Global File System (GFS) Earned Top Honors at LinuxWorld Conference and Expo

IDG World Expo, in conjunction with the Uniforum association, awarded Sistina the Open Source Product Excellence Award for Best Network Server Application for Sistina's Global File System (GFS). IDG made the announcement at LinuxWorld Conference and Expo in San Francisco on August 29, 2001.

Sistina's GFS version 4.1.1, a widely acclaimed clustered file system for Linux, allows multiple servers on a storage area network (SAN) to have read/write access to a single file system on shared SAN devices. Additionally, GFS features many high availability file system components such as journaling and failover capabilities, while providing affordable cluster configurations for companies that are running Linux applications.

Interview with Matthew O'Keefe

Linux.com: So, how old is Sistina Software?

Matthew O'Keefe: We really started full throttle as a company in May of 2000, so about a year and three months.

Linux.com: And where are you guys based?

Matthew O'Keefe: Minnesota. Minneapolis, Minnesota.

Linux.com: So what's the primary focus, what do you guys do?

Matthew O'Keefe: We're focused on storage. In particular we focus on what we call storage area networking. It's a new kind of IT infrastructure, if you like, so we build components to exploit that.

Linux.com: In what way?

Matthew O'Keefe: Pretty basic idea. So, what you have is multiple servers connected across storage area networks. That's an interface that allows you to connect disk devices directly to the network. So now you have servers on one side and disks on the other. This has been the big thing in storage for the last five years or so.

Linux.com: So the disk devices are, like, big RAID disks and these are Linux servers?

Matthew O'Keefe: Exactly. Sistina grew out of some work I did at the University of Minnesota when I was a professor there. Basically we had problems moving large amounts of data around, between computers, between graphics machines. The thing with SAN (Storage Area Networking) is that you can pump the data into tiered storage and then you can suck it back out again at a very high rate. This approach allows efficient sharing of large amounts of data.

From the standpoint of doing supercomputing -- fielding tens of terabytes -- it's just a great approach.

Linux.com: So SAN is about hardware?

Matthew O'Keefe: Yes, it's actually all about hardware. So, for example, there might be a Fibre Channel switch; if you go to a vendor like Brocade you can buy a Fibre Channel switch. That's been the dominant SAN interface for the last five years. What's happening today is that people are starting to build stuff that lets you do storage over IP, over a standard Ethernet network. So a variety of start-ups are lining up in that space. SANs were usually Fibre Channel, but now they are moving into IP.

Linux.com: So what's different about Sistina Software's offering?

Matthew O'Keefe: What's different is that we provide a cluster file system, so it's a file system that runs on all these servers and allows you to map the storage with all of the servers synchronizing their access to the shared data store.
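The idea of several servers synchronizing their access to one shared data store can be sketched in a few lines of Python. This is only an illustration under loud assumptions: threads stand in for servers, a single in-process lock stands in for whatever cluster-wide locking a real cluster file system uses, and the names `SharedStore`, `server` and `run_cluster` are hypothetical. It is not a description of how GFS itself is implemented.

```python
import threading


class SharedStore:
    """Hypothetical stand-in for one block on shared SAN storage."""

    def __init__(self):
        self.block = 0
        # Stand-in for a cluster-wide lock; a real cluster needs a lock
        # that works across machines, not just across threads.
        self.lock = threading.Lock()

    def update(self):
        # Read-modify-write must happen under the lock, otherwise two
        # "servers" could read the same value and lose an update.
        with self.lock:
            value = self.block
            self.block = value + 1


def server(store, writes):
    """One 'server' performing a number of writes to the shared store."""
    for _ in range(writes):
        store.update()


def run_cluster(num_servers=4, writes=1000):
    """Run several 'servers' against one store and return the final block value."""
    store = SharedStore()
    threads = [
        threading.Thread(target=server, args=(store, writes))
        for _ in range(num_servers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.block


if __name__ == "__main__":
    # With the lock in place every update is counted exactly once.
    print(run_cluster())
```

Because every read-modify-write happens while holding the lock, the final value always equals the number of servers multiplied by the number of writes each one performs.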

Linux.com: And, of course, that can be done over the network? Can you give us a few more details about the current state of network storage?

Matthew O'Keefe: That's correct, it can be done over a network.

Once you have that architecture what you have is a storage cluster.

The advantage to that is - well, think about how we build architectures today: you have the standard IT architecture with your server and the desktop. And, well, maybe you need some more horsepower for a particular kind of database or mini-server or disk, and so you have to just keep going and propagating these kinds of servers and disks. To solve this, what happened is EMC (a big storage company) blew IBM out of the data center by building this really amazing disk array that mimicked the IBM disk arrays, with very fast I/O, and then what they did is say, "Let's not just run this on IBM, let's go to an array that lets you plug in multiple different operating systems." So now we can have AIX and maybe Windows NT or Solaris and we can plug them into this one box, this big Symmetrix box, and we're going to separate that box into independent volumes so this server is attached to this volume, while this one here is attached to this other volume in the same box.

So now you've taken all the storage in the data center and you've put it all under one physical umbrella: that's physical consolidation. EMC developed physical consolidation and it became extremely popular, and it's a really nice approach: the first step in creating a SAN is physically consolidating the storage. It allows you to manage the storage in one place, and it lets you do it more efficiently.

Linux.com: Can you give us an example of how it is more efficient than earlier client/server architecture?

Matthew O'Keefe: Sure. Take, for example, a situation where I had separate server-centric storage, perhaps a ten gigabyte drive and I'm using nine gigabytes of it, and let's say there is another server with a ten gig drive where I'm only using five gig. Now if I had a SAN, what I could do is share those two drives and grab a gig here and, since there are five gigs left on the second disk, grab a few more from the second disk, and avoid having to add more hardware storage. It's much more efficient even for simple examples.
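The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration: the function names `free_space` and `can_fit` are hypothetical, the 4 GB request is an assumed figure chosen to match "a gig here and a few more from the second disk", and real SAN volume management involves far more than adding up free gigabytes.

```python
def free_space(disks):
    """Return the free gigabytes on each (capacity_gb, used_gb) disk."""
    return [capacity - used for capacity, used in disks]


def can_fit(disks, owner, extra_gb, shared):
    """Can the server that owns disk `owner` store `extra_gb` more data?

    With shared=False the server sees only its own disk (server-centric
    storage); with shared=True it can draw on free space from every disk.
    """
    free = free_space(disks)
    available = sum(free) if shared else free[owner]
    return extra_gb <= available


# The two drives from the example: 9 of 10 GB used, and 5 of 10 GB used.
disks = [(10, 9), (10, 5)]

if __name__ == "__main__":
    print(free_space(disks))                                   # [1, 5]
    print(can_fit(disks, owner=0, extra_gb=4, shared=False))   # False
    print(can_fit(disks, owner=0, extra_gb=4, shared=True))    # True
```

On its own, the first server has 1 GB free and cannot take another 4 GB. With the two drives shared, 6 GB is free in total, so the request fits without buying more hardware.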

Linux.com: Thanks. Tell us more about where Sistina's technology comes into this.

Matthew O'Keefe: OK, so at this point we're at what I would call the block storage part of a SAN. What Sistina has done is move that forward to say: "Let's map a file system onto these blocks so all these different servers can share that storage."

Linux.com: Is it your own file system?

Matthew O'Keefe: Yes. It's something I started writing back at the University of Minnesota and then it evolved into Sistina Software, a company I formed with some of my students. We call it GFS.

The Academy and the Bazaar

Linux.com: How do you like the jump from academia to business?

Matthew O'Keefe: I love it! I really love it. It's a lot of fun. I like academia too, but the bureaucracy and raising money are really painful. Not that it isn't painful in business too. <laughs>

Linux.com: Academics move slowly, too.

Matthew O'Keefe: Exactly, yes. So, I like the academy but I'm also really enjoying business.

Linux.com: What's the open source component to Sistina's technology?

Matthew O'Keefe: Right, so we've always really seen ourselves as open source until the very latest version. The source is still available in the latest version, however. We use the Aladdin public license. So, the source is available but it doesn't meet the absolute strict "Bruce Perens" GPL guidelines. What we do is we differentiate. If you are a commercial entity and you are going to resell GFS in a commercial or embedded product, or put it together with some other commercial software...

Linux.com: But individuals can still download the source and...

Matthew O'Keefe: That's correct. Most users can just use it for free regardless of the commercial side.

Linux.com: Why did you make that decision?

Matthew O'Keefe: GFS is an OEM product, it's not really a direct-for-sale type of product. What we have found is that our OEMs like the Aladdin approach because they are all on a level playing field. If it's GPL the problem is that there is an economic incentive for them to not give you any money. If their competitor gets the advantages of what they put in, then, well that's a problem for the OEM.

But for community based projects, where everybody is kicking in and sharing, then the GPL starts to make more sense, where it is fully distributed.

But for GFS, where there is a single entity, we think the Aladdin license, which is used today for GhostScript, has worked successfully. GhostScript's author wanted it in laser printers, and the folks who could do that will pay for the license, but if you're just using GhostScript and you need to get at the source code, you can. We think it is a reasonable approach because you get the best of both worlds.

Linux.com: Well put. That's definitely an intriguing licensing scheme. What's next for Sistina Software?

Matthew O'Keefe: Well, we just released GFS. We are also developing a Linux logical volume manager along with a lot of the community, so we have the lead developer, the maintainer and several other senior developers working on that. That's pure GPL and more of a community project, so lots of people are sharing resources.

Linux.com: You maintain the CVS tree and stuff?

Matthew O'Keefe: Right, and bug tracking and, well, most of the developers do work for us. But, again, people in the community contribute serious bug fixes, they contribute serious patches and they help us out with documentation.

Linux.com: How do you maintain that balance, I mean, do you just enjoy working with an open source development paradigm?

Matthew O'Keefe: Oh, absolutely.

Linux.com: Does it save you money or time or...?

Matthew O'Keefe: No question about it, in the sense that what you get in return from the community is amazing system level testing. People can download it for free and if they can get access to the source code they are very enthusiastic about putting it into tough environments. That's a good thing! Now, we're a storage company, we're not an "open source company" in the strictest sense of the term, but we see open source as an alternative licensing scheme. Sometimes we use it, sometimes we can't.

Linux.com: That's a common place where a lot of people are going, I think. Use whatever is best for the situation.

Matthew O'Keefe: You think so?

Linux.com: Yeah, I've seen it a lot recently.

Matthew O'Keefe: We were a little concerned we'd get some nasty flaming.

Linux.com: Well, there's always going to be flaming.

Matthew O'Keefe: <laughs>

Linux.com: So, do you have any announcements at the show?

Matthew O'Keefe: Yes, the major announcement is the 4.2 release; there are some big changes there. Some of the new features are shared volume and app support: certain applications and databases need support in that way. Other changes include improvements to the file system checker, improvements to the SAN we can do over Ethernet, and a variety of other features.

By the way LVM is up for an enterprise award and GFS is also up for an award.

So we might be seriously celebrating on Wednesday!

Linux.com: Cool! Well, thank you so much for talking with us here at Linux.com.

Matthew O'Keefe: My pleasure.
