Originally Published: Monday, 10 September 2001 Author: S. A. Hayes
Published to: develop_articles/Development Articles Page: 1/2

An Interview with Matthew O'Keefe of Sistina Software

Somebody forgot to tell Sistina the dot-com boom was over. This start-up walked away from LinuxWorld with two awards and plenty of interest in its technology. Intrigued, we went to find out more. Linux.com met with Matthew O'Keefe, founder of Sistina Software, on the show floor to ask him for a primer on network storage and how Sistina fits into the picture. We also asked him about the GPL and the Aladdin license and got some very interesting answers. Read on, dot-com busters, and see how it's done.

Introduction   Page 1 of 2  >>

Introduction to Sistina Software

From 1990 until May 2000, Matthew O'Keefe taught and performed research in storage systems and parallel simulation software as an associate professor of electrical and computer engineering at the University of Minnesota. Unable to find a storage solution for the complex data his group was gathering, Matt's entrepreneurial vision gave rise to the creation of Sistina Software. Founded in May 2000, Sistina Software creates storage clustering software for Linux, including the Global File System and the Linux Logical Volume Manager.

Committed to the development of storage management software and clustering technology, Matt has been actively involved in driving this industry forward. Matt's accomplishments include the development and demonstration of the industry's first cluster file system running over a Fibre Channel SAN at the 1997 NAB conference, the publication of over 60 technical papers, and his involvement in chairing, organizing and presenting at multiple workshops in the areas of cluster computing and storage management. He is a member of the IEEE Technical Committee on Mass Storage Systems, a Senior Member of the IEEE, and he has served on a National Academy of Sciences panel making recommendations on computational issues and simulations. Matt received his M.S. and Ph.D. degrees in Electrical Engineering from Purdue University in 1986 and 1990, respectively.

Sistina's Global File System (GFS) earned top honors at the LinuxWorld Conference and Expo: IDG World Expo, in conjunction with the UniForum association, awarded Sistina the Open Source Product Excellence Award for Best Network Server Application for its Global File System (GFS). IDG made the announcement at LinuxWorld Conference and Expo in San Francisco on August 29, 2001.

Sistina's GFS version 4.1.1, a widely acclaimed clustered file system for Linux, allows multiple servers on a storage area network (SAN) to have read/write access to a single file system on shared SAN devices. Additionally, GFS features many high availability file system components such as journaling and failover capabilities, while providing affordable cluster configurations for companies that are running Linux applications.

Interview with Matthew O'Keefe

Linux.com: So, how old is Sistina Software?

Matthew O'Keefe: We really started full throttle as a company in May of 2000, so about a year and three months.

Linux.com: And where are you guys based?

Matthew O'Keefe: Minnesota; Minneapolis, Minnesota.

Linux.com: So what's the primary focus, what do you guys do?

Matthew O'Keefe: We're focused on storage. In particular we focus on what we call storage area networking. It's a new kind of IT infrastructure, if you like, so we build components to exploit that.

Linux.com: In what way?

Matthew O'Keefe: Pretty basic idea. So, what you have is multiple servers connected across storage area networks. That's an interface that allows you to connect disk devices directly to the network. So now you have servers on one side and disks on the other; this has been the big thing in storage for the last five years or so.

Linux.com: So the disk devices are, like, big raid disks and these are Linux servers?

Matthew O'Keefe: Exactly. Sistina grew out of some work I did at the University of Minnesota when I was a professor there. Basically we had problems moving large amounts of data around, between computers, between graphics machines. The thing with SAN (Storage Area Networking) is that you can pump the data into tiered storage and then you can suck it back out again at a very high rate. This approach allows efficient sharing of large amounts of data.

From the standpoint of doing super computing -- fielding tens of terabytes -- it's just a great approach.

Linux.com: So SAN is about hardware?

Matthew O'Keefe: Yes, it's actually all about hardware. So, for example, there might be a Fibre Channel switch; if you go to a vendor like Brocade you can buy one. Fibre Channel has been the dominant SAN interface for the last five years. What's happening today is that people are starting to build stuff that lets you do storage over IP, over a standard Ethernet network, so a variety of start-ups are lining up in that space. SANs were usually Fibre Channel, but now they are moving into IP.

Linux.com: So what's different about Sistina Software's offering?

Matthew O'Keefe: What's different is that we provide a cluster file system, so it's a file system that runs on all these servers and allows you to map the storage with all of the servers synchronizing their access to the shared data store.
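The core idea of a cluster file system, as O'Keefe describes it, is that every server must coordinate with the others before touching the shared data store. A toy sketch of that coordination, in Python (this is purely illustrative and not Sistina's actual implementation; the `ToyLockManager` name and per-block locking scheme are invented for this example):

```python
# Toy model of cluster-file-system coordination: each "server" must
# acquire a cluster-wide lock on a storage block before writing it,
# so concurrent writers never corrupt the shared data store.
import threading

class ToyLockManager:
    """Grants exclusive per-block locks, standing in for a SAN lock service."""
    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}  # block id -> threading.Lock

    def lock_for(self, block):
        with self._guard:
            return self._locks.setdefault(block, threading.Lock())

shared_storage = {}  # stands in for blocks on the shared SAN device
manager = ToyLockManager()

def server_write(server, block, data):
    # Each "server" synchronizes through the lock manager before touching
    # the shared block, just as cluster nodes coordinate access to a SAN.
    with manager.lock_for(block):
        shared_storage[block] = (server, data)

threads = [threading.Thread(target=server_write, args=(f"server{i}", "blk0", i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_storage["blk0"])  # one of the four writes wins, uncorrupted
```

In a real SAN cluster the lock traffic travels over the network between nodes, but the shape of the problem is the same: mapping one file system onto shared blocks while keeping every server's access synchronized.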

Linux.com: And, of course, that can be done over the network? Can you tell us a little more detail about the current state of network storage?

Matthew O'Keefe: That's correct, it can be done over a network.

Once you have that architecture what you have is a storage cluster.

The advantage to that is, well, think about how we build architectures today. You have the standard IT architecture with your server and the desktop. And maybe you need some more horsepower for a particular kind of database or mini-server or disk, so you just keep going and propagating these kinds of servers and disks.

To solve this, what happened is EMC (a big storage company) blew IBM out of the data center by building this really amazing disk array that mimicked the IBM disk arrays, with very fast I/O. Then what they did is say, "Let's not just run this on IBM; let's go to an array that lets you plug in multiple different operating systems." So now we can have AIX and maybe Windows NT or Solaris, and we can plug them into this one box, this big Symmetrix box, and we're going to separate that box into independent volumes, so this server is attached to this volume, while this one here is attached to this other volume in the same box.

So now you've taken all the storage in the data center and put it all under one physical umbrella: that's physical consolidation. EMC developed physical consolidation and it became extremely popular, and it's a really nice approach. The first step in creating a SAN is physically consolidating the storage. It allows you to manage the storage in one place, and it lets you do it more efficiently.
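The consolidation picture O'Keefe paints can be sketched as a simple data structure: one array carved into independent volumes, each attached to a different server. This is only an illustration of the concept; the volume names and attachment scheme are invented, not EMC's actual interface.

```python
# Sketch of physical consolidation: one big disk array carved into
# independent volumes, each visible only to the server it is attached to.
array = {
    "vol0": {"size_gb": 50, "attached_to": "aix-server"},
    "vol1": {"size_gb": 30, "attached_to": "nt-server"},
    "vol2": {"size_gb": 20, "attached_to": "solaris-server"},
}

def volumes_for(server):
    """Return the volumes a given server may access within the one box."""
    return [name for name, vol in array.items() if vol["attached_to"] == server]

print(volumes_for("nt-server"))  # ['vol1']
```

Each operating system sees only its own slice, yet an administrator manages all the storage in one place: that is the efficiency win of consolidation.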

Linux.com: Can you give us an example of how it is more efficient than the earlier client/server architecture?

Matthew O'Keefe: Sure. Take, for example, a situation where I had separate server-centric storage, perhaps a ten gigabyte drive of which I'm using nine gigabytes, and let's say there is another server with a ten gig drive where I'm only using five gigs. Now if I had a SAN, I could share those two drives: grab the spare gig from the first disk and, since there are five gigs left on the second disk, grab a few more from there, and avoid having to add more hardware storage. It's much more efficient, even for simple examples.

Linux.com: Thanks. Tell us more about where Sistina's technology comes into this.

Matthew O'Keefe: OK, so at this point we're at what I would call the block storage part of a SAN. What Sistina has done is move that forward to say: "Let's map a file system onto these blocks so all these different servers can share that storage."

Linux.com: Is it your own file system?

Matthew O'Keefe: Yes. It's something I started writing back at the University of Minnesota, and then it evolved into Sistina Software, a company I formed with some of my students. We call it GFS.




