Originally Published: Friday, 24 August 2001 Author: Subhasish Ghosh

Understanding Linux Kernel Inter-process Communication: Pipes, FIFO & IPC (Part 2)

The Linux kernel is a thing of great beauty and learning to understand and appreciate its facets and edges is a worthy and noble pursuit. Take our hand as Linux.com offers this second part of Subhasish Ghosh's look at Inter-Process Communication in the Linux Kernel. Together we will find the grok, sooner or later.

   Page 1 of 5  >>

This is part two of Understanding Linux Kernel Inter-process Communication; the first part was published yesterday. You'll probably want to read the first part, well, first.

This article will cover:

  1. System V (AT&T System V.2 release of UNIX) IPC Resources: Semaphores, Message Queues & Shared Memory segments (as implemented in GNU/Linux).
  2. A few code examples to chew on (for the brave-hearted!).

Please Note:

  1. For an explanation of terms such as "kernel control paths", "semaphores", "race conditions" and related features, please refer to earlier articles in the series.
  2. Although this article explores the depths of the Linux Kernel, no discussion of IPC would be complete without the AT&T System V UNIX IPC features and facilities. Thus, several System V UNIX features will be discussed too.
  3. I used Red Hat Linux 7.1, Linux Kernel 2.4.2-2, to compile all the code included.

In earlier articles we have already encountered some exciting features of the Linux Kernel. This article explains how User Mode processes can synchronize themselves and exchange data. We have already covered a lot of synchronization topics, especially in "Linux Kernel Synchronization", but, as readers must have noticed, the main protagonist of that story was a "Kernel Control Path" acting within the Linux Kernel and NOT a User Mode program. We are now ready to discuss the synchronization of User Mode processes, which rely on the Linux Kernel to synchronize themselves and exchange data.

System V IPC Facilities

In this section, we are going to look at a set of inter-process communication facilities that were introduced in the AT&T System V.2 Release of UNIX. Since all these facilities appeared in the same release and have a similar programmatic interface, they are often referred to as System V IPC. As mentioned in part 1 of this article, IPC data structures are created dynamically when a process requests an IPC resource, that is, a semaphore, a message queue or a shared memory segment. Each IPC resource is persistent, i.e. unless explicitly released by a process, it is kept in memory. An IPC resource may be used by any process, including processes that do not share a common ancestor with the process that created the resource.

Now the question that comes up is: A particular process may require several IPC resources of the same type, so how on earth is someone supposed to identify each one of these resources? The answer is simple: Each new resource is identified by a 32-bit IPC Key, which is similar to a file pathname in the system's directory tree. In addition to the IPC Key, each newly allocated IPC resource also has a 32-bit IPC Identifier, which is somewhat similar to the file descriptor associated with an open file. One very important point to note is that IPC Identifiers are assigned to IPC resources by the Kernel and are unique within the system, whereas IPC Keys can be freely chosen by application programmers. But what does this "IPC Identifier" do? When two or more processes wish to communicate through an IPC resource, they all refer to the IPC Identifier of the resource. OK, it gets a little tricky from here on in.
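To make the Key/Identifier distinction concrete, here is a minimal sketch using a System V message queue. The helper name same_id_for_same_key() and whatever key value you pass to it are my own choices for illustration and are not taken from the kernel sources; the sketch simply assumes that nobody else on the system is already using that key.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical helper: create a message queue under `key`, look it up
 * again with the same key, then remove it.
 * Returns 1 if both calls produced the same IPC Identifier,
 *         0 if they differed,
 *        -1 if a System V call failed (e.g. the key is already taken).
 */
int same_id_for_same_key(key_t key)
{
    /* IPC_CREAT | IPC_EXCL: succeed only if we create a brand new resource. */
    int first = msgget(key, IPC_CREAT | IPC_EXCL | 0600);
    if (first == -1)
        return -1;
    assert(first >= 0);          /* identifiers are never negative */

    /* Without IPC_CREAT, msgget() only looks up an existing resource. */
    int second = msgget(key, 0);

    /* IPC resources are persistent, so release the queue explicitly. */
    if (msgctl(first, IPC_RMID, NULL) == -1 || second == -1)
        return -1;

    return first == second;
}
```

The programmer picks the key, the Kernel picks the identifier, and every process that presents the same key is handed the same identifier for as long as the resource exists.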

When I was studying the Linux kernel architecture and its associated features from a number of books, professors' notes, library manuals, online magazines, HowTos and other such official and/or unofficial sources, I always wanted to find the answer to one simple question, which unfortunately no one could answer. Readers must have noted that in the paragraph just above this one, I mentioned "...IPC Identifiers are assigned to IPC resources by the Kernel and are unique within the system...". My question was: How on earth does the Linux Kernel compute an IPC Identifier, and how come every one it produces HAS to be unique? I did manage to find the answer. In order to minimize the risk of referencing the wrong resource, the Linux Kernel does NOT recycle IPC identifiers as soon as they become free. Instead, the IPC identifier assigned to a resource is almost always larger than the identifier assigned to the previously allocated resource of the same type. Each IPC identifier is computed by combining a "slot usage sequence number" relative to the resource type, an arbitrary "slot index" for the allocated resource, and the value chosen in the Linux Kernel for the maximum number of allocatable resources. If s represents the "slot usage sequence number", M represents the maximum number of allocatable resources, and i represents the arbitrary "slot index", where i is greater than or equal to zero but less than M, then each IPC resource's ID is computed by the formula:

IPC Identifier = s * M + i

The "slot usage sequence number" s is initialized to 0 and is incremented by 1 at every resource deallocation. Across two consecutive resource allocations, the slot index i can only increase; it can decrease only when a resource has been deallocated in between, but then the incremented "slot usage sequence number" ensures that the IPC identifier of the newly allocated resource is still larger than the previous one. This is how the Kernel ensures that each time an IPC Identifier is produced (allocated to a resource), it's a UNIQUE one. As simple as that! See, I told you, understanding Linux kernel features is so easy!
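For readers who like to see the arithmetic run, the scheme can be sketched as a toy model in C. Please note that this is NOT the kernel's real code: the struct ipc_table, the functions ipc_alloc() and ipc_free(), the "lowest free slot" policy and the value M = 128 are all assumptions made purely for illustration.

```c
#include <assert.h>

#define M 128   /* hypothetical maximum number of allocatable resources */

/* Toy model of the table for one IPC resource type. */
struct ipc_table {
    unsigned int seq;   /* slot usage sequence number s */
    int in_use[M];      /* in_use[i] is 1 while slot i holds a resource */
};

/* Take the lowest free slot i and return s * M + i, or -1 if the table is full. */
long ipc_alloc(struct ipc_table *t)
{
    for (int i = 0; i < M; i++) {
        if (!t->in_use[i]) {
            t->in_use[i] = 1;
            return (long)t->seq * M + i;
        }
    }
    return -1;
}

/* Release the resource with identifier id; s grows by 1 at every deallocation. */
void ipc_free(struct ipc_table *t, long id)
{
    int slot = (int)(id % M);   /* recover the slot index i from the identifier */

    assert(id >= 0 && t->in_use[slot]);
    t->in_use[slot] = 0;
    t->seq++;
}
```

Starting from a zeroed table, two allocations yield the identifiers 0 and 1. If the resource with identifier 0 is then freed, s becomes 1, so the next allocation reuses slot 0 but receives the identifier 1 * 128 + 0 = 128 rather than the stale value 0.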
