After "Understanding Re-entrant Kernels" and
"Linux Kernel Synchronization", this is the third
article in the "Linux Kernel Series" being
published at Linux.com. Readers are encouraged to read the
first two articles of the series first, because this article
explores in more depth a few kernel features already introduced
there. This article covers the following topics:
- An Overview of Process Communication in Linux.
- An Overview of Pipes, FIFOs and System V IPC.
Part two of this article, to be published tomorrow, will cover:
- System V (AT&T System V.2 release of UNIX) IPC
Resources: Semaphores, Message Queues & Shared
Memory segments (implemented in terms of GNU/Linux).
- A few code examples to chew on (for the
brave-hearted!).
Please Note:
- For an explanation of terms such as "kernel control
paths", "semaphores", "race
conditions" and related features, please refer
to the earlier articles in the series.
- Although this article explores the depths of the Linux
Kernel, no discussion of IPC would be complete without
the AT&T System V release of UNIX IPC features and
facilities. Thus, several System V UNIX features
will be discussed too.
- I used Red Hat Linux 7.1, Linux Kernel
2.4.2-2, for compiling all the code included.
In earlier articles we have already encountered some
exciting features of the Linux Kernel. This article explains how
User Mode processes can synchronize themselves and exchange data.
We have already covered a lot of synchronization topics,
especially in "Linux Kernel Synchronization", but as
readers must have noticed, the main protagonist of the story
there was a "Kernel Control Path" acting within the Linux
Kernel and NOT a User Mode program. Thus, we are now ready to
discuss synchronization of User Mode processes. These processes
rely on the Linux Kernel to synchronize themselves and exchange
data.
An Overview of Process Communication in Linux
First of all, let's understand what IPC actually means.
IPC stands for Inter-process Communication. It denotes a set
of system calls that allows a User Mode process to:
- Synchronize itself with other processes by means of
'Semaphores'.
- Send messages to other processes or receive messages from
them.
- Share a memory area with other processes.
IPC was introduced in a development UNIX variant called
"Columbus Unix" and later adopted by AT&T's System
III. It is now commonly found in most UNIX systems, including
GNU/Linux. System V IPC is more heavyweight than BSD
mmap, and provides three methods of communication: message
queues, semaphores, and shared segments. Like BSD mmap, System V
IPC uses files to identify shared segments. Unlike BSD, System V
uses these files only for naming: their contents have nothing to
do with the initialization of the shared segment. IPC data
structures are created dynamically when a process requests an IPC
Resource, i.e. a semaphore, a message queue, or a shared memory
segment. All of these IPC Resources will be discussed in detail
later on. Before we dive deep into the subject matter, there are
a few things that I would like to explain at the very beginning:
- The mechanism by which User Mode processes synchronize
themselves and exchange data is referred to as
"Inter-process Communication (IPC)" in UNIX
systems, Linux included. How exactly do terms like
Semaphores, Shared Memory and Message Queues relate to
IPC? They are "Inter-process Communication Resources"
(or "Inter-process Communication Facilities"), and they
differ from "Inter-process Communication Mechanisms"
such as Pipes and FIFOs. Semaphores, Shared Memory and
Message Queues are System V (AT&T System V.2 release
of UNIX) IPC facilities, and they are exposed through
wrapper functions that have been developed and inserted
in suitable libraries to harness the underlying IPC
mechanisms. More on this later.
- Data sharing among processes can be achieved by storing
data in temporary files protected by locks. In practice
this approach is avoided because it is costly: it
requires accesses to the disk filesystem. For that
reason, all UNIX Kernels include a set of system calls
that support process communication without interacting
with the filesystem.
Application programmers have a variety of needs that call for
different communication mechanisms. Some of the basic mechanisms
that UNIX systems, and GNU/Linux in particular, have to offer are:
- Pipes and FIFOs: Mainly used for
implementing producer/consumer interactions among
processes. Some processes will fill the pipe with data
while others will extract from it.
- Semaphores: Used for locking critical sections
of code. Here we refer to System V semaphores, which
apply to User Mode processes, and NOT to the POSIX
Realtime Extension semaphores applied to Linux Kernel
Threads.
- Message Queues: Setting up a message
queue between processes is a way to exchange short blocks
of data (called messages) between them in an
asynchronous way.
- Shared Memory: A mechanism (specifically
a resource) applied when processes need to share large
amounts of data in an efficient way.
Another commonly used communication mechanism,
"Sockets", will NOT be discussed here, since it
requires a long discussion of networking. In this article, we
will explore all the above-mentioned IPC mechanisms and the
System V IPC facilities at our disposal.