Linux.com Article DB: Understanding Linux Kernel Inter-process Communication: Pipes, FIFO & IPC (Part 1)

An Overview of Pipes & FIFOs

In this section, I would like to discuss in detail two inter-process communication mechanisms: pipes first, and then FIFOs. Readers should note the difference between an "Inter-process Communication Mechanism" and an "Inter-process Communication Resource/Facility", although the line between the two is difficult to draw. Pipes and FIFOs are "Inter-process Communication Mechanisms", while semaphores, message queues and shared memory segments are "Inter-process Communication Resources". The best way to remember the difference is this: Inter-process Communication Mechanisms emphasize how and why data passes between two User Mode processes, while Inter-process Communication Resources serve the same objective in a more polished manner, by exposing the functionality through programming interfaces (most of the time, rather complex ones!). This is why a discussion of pipes and FIFOs is incomplete without a discussion of semaphores, message queues and shared memory segments.

a) Pipes: Let's start with pipes. Pipes are an inter-process communication mechanism provided in all flavors of UNIX. A "pipe" defines a one-way flow of data between processes: all data written to a pipe by one process is routed by the Kernel to another process, which can then read it. In UNIX command shells, pipes are created by means of the '|' operator. For example, consider this shell command:

# cmd1 | cmd2

The shell arranges the standard input and output of the two commands as follows:

The standard input to cmd1 comes from the terminal keyboard.
The standard output from cmd1 is fed to cmd2 as its standard input.
The standard output from cmd2 is connected to the terminal screen.
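The arrangement listed above can be sketched in C. This is a minimal sketch under stated assumptions, not the code of any actual shell: run_pipeline is a hypothetical helper name, and it relies on the standard dup2() and execvp() calls, which this article has not introduced, to attach the pipe to each command's standard streams.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from any real shell): run "cmd1 | cmd2".
 * Returns the exit status of cmd2, or -1 if the pipeline could not be set up.
 */
int run_pipeline(char *const cmd1[], char *const cmd2[])
{
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    pid_t left = fork();
    if (left == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (left == 0) {
        /* cmd1: standard output becomes the write end of the pipe. */
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(fds[0]);
        close(fds[1]);
        execvp(cmd1[0], cmd1);
        _exit(127);                 /* reached only if execvp() failed */
    }

    pid_t right = fork();
    if (right == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(left, NULL, 0);
        return -1;
    }
    if (right == 0) {
        /* cmd2: standard input becomes the read end of the pipe. */
        if (dup2(fds[0], STDIN_FILENO) == -1)
            _exit(126);
        close(fds[0]);
        close(fds[1]);
        execvp(cmd2[0], cmd2);
        _exit(127);
    }

    /* The parent keeps neither end open; otherwise cmd2 would never
       see end-of-file on its standard input. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(left, NULL, 0);
    if (waitpid(right, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the argument vectors for "echo hello" and "grep -q hello", the helper should return 0, which is the exit status grep uses to report a match.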

What the shell does here is reconnect the standard input and output streams so that data flows from the keyboard, through the two commands, and out to the screen. This is how pipes function. Okay, now that we know what exactly a pipe is and have an idea of how it works, the next big question is: how on earth do we create a "pipe" programmatically on a Unix system?

On Unix systems, pipes may be considered open files that have no corresponding image in the mounted filesystems. A new pipe is created by means of the pipe() system call, which returns a pair of file descriptors. A process can read from the pipe by using the read() system call with the first file descriptor, and write into the pipe by using the write() system call with the second file descriptor.

Pipes can be implemented in different ways on different systems. POSIX defines only "half-duplex" pipes. With "half-duplex" pipes, the pipe() system call does return two file descriptors, but each process must close one before using the other. Thus, if a two-way data flow is required, one must use two different pipes by invoking the pipe() system call twice. On other Unix systems, such as System V Release 4 (SVR4) Unix, pipes are implemented in a "full-duplex" manner: both descriptors can be written into and read from at the same time. GNU/Linux implements pipes in yet another manner. On Linux systems (that is, GNU systems with Linux as the core Kernel), a pipe's file descriptors are one-way, but it is NOT necessary to close one of them before using the other.

In this article, we will be dealing with POSIX style "half-duplex" pipes, since Linux uses "half-duplex" pipes, albeit in a special way. Okay, enough said about "pipes"! Let's get going and see how we can create pipes programmatically on Unix systems with a POSIX style implementation. The pipe function has the prototype:

#include <unistd.h>
int pipe(int file_descriptor[2]);

pipe is passed (a pointer to) an array of two integers. It fills the array with two new file descriptors and returns zero. On failure, the pipe system call returns -1 and sets errno. The errors defined in the Linux man pages are:

EMFILE : Too many file descriptors are in use by the process.
ENFILE : The system file table is full.
EFAULT : The array passed to pipe is not a valid address.
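As a minimal sketch of how a caller might check for these errors, the fragment below wraps pipe() and reports the reason for a failure. The names pipe_error_text and open_pipe_checked are hypothetical helpers introduced only for illustration, and the message strings are paraphrases of the list above rather than text produced by the system.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: map the errno values listed above to a short explanation. */
const char *pipe_error_text(int err)
{
    switch (err) {
    case EMFILE: return "too many file descriptors in use by the process";
    case ENFILE: return "system file table is full";
    case EFAULT: return "array argument is not a valid address";
    default:     return strerror(err);  /* anything else: the C library's own text */
    }
}

/* Hypothetical wrapper: returns 0 on success; on failure prints the
   reason to stderr and returns -1 with errno left as pipe() set it. */
int open_pipe_checked(int fds[2])
{
    if (pipe(fds) == -1) {
        int saved = errno;              /* fprintf() may overwrite errno */
        fprintf(stderr, "pipe: %s\n", pipe_error_text(saved));
        errno = saved;
        return -1;
    }
    return 0;                           /* fds[0]: read end, fds[1]: write end */
}
```

Saving errno before calling fprintf() is a deliberate choice here: any later library call is allowed to change it, so the value is copied first.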

The point to note here is that the two file descriptors returned, though distinct, are connected in a special way: any data written to file_descriptor[1] can be read back from file_descriptor[0]. The data is processed on a first in, first out basis, usually referred to as FIFO. This means that if you write the bytes 2, 3, 4 to file_descriptor[1], reading from file_descriptor[0] will produce exactly 2, 3, 4. Readers must note that this is entirely different from the operation of a stack, which works on a last in, first out (LIFO) basis.

The real advantage of pipes comes when one wishes to pass data between two processes. The program given below creates a pipe using the pipe system call and then uses the fork call to create a new process. If the fork call is successful, the parent writes data into the pipe, while the child reads data from the pipe. Both parent and child exit after a single write and read. Readers must note that if the parent exits before the child, they might see the shell prompt between the two outputs. The source code for our program prog1 is given below:

/* Pipes across a fork:
   By: Subhasish Ghosh
   Date: 15th August 2001
   Place: Calcutta, WB, India
   E-mail: subhasish_ghosh@linuxmail.org */

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{
    int data_processed;
    int file_pipes[2];
    const char some_data[] = "123";
    char buffer[BUFSIZ + 1];
    pid_t fork_result;

    /* Zero the buffer so that whatever is read is null-terminated. */
    memset(buffer, '\0', sizeof(buffer));

    if (pipe(file_pipes) == 0) {
        fork_result = fork();
        if (fork_result == -1) {
            fprintf(stderr, "Fork Failure");
            exit(EXIT_FAILURE);
        }

        if (fork_result == 0) {
            /* Child: read from the read end of the pipe. */
            data_processed = read(file_pipes[0], buffer, BUFSIZ);
            printf("Read %d bytes: %s\n", data_processed, buffer);
            exit(EXIT_SUCCESS);
        } else {
            /* Parent: write to the write end of the pipe. */
            data_processed = write(file_pipes[1], some_data, strlen(some_data));
            printf("Wrote %d bytes\n", data_processed);
        }
    }
    exit(EXIT_SUCCESS);
}

After typing it in, save the file and compile it using:

# cc -o prog1 prog1.c

and then execute it using:

# ./prog1

The output is as given below:

Wrote 3 bytes
Read 3 bytes: 123
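prog1 moves data in one direction only. As noted earlier, a two-way data flow needs two pipes, one per direction. The following is a minimal sketch of that idea, not part of the original program: ask_child_to_upper is a hypothetical helper that sends a short message to a child process over one pipe and reads the upper-cased reply over a second pipe. Each process closes the ends it does not use, and the parent calls waitpid() so that it never finishes before the child. The sketch assumes messages small enough to fit in the pipe buffer; larger transfers would need interleaved reads and writes.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: send `msg` to a child over one pipe and collect
 * the upper-cased reply from a second pipe into `reply` (capacity `cap`).
 * Returns the number of reply bytes, or -1 on error.
 */
ssize_t ask_child_to_upper(const char *msg, char *reply, size_t cap)
{
    int to_child[2], to_parent[2];

    if (cap == 0 || pipe(to_child) == -1)
        return -1;
    if (pipe(to_parent) == -1) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]);
        close(to_child[1]);
        close(to_parent[0]);
        close(to_parent[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: reads from to_child, writes to to_parent. */
        close(to_child[1]);
        close(to_parent[0]);
        char c;
        while (read(to_child[0], &c, 1) == 1) {
            c = (char)toupper((unsigned char)c);
            if (write(to_parent[1], &c, 1) != 1)
                _exit(1);
        }
        _exit(0);   /* read() returned 0: the parent closed its write end */
    }

    /* Parent: the mirror image of the child. */
    close(to_child[0]);
    close(to_parent[1]);

    size_t len = strlen(msg);
    ssize_t sent = write(to_child[1], msg, len);
    close(to_child[1]);             /* signals end-of-file to the child */

    size_t got = 0;
    ssize_t n;
    while (got < cap - 1 &&
           (n = read(to_parent[0], reply + got, cap - 1 - got)) > 0)
        got += (size_t)n;
    reply[got] = '\0';
    close(to_parent[0]);

    waitpid(pid, NULL, 0);          /* reap the child before returning */
    return sent == (ssize_t)len ? (ssize_t)got : -1;
}
```

Calling ask_child_to_upper("pipes", reply, sizeof reply) with a 32-byte reply buffer should return 5 and leave "PIPES" in the buffer.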

Originally Published: Thursday, 23 August 2001	Author: Subhasish Ghosh
Published to: develop_articles/Development Articles	Page: 2/5 - [Printable]
Understanding Linux Kernel Inter-process Communication: Pipes, FIFO & IPC (Part 1) In this article, part one of a two-part article, the prolific and talented Subhasish returns to give Linux.com readers another trip into understanding Linux kernel behavior and programming. There's a lot of information covered here for free, so hang up your hat and have fun. Part 2 of Understanding Linux Kernel Inter-process Communication will be published tomorrow.
