Originally Published: Friday, 8 December 2000 Author: Elmo Recio
Published to: enhance_articles_sysadmin/Sysadmin

Working with Temporary Files under Linux (and Unix!)

Cleanliness is next to... well, you know the saying. Elmo Recio cuts to the chase and shows us a few tips and tricks that save time and space. Elmo, how do you make it look so easy?


Many flavors of Linux/Unix offer a host of library functions and system calls for manipulating temporary files. Why on earth would you need special calls for that? Consider that you are on a multi-user, multitasking OS. Hundreds, possibly thousands, of processes are running at any one time. Many of them are forked children, daemons, and/or sleeping. Now imagine that user-a is running a program called foo. This program needs to process large amounts of data one line at a time; as it processes each line, it must save it somewhere for reprocessing. Well, this is simple: just create a file in the temp volume and store the data there. No problem so far, right? But what if user-b wants to run the same program on a dataset of his own? The second copy of foo will attempt to create an identically named file in the temp volume and write to it. Now the problem is obvious: both copies scribble over the same file, and the result is two very unhappy users.

There are several ways to work around this problem. The program can check for the existence of the file and use an alternate filename if it is taken. This method makes the program quite bulky, because it has to check for the existence of the second filename on the temp volume and move on to a third name if that file exists too. Now increase the number of running foo programs to 100. You can quickly see how this creates a race condition, especially if two copies of foo are started at exactly the same moment: both can check for a name, both can find it free, and both can then create it. Wouldn't it be nice if the system could take care of this tedious programming? You wouldn't have to code it each time you write a new program that needs temporary files.
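
To make the "try the next name" workaround concrete, here is a minimal sketch of it. It is not from the original program foo: the helper name create_numbered_temp() and the base.N naming scheme are assumptions made purely for illustration. The sketch relies on open() with O_CREAT | O_EXCL, which fails with EEXIST when the name is already taken, so the existence check and the creation happen as one atomic step instead of two.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: tries base.0, base.1, ... until a file can be
 * created. The chosen name is left in path. Returns an open descriptor,
 * or -1 on failure. */
int create_numbered_temp(const char *base, char *path, size_t len)
{
    int i;

    for (i = 0; i < 1000; i++) {
        int fd;

        snprintf(path, len, "%s.%d", base, i);
        /* O_EXCL: fail instead of reusing a file someone else created */
        fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            return fd;      /* this name is ours */
        if (errno != EEXIST)
            return -1;      /* a real error, give up */
        /* name already taken: loop and try the next number */
    }
    errno = EEXIST;
    return -1;
}
```

A caller would close() the descriptor and unlink() the path when finished. Even in this short form it is boilerplate you would have to carry into every program, which is exactly the complaint above.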

Here comes Linux/Unix to save the day. Sometimes you need a file to store temporary data: data that only needs to stay alive long enough to be used by the process, and can be deleted afterwards. This is where the following functions come in handy: tmpnam(), mkstemp(), tmpfile(), tempnam(), and, as a consequence, unlink().

Creating names for temporary files with tmpnam()

#include <stdio.h>

char *tmpnam(char *s);

The purpose of the above function is to generate pathnames for new temporary files that do not yet exist (on Sun Solaris systems the name will point into /var/tmp by default). It takes as an argument a character buffer of at least L_tmpnam bytes (this constant is defined in stdio.h), which the function fills in. The buffer can be an array or malloc()'d memory. You can, however, pass NULL as the argument, in which case it returns a pointer to an internal static array. Note: if you pass NULL, the next call to tmpnam() overwrites that array; be sure to strcpy() the string out of it before calling tmpnam() again.

Consider the following program (0101-p1.c):

#include <stdio.h> /* duh (also declares tmpnam and perror) */
#include <stdlib.h> /* abort */
#include <string.h> /* string manipulation */
#include <errno.h> /* errno */
int main(int argc, char **argv)
{
    char *temp_pathname; /* pointer that will hold pathname */
    FILE *temp_fileptr = 0; /* file pointer that will be created */
    if ( !(temp_pathname=tmpnam(NULL)) ) {
        perror("error making temporary filename");
        abort();
    }
    fprintf(stdout, "Using temporary pathname %s\n", temp_pathname);
    if ( !(temp_fileptr = fopen(temp_pathname, "w")) ) {
        perror("error creating temporary file");
        abort();
    }
    fclose(temp_fileptr);
    return 0;
}
Stepping through the program we have:
  1. The tmpnam() call returns a pointer to a character string located deep within the tmpnam() function.
  2. If tmpnam() cannot generate a temporary pathname it returns NULL. We then call perror() to report whatever error is recorded in errno.
  3. We print to standard out, the pathname that was created.
  4. Finally we attempt to open a file using the pathname.
If you compile and run the above program, you should get something like this:
$ main

Using temporary pathname /var/tmp/aaaVtaivC

OK, so you have a temporary filename created; but if you were to make another call to tmpnam(NULL), you would notice that the string at temp_pathname changes, even if you had assigned the return value to another pointer. Both pointers refer to the same internal array, which each call overwrites. So the best thing to do is create a buffer of your own and pass it to tmpnam().

Consider the following program (0101-p2.c):

#include <stdio.h>
#include <stdlib.h> /* abort */
#include <string.h>
#include <errno.h>
int main(int argc, char **argv)
{
    char temp_pathname[L_tmpnam]; /* our own buffer for the pathname */
    FILE *temp_fileptr = 0;
    if ( !(tmpnam(temp_pathname)) ) {
        perror("error making temporary filename");
        abort();
    }
    fprintf(stdout, "Using temporary pathname %s\n", temp_pathname);
    if ( !(temp_fileptr = fopen(temp_pathname, "w")) ) {
        perror("error creating temporary file");
        abort();
    }
    fclose(temp_fileptr);
    return 0;
}
The constant L_tmpnam is defined in stdio.h, and its value is system dependent. This way, if later in the program you find that you need another temporary pathname, you can call tmpnam() with a second buffer and not worry about the pathname you already created being overwritten.
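
As a small illustration of the caller-owned buffer approach, the sketch below generates two pathnames into two separate buffers. The helper name make_two_names() is made up for this example. Be aware that many current toolchains print a warning when tmpnam() is used, because a generated name is only a name and another process could create that file before you do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fills two caller-owned buffers, each at least
 * L_tmpnam bytes long, with temporary pathnames.
 * Returns 0 if both names were generated and differ, -1 otherwise. */
int make_two_names(char *first, char *second)
{
    if (tmpnam(first) == NULL)      /* name is written into first */
        return -1;
    if (tmpnam(second) == NULL)     /* first is left untouched */
        return -1;
    return strcmp(first, second) != 0 ? 0 : -1;
}
```

A caller would declare char a[L_tmpnam], b[L_tmpnam]; and pass both in. Since each name lives in its own array, the second call cannot clobber the first.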

Creating custom template filenames with mkstemp()

Sometimes you see that some programs create temporary pathnames with the first few characters as an abbreviated version of the program name. One such program that comes to mind is vi. These programs are able to do this with the function called mkstemp().

#include <stdlib.h>

int mkstemp(char *template);

The mkstemp() function replaces part of the character string template with a unique filename. template is a null-terminated C string consisting of a filename followed by six (6) 'X' characters. mkstemp() replaces the six 'X's with its own unique identifier. You can place any pathname you want at the beginning of the string, but the last six characters must be 'X'. Because the string is modified in place, it must live in a writable array, not a string literal.

For example, if you were to copy "myprog-XXXXXX" into a character array and pass that array to mkstemp(), the array would afterwards hold something like "myprog-ZasdDv". The function also creates the file and returns an open file descriptor for it.
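
If the prefix is only known at run time, the template can be assembled in a buffer first. The following is a sketch under that assumption; open_temp_with_prefix() is a name invented here, not a library function.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: builds "<prefix>XXXXXX" in the caller's buffer
 * and hands it to mkstemp(). On success the buffer holds the real
 * filename and an open descriptor is returned; on failure, -1. */
int open_temp_with_prefix(const char *prefix, char *path, size_t len)
{
    int n = snprintf(path, len, "%sXXXXXX", prefix);

    if (n < 0 || (size_t)n >= len)
        return -1;              /* template did not fit in the buffer */
    return mkstemp(path);       /* rewrites the six X's in place */
}
```

Called as open_temp_with_prefix("/tmp/myprog-", buf, sizeof buf), it leaves a name beginning with /tmp/myprog- in buf. The caller still has to close() the descriptor and unlink() the file when done.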

Consider the following program (0102-p1.c):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
int main (int argc, char **argv)
{
    char temp_pathname[64]; /* will hold the temp filename */
    int temp_filedes = -1; /* temporary file descriptor */
    FILE *temp_fileptr = 0; /* pointer to file stream */
    strcpy(temp_pathname, "/var/tmp/pdn-XXXXXX"); /* copy template */
    if ( (temp_filedes=mkstemp(temp_pathname)) < 0 ) {
        perror("generating temp filename");
        abort();
    }
    fprintf(stdout, "Using temporary file: %s\n", temp_pathname);
    if ( !(temp_fileptr = fdopen(temp_filedes, "w")) ) { /* open file stream */
        perror("opening stream on descriptor");
        abort();
    }
    fprintf(temp_fileptr, "created by pid=%ld\n", (long)getpid());
    fclose(temp_fileptr); /* also closes temp_filedes */
    return 0;
}
Stepping through the program:
  1. Allocate 64 bytes to temp_pathname (an arbitrary number that I selected which is enough breathing room for the string)
  2. The template string "/var/tmp/pdn-XXXXXX" is copied into temp_pathname.
  3. We call mkstemp() with temp_pathname as the argument.
  4. The return value (if zero or greater) is a valid open file descriptor. Otherwise it is -1, signalling an error, and we call perror() to see what went wrong.
  5. We then print out what filename we have been assigned.
  6. We use the call fdopen() to open a file stream from a current file descriptor.
  7. Finally we write our process id to the opened stream.
Running the program would reveal the following results:
$ 0102-p1
Using temporary file: /var/tmp/pdn-CaaWHC

Displaying the contents of the file shows that we were, indeed, here.

$ cat /var/tmp/pdn-CaaWHC
created by pid=14606

Creating and opening a file stream with tmpfile()

Sometimes you want the system to do all the file opens for you. All you want is the stream! This is good; it means less coding of the tedious stuff, and more time for real coding. This is where the tmpfile() function comes in handy.

#include <stdio.h>

FILE *tmpfile(void);

The function tmpfile() is actually very easy to use, and comes in handy when you want to generate, create and open a temporary file stream. Consider the following program (0103-p1.c):
#include <stdio.h> /* tmpfile, fprintf, fgets, perror */
#include <stdlib.h> /* abort */
#include <unistd.h> /* getpid */
int main(int argc, char **argv)
{
    FILE *temp_filestream;
    char buffer[1024];
    if ( !(temp_filestream=tmpfile()) ) {
        perror("generating temporary stream");
        abort();
    }
    fprintf(temp_filestream, "Created by PID=%ld\n", (long)getpid());
    fflush(temp_filestream);
    rewind(temp_filestream);
    fgets(buffer, sizeof buffer, temp_filestream);
    fprintf(stdout, "rewound, and read: %s\n", buffer);
    return 0;
}
Stepping through the program:
  1. We create a buffer to hold some data; 1024 should be enough.
  2. We call the tmpfile() function.
  3. tmpfile() returns a pointer to a file stream which is opened and ready to roll. If NULL is returned, then an error occurred and we use perror() to find out what happened.
  4. We then write some text to the file stream, then flush the buffer (to make sure every character gets written to the file).
  5. We rewind the stream to the beginning.
  6. Finally we read in a line from the stream starting at the beginning and print it out to the screen.
Running the program would reveal the following results:
$ 0103-p1

rewound, and read: Created by PID=14658

Whoa! Where did the filename go? That's the nice part about this function. There's no cleanup necessary. We don't have to worry about deleting files when we are done with them. The system never gives us a filename for the temporary file. The only way to reach this file is through the stream inside the program itself. When the program exits, the kernel cleans up the space occupied by the file and reclaims it as free space.
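
This is a natural fit for the "save each line for reprocessing" job described at the start of the article. The sketch below is one way that could look; spool_lines() is a hypothetical helper written for this example, and it assumes every input line fits in the 1024-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: copies every line from in to an anonymous
 * scratch file, then re-reads the scratch copy and returns the number
 * of lines found, or -1 if no scratch file could be created.
 * Assumes each line is shorter than the buffer. */
long spool_lines(FILE *in)
{
    char line[1024];
    long count = 0;
    FILE *scratch = tmpfile();

    if (scratch == NULL)
        return -1;
    while (fgets(line, sizeof line, in) != NULL)
        fputs(line, scratch);       /* first pass: save the data */
    rewind(scratch);                /* back to the start for pass two */
    while (fgets(line, sizeof line, scratch) != NULL)
        count++;                    /* second pass: reprocess it */
    fclose(scratch);                /* the file vanishes here */
    return count;
}
```

Calling spool_lines(stdin) would buffer standard input and report how many lines went through. There is no filename to pick and nothing to delete afterwards.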

But if the user wants to set his own temp directory, use tempnam()

Let's say the user wants to keep the data in his own temporary directory. Maybe the data is secret (or .jpgs, bad user!); for that we have the blessing of the tempnam() function. It's tricky to get the hang of, but it works.

#include <stdio.h>

char *tempnam(const char *dir, const char *pfx);

This function allows you to control the directory that the temporary pathname points into. The argument dir names the directory in which the file should be placed. The argument pfx is a null-terminated character string of up to 5 characters that will be prepended to the generated filename.

If you want, you can leave pfx null. It'll just create a normal unique filename. If you leave dir null, it'll create some tricky results you should look out for; the best way to find out how your arguments will be handled on the Linux/Unix variant system is to refer to the manual pages for tempnam().

SunOS and Solaris systems behave a little differently. There, the following occurs:

  1. If the user has the environment variable TMPDIR exported to the program, then this will always be used, regardless of the dir argument.
  2. If the environment variable TMPDIR is not exported to the process, and dir points to an appropriate name, then the file will be created in the directory dir. If dir is NULL or an inappropriate name for a directory (e.g., it doesn't exist or is not writable), then the directory named by the C macro P_tmpdir in stdio.h is tried.
  3. If all else fails, it attempts to allocate the file in the temporary volume.
In any case, what is returned is an internally malloc()'d buffer holding the pathname created according to the conditions above. So make sure you call free() on the pathname when you are done with it.

Consider the following program (0104-p1.c):

#include <stdio.h>
#include <stdlib.h> /* abort, free */
#include <unistd.h>
#include <string.h>
#include <errno.h>
int main (int argc, char **argv)
{
    char *temp_pathname = 0;
    FILE *temp_filestream = 0;
    if ( !(temp_pathname=tempnam("./","pdn4-")) ) {
        perror("error creating pathname");
        abort(); /* no pathname, nothing more to do */
    }
    fprintf(stdout, "Using temp pathname as %s\n", temp_pathname);
    if (!(temp_filestream=fopen(temp_pathname, "w")) ) {
        perror("error opening file");
        abort(); /* do not write to a NULL stream */
    }
    fprintf(temp_filestream, "Created by PID=%ld\n", (long)getpid());
    fclose(temp_filestream);
    free(temp_pathname); /* tempnam() malloc()'d the pathname */
    return 0;
}
Stepping through the program:
  1. We call the tempnam() function passing along "./" (the current directory) as the dir argument and "pdn4-" as the pfx.
  2. If we encounter an error, we call perror() to see what went wrong.
  3. We then print out the pathname created to standard output.
  4. Open a file using temp_pathname as the name of the file to open and catch the file stream pointer returned.
  5. We then print the process id into the file stream opened.
  6. Finally we close the file.
Running the program would reveal the following results:
$ echo $TMPDIR

/tmp

$ ./0104-p1

Using temp pathname as /tmp/pdn4-AAAyCaiTC

$ unset TMPDIR

$ ./0104-p1

Using temp pathname as ./pdn4-AAAhgaqTC

Custom filenames without the headaches: use unlink()

You were impressed by the custom filename functions, but you don't want the headache of remembering to delete the files when you are done with them. Handling this is as easy as pie: soon after creating the pathname, open the file and call unlink() on the filename.

#include <unistd.h>

int unlink(const char *path);

The function unlink() takes a null-terminated C string naming an existing file. It returns 0 on success; otherwise -1 is returned and errno is set, so that we can report the error with perror().

Consider the following program (0105-p1.c):

#include <stdio.h> /* tmpnam, fopen, fprintf, fgets, perror */
#include <stdlib.h> /* abort */
#include <unistd.h> /* unlink, getpid */
int main(int argc, char **argv)
{
    char *temp_pathname;
    FILE *temp_fstrm = 0;
    char buffer[1024];
    if ( !(temp_pathname=tmpnam(NULL)) ) {
        perror("error making temporary filename");
        abort();
    }
    fprintf(stdout, "Using pathname %s\n", temp_pathname);
    if ( !(temp_fstrm = fopen(temp_pathname, "w+")) ) {
        perror("error creating temporary file");
        abort();
    }
    unlink(temp_pathname); /* remove the name, keep the open file */
    fprintf(stdout, "Opened the file and deleted it.\n");
    fprintf(temp_fstrm, "Writing to file deleted by %ld\n",(long)getpid());
    fflush(temp_fstrm);
    rewind(temp_fstrm);
    fgets(buffer, sizeof buffer, temp_fstrm);
    fprintf(stdout, "Read from deleted file: %s\n", buffer);
    fclose(temp_fstrm);
    return 0;
}
Stepping through the program:
  1. We allocate a temporary buffer to store some data (I selected an arbitrary length of 1024 bytes.)
  2. We then call tmpnam() to get a temporary filename.
  3. We print out what file name we are using.
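  4. We then open the file for reading and writing (mode "w+", which creates the file, or truncates it if it already exists).
  5. We call unlink() to remove the opened file from the filesystem.
  6. We write a line to the now nameless file, flush it, and rewind the stream.
  7. Finally we read the line back and print it to standard output.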
Running the program would produce the following results:
$ 0105-p1

Using pathname /var/tmp/aaaGVaWeD

Opened the file and deleted it.

Read from deleted file: Writing to file deleted by 14886

$ ls -l /var/tmp/aaaGVaWeD

/var/tmp/aaaGVaWeD not found

Why does this work? If you have ever played evil sysadmin with your ftp users, you may recall that if someone was downloading a file and you deleted it out from under them (or moved it to another physical volume) and then ran df -k on the volume, the usage would not have changed. It would be as if you never removed the file at all... yet the file isn't there.

This is because the kernel keeps an open file table entry for it. When you call unlink(), the kernel removes the filename, but the file itself isn't physically removed. It becomes a file with no links to it (in effect, nameless). The kernel only reclaims the space after every process using the file has closed it.
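
You can watch the link count change from inside a program. The sketch below is an illustration only: link_counts() and the /tmp/pdn-XXXXXX template are assumptions of this example. It uses fstat() on the open descriptor, which keeps working after the name is gone.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates a temporary file and records its link
 * count before and after unlink(). Returns 0 on success, -1 on error. */
int link_counts(long *before, long *after)
{
    char path[] = "/tmp/pdn-XXXXXX";    /* assumed template location */
    struct stat st;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        unlink(path);
        close(fd);
        return -1;
    }
    *before = (long)st.st_nlink;        /* one name refers to the file */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }
    if (fstat(fd, &st) != 0) {          /* descriptor is still valid */
        close(fd);
        return -1;
    }
    *after = (long)st.st_nlink;         /* no names left */
    close(fd);                          /* last close: space is reclaimed */
    return 0;
}
```

On a typical local filesystem the two values come back as 1 and 0: the file has lost its only name, yet it stays usable until the final close().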


Elmo@northsys.net



