The LINUX.COM Article Archive
Originally Published: Friday, 8 December 2000 | Author: Elmo Recio |
Published to: enhance_articles_sysadmin/Sysadmin
Working with Temporary Files under Linux (and Unix!)
Cleanliness is next to... well, you know the saying. Elmo Recio cuts to the chase and shows us a few tips and tricks that save time and space. Elmo, how do you make it look so easy?
Many flavors of Linux/Unix offer a host of library functions for manipulating temporary files. Why on earth would you ever need special calls for temporary files? Consider that you are on a multi-user, multitasking OS. Hundreds, possibly thousands, of processes are running at any one time. Many of these processes are forked children, daemons, and/or sleeping. Now imagine that user-a is running a program called foo. This program needs to process large amounts of data one line at a time; as it processes each line it must save it somewhere for reprocessing. Well, this is simple: just create a file in the temp volume and store the data there. No problem so far, right? What if user-b wants to run the same program on a dataset of his own? The program foo will attempt to create an identically named file in the temp volume and write to it. Now the problem is obvious. The result will be two very unhappy users.
There are several ways to work around this problem. The program can check for the existence of the file and use an alternate filename. This method makes the program quite bulky, because it has to check for the existence of the second filename on the temp volume and increment to the third name if that file exists too. Let's increase the number of running foo programs to 100. You can quickly see how this creates a race condition, especially if two copies of foo are started at the same moment: both can find the name unused and both can then create it. Wouldn't it be nice if the system could take care of this tedious programming? You wouldn't have to code it each time you write a new program that needs temporary files.
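To make the hand-rolled approach concrete, here is a minimal sketch of it in C. It is not from the original article: the helper name create_numbered_temp, the ".N" suffix scheme, and the limit of 1000 attempts are all assumptions made for illustration. It sidesteps the race by letting open() do the check and the create in one step with O_CREAT | O_EXCL, rather than testing for the file first.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>   /* close() and unlink() for callers */

/* Hypothetical helper: try base.0, base.1, ... until one can be created.
 * O_CREAT | O_EXCL makes "check and create" a single atomic step, so two
 * processes can never both succeed on the same name.
 * Returns an open descriptor and stores the chosen name in out,
 * or returns -1 on failure. */
int create_numbered_temp(const char *base, char *out, size_t outlen)
{
    assert(base != NULL && out != NULL);

    for (int i = 0; i < 1000; i++) {        /* 1000 is an arbitrary cap */
        int n = snprintf(out, outlen, "%s.%d", base, i);
        if (n < 0 || (size_t)n >= outlen)
            return -1;                      /* name does not fit */

        int fd = open(out, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            return fd;                      /* we own this file */
        if (errno != EEXIST)
            return -1;                      /* real error, give up */
        /* EEXIST: someone else has this name, try the next one */
    }
    return -1;
}
```

Even this short version shows how much bookkeeping every program would have to carry, which is the motivation for the library functions below.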
Here comes Linux/Unix to save the day. Sometimes you need a file to store temporary data: data that only needs to live long enough to be used by the process and can be deleted afterwards. Here is where the following functions come in handy: tmpnam(), mkstemp(), tmpfile(), tempnam(), and, as a result, unlink().
Creating names for temporary files with tmpnam()
#include <stdio.h>
char *tmpnam(char *s);
The purpose of the above function is to generate path names for new temporary files that do not yet exist (on Sun Solaris systems the file will be placed in /var/tmp by default). It takes as an argument a pointer to a character buffer of at least L_tmpnam bytes (this constant is defined in stdio.h), which the function fills in. The buffer does not have to be malloc()'d; any writable array of that size will do. You can, however, pass NULL as the argument, in which case it returns a pointer to an internal static array. Note: if you pass NULL, that array is overwritten the next time you call tmpnam(); be sure to strcpy() the string at the returned pointer before calling tmpnam() again.
Consider the following program (0101-p1.c):
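The original listing is not shown here. As a stand-in, the following is a minimal sketch of the NULL-argument usage just described. The helper name get_temp_pathname is an assumption; temp_pathname is borrowed from the discussion further down.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Ask tmpnam() for a name using its internal static buffer, then copy the
 * result into caller-owned storage before tmpnam() can overwrite it.
 * dst must hold at least L_tmpnam bytes. Returns 0 on success, -1 if
 * tmpnam() could not generate a name. */
int get_temp_pathname(char *dst)
{
    assert(dst != NULL);

    char *temp_pathname = tmpnam(NULL);   /* points at a static buffer */
    if (temp_pathname == NULL)
        return -1;

    strcpy(dst, temp_pathname);           /* keep our own copy */
    return 0;
}
```

A caller would declare char name[L_tmpnam]; call get_temp_pathname(name), and print the result. Note that tmpnam() only produces a name; it does not create the file.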
Stepping through the program we have:
$ main
Using temporary pathname /var/tmp/aaaVtaivC
Ok, so you have a temporary file name created; but if you were to make another call to tmpnam(NULL), you would notice that the string at temp_pathname had been overwritten, even if you assigned the return value to another pointer, because every such pointer refers to the same internal array. So the best thing to do is create a buffer and pass the buffer to the tmpnam() function.
Consider the following program (0101-p2.c):
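The original listing is not shown here. A minimal sketch of the buffer-passing approach might look like the following; the helper name get_two_temp_pathnames is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Fill two caller-supplied buffers with temporary pathnames. Because each
 * call writes into its own buffer, the second call cannot clobber the first.
 * Both buffers must be at least L_tmpnam bytes. Returns 0 on success. */
int get_two_temp_pathnames(char *first, char *second)
{
    assert(first != NULL && second != NULL);

    if (tmpnam(first) == NULL)
        return -1;
    if (tmpnam(second) == NULL)
        return -1;
    return 0;
}
```

A caller would declare both buffers as char first[L_tmpnam], second[L_tmpnam]; so that they are sized by the constant discussed next.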
The constant L_tmpnam is defined in stdio.h and its size is system dependent. This way, if later in the program you find that you need another temporary pathname, you can call tmpnam() again without worrying that the pathname already created will be overwritten.
Creating custom template filenames with mkstemp()
Sometimes you see that some programs create temporary pathnames with the first few characters as an abbreviated version of the program name. One such program that comes to mind is vi. These programs are able to do this with the function called mkstemp().
#include <stdlib.h>
int mkstemp(char *template);
The mkstemp() function replaces part of the character string template with a unique filename. template is a null terminated C string consisting of a filename plus six (6) 'X' characters. mkstemp() replaces the six 'X's with its own unique identifier. You can place any pathname you want at the beginning of the string, but the last six characters must be 'X'.
For example, if you were to copy "myprog-XXXXXX" into a character array and pass that array to mkstemp(), the array would afterwards hold something like "myprog-ZasdDv". The template is modified in place, so it must be a writable array and not a string literal. The function also creates the file and returns an open file descriptor to it.
Consider the following program (0102-p1.c):
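The original listing is not shown here. As a stand-in, here is a minimal sketch of a mkstemp() call that records the process ID in the new file, in the spirit of the output shown below. The helper name create_pid_file is an assumption.

```c
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a uniquely named file from tmpl, which must be a writable array
 * ending in "XXXXXX". mkstemp() rewrites the X's in place and returns an
 * open descriptor; we then record our PID in the file.
 * Returns the descriptor (caller closes it) or -1 on failure. */
int create_pid_file(char *tmpl)
{
    assert(tmpl != NULL);

    int fd = mkstemp(tmpl);               /* tmpl now holds the real name */
    if (fd < 0) {
        perror("mkstemp");
        return -1;
    }

    char line[64];
    int len = snprintf(line, sizeof line, "created by pid=%ld\n",
                       (long)getpid());
    if (write(fd, line, (size_t)len) != len) {
        perror("write");
        close(fd);
        unlink(tmpl);                     /* do not leave a broken file */
        return -1;
    }
    return fd;
}
```

A caller could start from char name[] = "/var/tmp/pdn-XXXXXX"; and print name after the call to see which file was created.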
Stepping through the program:
$ 0102-p1
Using temporary file: /var/tmp/pdn-CaaWHC
Displaying the contents of the file will show that we were indeed, here.
$ cat /var/tmp/pdn-CaaWHC
created by pid=14606
Creating and opening a file stream with tmpfile()
Sometimes you want the system to do all the file opens for you. All you want is the stream! This is good; it means less coding of the tedious stuff, and more time for real coding. This is where the tmpfile() function comes in handy.
#include <stdio.h>
FILE *tmpfile(void);
The function tmpfile() is actually very easy to use, and comes in handy when you want to generate, create, and open a temporary file stream in one step. Consider the following program (0103-p1.c):
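The original listing is not shown here. A minimal sketch of a write, rewind, and read round trip through tmpfile() might look like this; the helper name tmpfile_roundtrip is an assumption.

```c
#define _XOPEN_SOURCE 700   /* expose getpid() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a line to an anonymous temporary stream, rewind, and read it back
 * into out. Returns 0 on success, -1 on failure. */
int tmpfile_roundtrip(char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    FILE *fp = tmpfile();                 /* opened for update, no usable name */
    if (fp == NULL) {
        perror("tmpfile");
        return -1;
    }

    fprintf(fp, "Created by PID=%ld\n", (long)getpid());
    rewind(fp);                           /* back to the start for reading */

    int ok = fgets(out, (int)outlen, fp) != NULL;
    fclose(fp);                           /* the file disappears here */
    if (!ok)
        return -1;

    out[strcspn(out, "\n")] = '\0';       /* drop the trailing newline */
    return 0;
}
```

A caller would pass a small buffer, for example char buf[64];, and print it after the call.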
$ 0103-p1
rewound, and read: Created by PID=14658
Whoa! Where did the filename go? That's the nice part about this function. There's no cleanup necessary. We don't have to worry about deleting files when we are done with them. The system never gives us a filename for the temporary file that we can use. The only way this file stream is accessible is from within the program itself. When the stream is closed or the program exits, the kernel cleans up the space occupied by the file and reclaims it as free space.
But the user wants to set his own temp directory? Then use tempnam()
Let's say the user wants to keep the data in his own temporary directory. Maybe the data is secretive (or .jpgs; bad user!). Then we have the blessing of the tempnam() function. It's tricky to get the hang of, but it works.
#include <stdio.h>
char *tempnam(const char *dir, const char *pfx);
This function allows you to control the directory that the temp pathname points to. The argument dir names the directory in which the file is to be created. The argument pfx is a null terminated character string of up to 5 characters that will be prepended to the filename that will be created. The returned pathname is allocated with malloc(), so free() it when you are done with it.
If you want, you can leave pfx null. It'll just create a normal unique filename. If you leave dir null, it'll create some tricky results you should look out for; the best way to find out how your arguments will be handled on the Linux/Unix variant system is to refer to the manual pages for tempnam().
SunOS and Solaris systems behave a little differently: when the TMPDIR environment variable is set, it takes precedence over the dir argument, as the run below shows.
Consider the following program (0104-p1.c):
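The original listing is not shown here. As a stand-in, here is a minimal sketch of a tempnam() call that copies the result into a caller buffer and frees the library's allocation. The helper name temp_pathname_in is an assumption; the "pdn4-" prefix in the usage note is taken from the output shown below.

```c
#define _XOPEN_SOURCE 700   /* expose tempnam() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Ask tempnam() for a pathname in dir that starts with pfx, and copy it
 * into out. Either dir or pfx may be NULL. Returns 0 on success, -1 on
 * failure or if the name does not fit in out. */
int temp_pathname_in(const char *dir, const char *pfx, char *out, size_t outlen)
{
    assert(out != NULL);

    char *path = tempnam(dir, pfx);       /* malloc()'d by the library */
    if (path == NULL) {
        perror("tempnam");
        return -1;
    }

    int n = snprintf(out, outlen, "%s", path);
    free(path);                           /* releasing it is our job */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Calling temp_pathname_in("./", "pdn4-", buf, sizeof buf) once with TMPDIR set and once without is one way to see how your system chooses the directory. Like tmpnam(), tempnam() only generates a name and does not create the file.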
Stepping through the program:
$ echo $TMPDIR
/tmp
$ ./0104-p1
Using temp pathname as /tmp/pdn4-AAAyCaiTC
$ unset TMPDIR
$ ./0104-p1
Using temp pathname as ./pdn4-AAAhgaqTC
Custom filenames without the headaches: use unlink()
#include <unistd.h>
int unlink(const char *path);
The function unlink() takes a null terminated C string denoting an existing pathname to a file. It returns 0 on success; otherwise -1 is returned and errno is set so that we can print the error with perror().
Consider the following program (0105-p1.c):
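The original listing is not shown here. A minimal sketch of the open, unlink, then keep using pattern might look like the following. It uses mkstemp() to create the file, which is an assumption on my part; the pathname in the output below suggests the original used tmpnam(). The helper name use_deleted_file is also hypothetical.

```c
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create a named temporary file, unlink it immediately, then keep using it
 * through the open descriptor. tmpl must be writable and end in "XXXXXX".
 * The text read back is stored in out. Returns 0 on success, -1 on error. */
int use_deleted_file(char *tmpl, char *out, size_t outlen)
{
    assert(tmpl != NULL && out != NULL && outlen > 0);

    int fd = mkstemp(tmpl);
    if (fd < 0) {
        perror("mkstemp");
        return -1;
    }
    if (unlink(tmpl) < 0) {               /* the name is gone from here on */
        perror("unlink");
        close(fd);
        return -1;
    }

    char msg[64];
    int len = snprintf(msg, sizeof msg, "Writing to file deleted by %ld",
                       (long)getpid());
    ssize_t n = -1;
    if (write(fd, msg, (size_t)len) == len && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, outlen - 1);    /* the data is still there */

    close(fd);                            /* last reference dropped */
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Because the name is removed right away, nothing is left behind even if the program crashes before it reaches close().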
Stepping through the program:
$ 0105-p1
Using pathname /var/tmp/aaaGVaWeD
Opened the file and deleted it.
Read from deleted file: Writing to file deleted by 14886
$ ls -l /var/tmp/aaaGVaWeD
/var/tmp/aaaGVaWeD not found
Why does this work? If you have ever played evil sysadmin with any ftp users, you may recall that if someone was downloading a file and you deleted it on them, or moved it to another physical volume, and then did a df -k on the volume, the space used would not have changed. It would be as if you had never moved the file at all... yet the file isn't there.
This is because the kernel keeps an open file table entry for it. When you call unlink(), the kernel removes the filename for the file, but physically the file isn't removed. The file becomes a file with no links to it (in effect, nameless). The kernel only reclaims the space from the file after every process using it has closed the file.
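The "no links" state can be observed from inside the program. The following is a minimal sketch, assuming a local filesystem, that uses fstat() to read the link count of an open file before and after unlink(). The helper name link_counts is an assumption, as is the use of mkstemp() to create the file.

```c
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Report the link count of an open file before and after unlink().
 * tmpl must be writable and end in "XXXXXX". Returns 0 on success. */
int link_counts(char *tmpl, long *before, long *after)
{
    assert(tmpl != NULL && before != NULL && after != NULL);

    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;

    struct stat st;
    int rc = -1;
    if (fstat(fd, &st) == 0) {
        *before = (long)st.st_nlink;      /* one name points at the file */
        if (unlink(tmpl) == 0 && fstat(fd, &st) == 0) {
            *after = (long)st.st_nlink;   /* fstat still works on the fd */
            rc = 0;
        }
    }
    if (rc != 0)
        unlink(tmpl);                     /* best effort; harmless if gone */
    close(fd);
    return rc;
}
```

On a typical local filesystem one would expect the count to drop from 1 to 0 while the descriptor remains perfectly usable; network filesystems may report something different.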