The write() system call is one of the fundamental building blocks for performing file I/O operations in C. It allows a program to write data from a buffer in memory to a file descriptor. In this comprehensive guide, we will dive deep into how to properly use write() in C programs.
Overview of the write() System Call
Here is the function prototype for write(), declared in <unistd.h>:
ssize_t write(int fd, const void *buf, size_t count);
It takes three arguments:
- fd: The file descriptor to write to. This would be obtained by a previous open() system call.
- buf: Pointer to the buffer containing the data to write.
- count: The maximum number of bytes to write.
The write() function attempts to write up to count bytes from buf to the file referenced by the file descriptor fd.
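On success, the number of bytes written is returned. This can be less than count, for example if the disk fills up, if a signal interrupts the call after some data was written, or if fd refers to a pipe or socket that can only accept part of the data right now.
On error, -1 is returned, and errno is set to indicate the cause.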
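The return-value convention above can be sketched as follows. This is a minimal illustration, not a definitive pattern: write_once() is a hypothetical helper name, and printing to stderr is just one way to report the failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: calls write() once and reports a failure.
 * Returns the byte count from write(), or -1 with errno preserved. */
ssize_t write_once(int fd, const void *buf, size_t count)
{
    ssize_t n = write(fd, buf, count);
    if (n == -1) {
        int saved = errno; /* save errno before calling anything else */
        fprintf(stderr, "write failed: %s\n", strerror(saved));
        errno = saved;     /* restore it for the caller */
    }
    return n;
}
```

Passing an invalid descriptor such as -1 makes write() fail with errno set to EBADF, which is a quick way to exercise the error path.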
Opening a File for Writing
Before we can write to a file with write(), we need to open the file using open(). Here is an example:
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h> // write(), close()
int fd;
fd = open("file.txt", O_WRONLY | O_CREAT, 0644);
if (fd == -1) {
    // open failed; errno holds the cause
}
This opens "file.txt" for writing only (O_WRONLY), creating it if it doesn't exist (O_CREAT), with read/write permissions for the owner and read permissions for group and others (0644).
The open() call returns a file descriptor fd that can be used in subsequent read()/write() calls.
Some key points:
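- O_WRONLY opens the file write-only. Use O_RDWR to open for reading and writing.
- Always check for errors from open() before proceeding.
- An existing file is not truncated by these flags: writes start at offset 0 and overwrite the existing bytes in place. Add O_TRUNC to discard the old contents, or O_APPEND to append instead.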
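To make the effect of the open flags concrete, here is a small sketch. The helper name write_with_flags() is hypothetical; it opens a file with O_WRONLY | O_CREAT plus whatever extra flags are passed, writes once, and reports the resulting file size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: opens path write-only with extra_flags, writes
 * data, closes the file, and returns its final size (or -1 on error). */
off_t write_with_flags(const char *path, int extra_flags,
                       const char *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0644);
    if (fd == -1)
        return -1;

    off_t size = -1;
    struct stat st;
    /* Only report a size if the whole buffer was written. */
    if (write(fd, data, len) == (ssize_t)len && fstat(fd, &st) == 0)
        size = st.st_size;

    close(fd);
    return size;
}
```

Starting from a 10-byte file, writing 2 bytes with no extra flags should leave the size at 10 (bytes overwritten in place), with O_APPEND it should grow to 12, and with O_TRUNC it should shrink to 2.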
With the file open for writing, we can now look at calling write().
Writing Data to a File
The simplest way to use write() is to specify a buffer and length to write:
#include <string.h> // strlen()
const char *msg = "Hello World";
size_t len = strlen(msg); // strlen() returns size_t, not ssize_t
ssize_t bytes_written = write(fd, msg, len);
This will attempt to write the contents of msg ("Hello World") to the file descriptor fd.
The return value stored in bytes_written tells us how many bytes were successfully written.
We should always check for errors after calling write():
if (bytes_written == -1) {
    // error occurred; errno holds the cause
} else if ((size_t)bytes_written != len) {
    // couldn't write entire buffer
}
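- -1 means an error occurred, such as a full disk (ENOSPC) or a descriptor that is not open for writing (EBADF).
- If bytes_written is less than len, only part of the buffer was written. This can happen when the disk fills up, when a signal interrupts the call, or when writing to a pipe or socket. Call write() again for the remaining bytes.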
Otherwise, the write was successful and we wrote the exact number of bytes we intended to.
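Because a successful write() may transfer fewer bytes than requested, callers often wrap it in a loop. The following is a minimal sketch of that idea; write_all() is a hypothetical helper name, and retrying on EINTR is a common convention rather than something the text above prescribes.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until all count bytes
 * have been written. Returns 0 on success, -1 on error (errno is
 * left as set by write()). */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue; /* interrupted before any data was written */
            return -1;
        }
        p += n;              /* skip past the bytes already written */
        count -= (size_t)n;
    }
    return 0;
}
```

A call such as write_all(fd, msg, strlen(msg)) then either writes the whole message or reports an error, so the caller no longer needs a separate partial-write branch.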
Writing Binary Data
For binary data, we simply replace the text buffer with pointers to structures:
struct data {
int values[100];
};
struct data d;
// populate structure
ssize_t bytes_written = write(fd, &d, sizeof d);
This writes the in-memory bytes of d to the file. No formatting is applied, so the file is only portable between programs that share the same struct layout, padding, and byte order.
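As a rough check that the bytes land in the file unchanged, the structure can be written and then read back on the same machine. This sketch assumes fd was opened for both reading and writing (O_RDWR); round_trip() is a hypothetical helper name.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct data {
    int values[100];
};

/* Hypothetical helper: writes *d at the current offset of fd, seeks
 * back, reads it again, and compares. Returns 1 if the bytes match,
 * 0 if they differ, -1 on an I/O error or short transfer. */
int round_trip(int fd, const struct data *d)
{
    struct data copy;

    if (write(fd, d, sizeof *d) != (ssize_t)sizeof *d)
        return -1;
    if (lseek(fd, -(off_t)sizeof *d, SEEK_CUR) == (off_t)-1)
        return -1;
    if (read(fd, &copy, sizeof copy) != (ssize_t)sizeof copy)
        return -1;

    return memcmp(d, &copy, sizeof copy) == 0;
}
```

The comparison only shows that the same program on the same machine gets its bytes back; it says nothing about a reader built with a different compiler or running on a different architecture.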
Appending Data to a File
To append rather than overwrite, open the file with O_APPEND:
fd = open("file.txt", O_WRONLY | O_APPEND);
Any writes now go to the end of the file.
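We can also enable/disable O_APPEND after opening using fcntl(). F_SETFL replaces the whole set of file status flags, so read the current flags with F_GETFL first:
int flags = fcntl(fd, F_GETFL);
// enable append
fcntl(fd, F_SETFL, flags | O_APPEND);
// disable append
fcntl(fd, F_SETFL, flags & ~O_APPEND);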
This allows switching between normal writes and appending without closing/reopening the file.
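A sketch of toggling append mode with error checking follows. The helper name set_append() is hypothetical. The key point is that F_SETFL overwrites all status flags, so the current flags are fetched with F_GETFL and only the O_APPEND bit is changed.

```c
#include <fcntl.h>

/* Hypothetical helper: turns O_APPEND on (enable != 0) or off
 * (enable == 0) while leaving the other status flags untouched.
 * Returns 0 on success, -1 on error. */
int set_append(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if (enable)
        flags |= O_APPEND;
    else
        flags &= ~O_APPEND;

    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

Calling set_append(fd, 1) before a batch of log-style writes and set_append(fd, 0) afterwards switches modes without reopening the file.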
Avoiding Interleaved Writes
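With multiple threads or processes writing to a single file, write() calls can get interleaved and data corrupted.
For example, suppose process A writes "Hello " and process B writes "World!" through separately opened descriptors. Both may start at the same offset, so one can overwrite the other, and if each message is emitted as several smaller write() calls the pieces can interleave, leaving something like "HelWorldlo!".
To avoid this race condition, we need to use some form of file locking.
POSIX specifies advisory locking. Some systems, notably older Linux kernels, have also offered mandatory locking as a non-portable extension.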
Advisory Locking
Advisory locking is visible only to cooperating processes that examine lock status before accessing the file.
For example:
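// set exclusive lock on the whole file
struct flock fl = {0};
fl.l_type = F_WRLCK;
fl.l_whence = SEEK_SET;
fl.l_start = 0;
fl.l_len = 0; // 0 means "through end of file"
if (fcntl(fd, F_SETLK, &fl) == -1) {
    // lock held by another process (EACCES or EAGAIN), or another error
}
// write data...
// clear lock
fl.l_type = F_UNLCK;
fcntl(fd, F_SETLK, &fl);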
By using F_SETLK with an exclusive lock (F_WRLCK), we ensure only one process at a time can hold the advisory lock when writing to the fd.
If another process already holds the lock, our F_SETLK will fail rather than wait for the lock. We can use F_SETLKW to block if desired until the lock is available.
So advisory locking only works reliably among cooperative processes that check lock state before writing.
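The lock, write, unlock sequence can be bundled into one function so that cooperating writers all follow the same protocol. This is a sketch under assumptions: lock_whole_file() and locked_write() are hypothetical helper names, the lock covers the entire file, and F_SETLKW is used so the caller waits for the lock.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: applies l_type (F_WRLCK, F_RDLCK or F_UNLCK)
 * to the whole file. cmd is F_SETLK (fail at once) or F_SETLKW (wait). */
int lock_whole_file(int fd, int cmd, short l_type)
{
    struct flock fl = {0};
    fl.l_type = l_type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "through end of file" */
    return fcntl(fd, cmd, &fl);
}

/* Hypothetical helper: waits for an exclusive lock, performs a single
 * write(), then releases the lock. Returns 0 if all count bytes were
 * written, -1 otherwise. */
int locked_write(int fd, const void *buf, size_t count)
{
    if (lock_whole_file(fd, F_SETLKW, F_WRLCK) == -1)
        return -1;

    ssize_t n = write(fd, buf, count);
    int saved = errno; /* keep write()'s errno across the unlock */

    lock_whole_file(fd, F_SETLK, F_UNLCK);
    errno = saved;
    return n == (ssize_t)count ? 0 : -1;
}
```

Note that fcntl() locks belong to the process, so this protects against other processes but not against other threads of the same process.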
Mandatory Locking
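With mandatory file/record locking, the kernel enforces locks itself: attempts to access a locked file region fail or block, regardless of whether processes check lock state explicitly. This is not part of POSIX. On Linux it required a filesystem mounted with the mand option, and support was removed entirely in Linux 5.15.
On kernels that supported it, a file was marked for mandatory locking by setting its set-group-ID bit and clearing its group-execute bit. Locks were then taken with fcntl() as usual:
// mark file for mandatory locking (filesystem must be mounted with -o mand)
struct stat st;
fstat(fd, &st);
fchmod(fd, (st.st_mode | S_ISGID) & ~S_IXGRP);
Reads and writes to regions locked by another process then fail or block rather than being interleaved.
Mandatory locking aims to ensure data integrity without requiring processes to check lock state before reading/writing. The downsides are the extra checks on every I/O request, known race conditions in the Linux implementation, and lack of portability, so advisory locking is the practical choice for new code.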
So in summary:
- Advisory locks – processes must cooperate and check locks before reading/writing
- Mandatory locks – reads/writes automatically blocked if locked by another process
Efficient File Writing Using Buffering
System calls like write() require context switches from user mode to kernel mode. This can limit performance, especially for small writes.
Buffering writes can help by reducing total system calls:
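#include <string.h>
#include <unistd.h>
#define BUF_SIZE 8192
static char buf[BUF_SIZE];
static size_t filled = 0;
// flush the buffered bytes to fd, retrying after partial writes
int write_buffer(int fd) {
    size_t off = 0;
    while (off < filled) {
        ssize_t n = write(fd, buf + off, filled - off);
        if (n == -1) return -1; // error; errno is set
        off += (size_t)n;
    }
    filled = 0; // reset buffer
    return 0;
}
// copy data into the buffer in chunks, flushing whenever it fills up
int add_to_buffer(int fd, const char *data, size_t len) {
    while (len > 0) {
        size_t n = BUF_SIZE - filled; // free space left in the buffer
        if (n > len) n = len;
        memcpy(buf + filled, data, n);
        filled += n; data += n; len -= n;
        if (filled == BUF_SIZE && write_buffer(fd) == -1) return -1;
    }
    return 0;
}
// call add_to_buffer() to queue writes, and write_buffer() before close()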
By buffering up writes and only calling write() periodically or when the buffer fills, we reduce the number of system calls required.
For small random writes, buffering like this can increase performance significantly.
Parallel Writes Using Async I/O
On multiprocessing systems, we can use asynchronous I/O to perform writes in parallel for maximum throughput.
The basic process is:
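- Open the file descriptor as usual (POSIX AIO does not require non-blocking mode, and O_NONBLOCK has no effect on regular files)
- Initiate async write using aio_write()
- Process can continue other work while write happens in background
- Check for completion with aio_error(), or wait for a signal, then collect the result with aio_return()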
For example:
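// needs <aio.h>, <errno.h> and <string.h>
struct aiocb cb;
memset(&cb, 0, sizeof cb); // unused fields must be zero
cb.aio_fildes = fd;
cb.aio_buf = buffer;
cb.aio_nbytes = len;
cb.aio_offset = 0; // file offset to write at (ignored if fd has O_APPEND)
if (aio_write(&cb) == -1) {
    // request could not be queued; check errno
}
// do other work...
// wait for write to finish
const struct aiocb *list[] = { &cb };
while (aio_error(&cb) == EINPROGRESS) {
    aio_suspend(list, 1, NULL); // sleep until completion instead of spinning
}
// get status: bytes written, or -1 on failure
ssize_t ret = aio_return(&cb);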
By using asynchronous I/O, we can queue multiple write operations in parallel. This allows maxing out disk I/O bandwidth especially on systems with multiple CPUs/cores.
The disadvantage of async I/O is added software complexity to manage multiple operations. So it mainly benefits high-performance servers doing heavy I/O.
Security Considerations
When writing files based on external or user-supplied input, be aware of the security risks, such as:
Directory Traversal
Attackers may try to manipulate paths to write files outside of the expected directories:
// vulnerable code
const char *filename = get_user_input();
int fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644); // problem!
write(fd, data, len);
By inputting paths like "../../../../etc/passwd", attackers can write to any file the process has permission to modify.
To avoid directory traversal attacks:
- Validate user paths and reject components such as ..
- Call realpath() on paths to resolve them, then check that the result lies inside the expected directory before opening
- Store files in a dedicated directory rather than directly in /
Symbolic Links
A symbolic link is a special file that points to another file or directory:
ln -s /home/user/important_file symlink
If an attacker can create arbitrary symlinks, they could cause writes to overwrite critical files:
// vulnerable code
int fd = open(user_input, O_WRONLY | O_CREAT | O_TRUNC, 0644);
write(fd, data, len);
If user_input names a symlink pointing to /etc/passwd, writing to the descriptor will modify that file!
To avoid issues with symlinks:
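- Resolve paths with realpath() and validate the target before writing, keeping in mind that the link can change between the check and the open
- Pass O_NOFOLLOW to open() so the call fails if the final path component is a symlink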
So in summary, be vigilant about sanitizing external input used for filenames or paths before opening files for write(). Use available mechanisms like open() flags and access checks to prevent overwriting unexpected filesystem locations.
Conclusion
The write() system call is fundamental to writing data to files in C. With proper error checking, input validation, and concurrency control, it can be used securely and efficiently.
Key takeaways include:
- Open files for writing using flags like O_WRONLY and O_APPEND
- Always handle errors: check the return value and inspect errno
- Use locking where needed to prevent interleaved writes
- Employ buffering and asynchronous I/O for better performance
- Take care to sanitize all filesystem paths and filenames
By understanding these best practices for write(), you can build robust applications in C that store and manipulate data files with confidence. The techniques outlined here should provide a solid foundation for leveraging files in your programs.