Efficiently writing to a log file from multiple threads

There’s a pattern I keep seeing in programs where multiple threads (or indeed multiple processes) write to a common log file. It’s more of an antipattern than a pattern, and it often turns up in code that has existed for years.

Basically, it’s using a mutex to serialize concurrent writes to the log file. You do not need it at all.

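The write system call takes care of it all for you. All you have to do is construct a buffer with your log entry in it (in C, malloc a char buffer or keep one per thread; in C++, std::string may do), open the log file with O_APPEND and then make a single write() syscall with the log entry. With O_APPEND set, the kernel moves the file offset to the end of the file and writes the data as one atomic step, so entries from different writers don’t land on top of each other.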
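As a minimal sketch of that recipe in C: the names log_open() and log_entry() and the 1024-byte cap are made up for illustration and are not taken from any particular codebase.

```c
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the shared log once at startup. O_APPEND makes the kernel seek to
 * end-of-file as part of every write(), so no user-space lock is needed. */
int log_open(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

/* Format one entry into a stack buffer (so each thread has its own) and
 * hand it to the kernel in a single write().
 * Returns the number of bytes written, or -1 on error. */
ssize_t log_entry(int fd, const char *fmt, ...)
{
    char buf[1024];                 /* hypothetical cap on entry size */
    va_list ap;

    va_start(ap, fmt);
    int len = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof buf) {
        /* Entry did not fit: keep what we have and end it with a newline. */
        len = (int)sizeof buf - 1;
        buf[len - 1] = '\n';
    }
    return write(fd, buf, (size_t)len);
}
```

A caller would do something like log_entry(fd, "slow query: %s\n", sql) from any thread. The formatting happens outside any lock, and the only shared step is the write() itself.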

This works for just about all situations you care about. If you’re doing multi-megabyte writes (a single log entry with multiple megabytes? ouch) then you may get into trouble on some systems and get partial writes (IIRC it may have been Mac OS X and 8MB). O_APPEND also isn’t exactly awesome on NFS.
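If you’re worried about the partial-write case, one option is to at least detect it. This is a sketch only; log_write_checked() is a hypothetical helper name.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write one log entry with a single write() and report what happened:
 *   0  the whole entry went out in one call
 *   1  short write: only part of the entry was written
 *  -1  error (errno is set by write())
 * A write() that fails with EINTR has written nothing, so retrying it
 * cannot split the entry. */
int log_write_checked(int fd, const char *entry, size_t len)
{
    ssize_t n;

    do {
        n = write(fd, entry, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -1;
    return (size_t)n == len ? 0 : 1;
}
```

The helper deliberately does not loop to push out the remainder after a short write. That would be a second write(), and another writer’s entry could land in between, so it leaves the decision to the caller.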

But if what you want is to implement something like a general query log or a slow query log, then you probably want to use this trick rather than, say, taking a pthread_mutex lock while you do malloc(), snprintf() and write(2).
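If you’d like to convince yourself before ripping out a mutex, here is a rough way to try it. The sketch uses processes rather than threads, since the same trick covers both. The function names, the line format and the WRITERS and ENTRIES sizes are all made up for the demo. On a local filesystem I’d expect every line to come back intact; NFS is another story.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { WRITERS = 4, ENTRIES = 200 };   /* hypothetical sizes for the demo */

/* Child process: open the log with O_APPEND and emit ENTRIES lines,
 * one write() per line, with no lock of any kind. Never returns. */
static void writer(const char *path, int id)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        _exit(1);
    for (int i = 0; i < ENTRIES; i++) {
        char buf[64];
        int len = snprintf(buf, sizeof buf, "writer=%d entry=%04d\n", id, i);
        if (write(fd, buf, (size_t)len) != len)
            _exit(1);
    }
    close(fd);
    _exit(0);
}

/* Run the writers concurrently, then count the lines that are intact.
 * Returns the number of intact lines, or -1 on failure. */
int run_append_demo(const char *path)
{
    unlink(path);                       /* start from an empty log */

    for (int id = 0; id < WRITERS; id++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            writer(path, id);
    }
    for (int id = 0; id < WRITERS; id++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return -1;
    }

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char line[64];
    int intact = 0, w, e;
    while (fgets(line, sizeof line, f)) {
        /* An intact line parses cleanly and is exactly 20 bytes long. */
        if (sscanf(line, "writer=%d entry=%d", &w, &e) == 2 && strlen(line) == 20)
            intact++;
    }
    fclose(f);
    return intact;
}
```

If run_append_demo() returns WRITERS * ENTRIES, no entry was torn or lost, and nobody took a lock to get there.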

When refactoring parts of Drizzle, we found this done the wrong way in a whole bunch of places in the MySQL server, largely explaining why things like the slow query log and general query log were such a huge drain on database server performance.

It’d be neat to see someone fix that.