Levels of thread safety and thread compatibility

Software which works correctly in the presence of multiple threads of execution is called thread-safe, but what should the 'correct' behaviour be? For a class (or any library implementing a set of operations on some sort of objects), there are several different levels of thread-compatibility. In summary, these are:

NameParallelismAtomicity
Thread-hostileNonen/a
Thread-lockedPossibly
Thread neutralMaximumSome
Thread-clumsySomeAll except destruction
Thread-wary
Thread-paranoidMinimalAll
Thread-hostile
The software is incompatible with threads. for example, it uses static or global variables, or is not re-entrant. An example of this is the standard function ctime(), which is defined to return a pointer to a result string which may be overwritten by another call to ctime(). Obviously this means that a thread invoking ctime() can never be sure that the result hasn't immediately been overwritten by another thread which is invoking ctime() at the same time. (Actually, ctime() can be implemented in a way which avoids this problem by using thread-local storage, but this example shows what happens if it isn't implemented like that).
		// thread 1		// thread 2
		s = ctime(t1);		t = ctime(t2);

		// Oops!
		// s and t point to the same buffer
		// so cannot have different results

It could be thread-hostile in another sense - with respect to a particular threading library - by defining a name which conflicts with a name used by the threading library. For example, on a POSIX system, any software that defines pthread_create cannot be used with a POSIX threading library. In the case of POSIX threads, its use of prefixes such as pthread_ makes conflicts unlikely.

Thread-locked
Parts of the software can be invoked in a multi-threaded program, but some operations must be done by specific threads. For example, on Microsoft Windows, an I/O operation can only be cancelled by calling CancelIO() in the same thread which initiated the I/O operation.

This level of thread compatibility is rare, since it generally takes quite hard work to make operations be tied to a particular thread.

Thread-neutral
Operations appear atomic if no two operations are attempted simultaneously on the same object. This is the level of thread-safety achieved by the basic types of the C programming language.

This level of compatibility is quite common, since it is the natural result of the most obvious implementation of most code. It only requires special work to achieve when objects share state which is concealed by the interface. An example, of this is a string class which avoids copying string data by using reference counts and common data buffers. Such a class would be thread-hostile unless it handled the modifications to the shared data in an appropriate manner.

Thread-clumsy
All operations on objects, except destruction, are atomic but operations use hidden state. In other words, if two threads attempt to operate on the same object at the same time, the result is always the same as one operation followed by the other operation. However, the operations implemented are inherently incapable of being thread-safe, so the atomicity is useless.

For example, the stdio library defines fseek() and fread() as being atomic. This is completely useless, since the ability to seek to a particular position in the file is only useful if you can be sure that the subsequent read is performed from that position. Since any use of fseek() must ensure that the fseek() / fread() pair are atomic, there is no need for the stdio library to make them individually atomic.

		// thread 1		// thread 2
		fseek(file, somewhere);
					fseek(file, somewhere else);
		fread(file);

		// Oops! The data has been read from the wrong place

An equally pointless atomicity guarantee of the stdio library is the gusrantee that putchar() is atomic. While it is conceivable that some uses of text files, such as logging, may not care if printf()s in different threads get interleaved, it is difficult to imagine any practical application in which different threads use putchar() on the same file and it doesn't matter that the individual characters are interleaved. However the cost of this guarantee is substantial - in a typical implementation, putchar() must be a function call, and must lock and unlock a mutex. An implementation which merely tried to be thread-neutral could use a macro with no locking.

		// thread 1		// thread 2
		putchar('h');
		putchar('e');
		putchar('l');
					putchar('w');
		putchar('l');
					putchar('o');
					putchar('r');
		putchar('o');
					putchar('l');
					putchar('d');

		// Oops! we get "helwlorold"
Thread-wary
All operations on objects, except destruction, are atomic and the operations do not have hidden state (such as the file position). In other words, if two threads attempt to operate on the same object at the same time, the result is always the same as one operation followed by the other operation.

This level of thread-compatibility is usually excessively safe because the overhead required to make each operation atomic is wasted if the software which invokes the operations has, as a side-effect of its own thread-safety, guaranteed that two threads cannot operate on the same object at the same time.

Thread-paranoid
All operations on are atomic, including object destruction. This is obviously silly unless it is achieved as a side-effect of something else, since it means that if one thread tries to destroy an object at the same time as another thread tries to do another operation on the same object, the result is either the same as operating on the object and then destroying it (which is sensible) or destroying it first and then operating on it (which is not). Obviously, an object that can be operated on after destruction would be very bizarre, so for normal objects it is wasteful to try to make destruction atomic with respect to other operations on the same object.

Optimal thread compatibility

For the vast majority of situations, the best level of thread-compatibility for a library is for it to be thread-neutral. This maximises the parallelism, minimises the overhead of locking, and makes the objects behave the same as objects of the builtin types.

It is possible to use a thread-hostile library by wrapping calls into the library in functions which acquire a global lock, call the library, and then release the lock. This guarantees that the library can only ever be invoked by one thread at a time, compensating for its thread-hostility. This technique is equivalent to the way first used to allow Unix variants to operate on multi-CPU systems. Since the kernel was thread-hostile, and each CPU ran one thread, all entry points to the kernel started by acquiring a single global lock. Obviously this technique prevents any parallel processing, which defeats the purpose of using multiple threads, so it is of very limited usefulness - only suitable for situations where it is too difficult to fix a thread-hostile library and sufficiently little time is spent in it that it is not a performance bottleneck to allow only one thread to use it at a time.

Similarly, a thread-locked library can be forced to co-operate. This can be done by nominating one thread to perform all operations in the library, and when any other thread needs an operation to be performed, it arranges for the nominated thread to perform it. This may be achieved by some form of work queue, which is processed by the nominated thread and onto which other thread put their work. Obviously, just like the technique using a single global lock on a thread-hostile library, this technique also prevents any parallel processing, again defeating the purpose of using multiple threads. It's even worse, because it also requires extra context-switches into and out of the nominated worker thread. Consequently, it may be better to simply avoid the thread-locked operations in a library - which what typically happens with Microsoft Windows as few people know about CancelIO() and even fewer use it.

While some people are attracted to the simplicity of the thread-wary level, it is very easy to try to achieve this level only to accidentally achieve the level of thread-clumsy, where the cost of atomicity is paid, but the atomicity is never needed. It is also important to note that the basic C data types, such as int are not thread-wary - they are thread-neutral, so any use of the library will be highly likely to need to use some form of locking anyway, and will certainly need to be written with thread-safety in mind.

The reason why it is usually best to be thread-neutral can be seen by considering two functions: unsafe_f() and unsafe_g(). If these are made individuall thread-safe by implementing:

f()
{
	lock();
	unsafe_f();
	unlock();
}

g()
{
	lock();
	unsafe_g();
	unlock();
}

then if the user wants to call them in sequence:

	f();
	g();

this performs:

	lock();
	unsafe_f();
	unlock();
	lock();
	unsafe_g();
	unlock();

whereas it would obviously be better to perform:

	lock();
	unsafe_f();
	unsafe_g();
	unlock();

which the user can only do if they can call the unsafe version directly.

In general, the principle is that it's better to be fast than safe because it's possible to build a slow, safe version from a fast unsafe version, but given a slow, safe version it's not possible to build a fast version.