Spawning Threads Within Transactions

The following pseudo-code fragment might be unconventional, but is meaningful, well-defined, and useful:

    pthread_mutex_lock(...);
    for (i = 0; i < ncpus; i++)
        tid[i] = pthread_create(...);
    for (i = 0; i < ncpus; i++)
        pthread_join(tid[i], ...);
    pthread_mutex_unlock(...);

This pseudo-code fragment uses pthread_create() to spawn one thread per CPU, then uses pthread_join() to wait for each to complete, all under the protection of pthread_mutex_lock(). The effect is to execute a lock-based critical section in parallel. Of course, the critical section would need to be quite large to justify the thread-spawning overhead, but there are many examples of large critical sections in production software. It is also legal to spawn threads within other types of critical sections, for example those protected by reader-writer locks or RCU.
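
For concreteness, the fragment might be fleshed out roughly as follows. This is only a sketch under stated assumptions: NWORKERS stands in for ncpus, and big_lock, partial[], worker(), and parallel_critical_section() are hypothetical names that do not appear in the original fragment. Note that the real pthread_create() returns an error code and stores the thread ID through its first argument, unlike the pseudo-code's assignment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NWORKERS 4                      /* hypothetical stand-in for ncpus */

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static long partial[NWORKERS];          /* one slot per child, so children never conflict */

/* Each child writes only its own slot and never acquires big_lock. */
static void *worker(void *arg)
{
    long id = (long)(intptr_t)arg;

    partial[id] = (id + 1) * 10;
    return NULL;
}

/* Execute the critical section in parallel: returns the combined
 * result, or -1 if some child thread could not be created. */
long parallel_critical_section(void)
{
    pthread_t tid[NWORKERS];
    long sum = 0;
    int nstarted = 0;
    int failed = 0;
    int rc;
    int i;

    rc = pthread_mutex_lock(&big_lock);
    assert(rc == 0);
    (void)rc;

    for (i = 0; i < NWORKERS; i++) {
        if (pthread_create(&tid[i], NULL, worker, (void *)(intptr_t)i) != 0) {
            failed = 1;
            break;
        }
        nstarted++;
    }

    /* Join only the threads that were actually started.  pthread_join()
     * also makes each child's store to partial[] visible to the parent. */
    for (i = 0; i < nstarted; i++) {
        pthread_join(tid[i], NULL);
        sum += partial[i];
    }

    pthread_mutex_unlock(&big_lock);
    return failed ? -1 : sum;
}
```

The parent holds big_lock for the whole fork/join sequence, so any other thread that acquires big_lock sees either none or all of the children's work. The children themselves must not try to acquire big_lock, because the parent still holds it while waiting in pthread_join().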

What might TM do about thread spawning within a transaction?

1. Declare pthread_create() to be illegal within transactions, resulting in transaction abort (preferred) or undefined behavior. Alternatively, enlist the compiler to enforce pthread_create()-free transactions.
2. Permit pthread_create() to be executed within a transaction, but only the parent thread will be considered to be part of the transaction. This approach seems to be reasonably compatible with existing and posited TM implementations, but seems to be a trap for the unwary. This approach raises further questions, such as how to handle conflicting child-thread accesses.
3. Convert the pthread_create()s to function calls. This approach is also an attractive nuisance, as it does not handle the not-uncommon cases where the child threads communicate with one another. In addition, it does not permit parallel execution of the body of the transaction.
4. Extend the transaction to cover the parent and all child threads. This approach raises interesting questions about the nature of conflicting accesses, given that the parent and children are presumably permitted to conflict with each other, but not with other threads. It also raises interesting questions as to what should happen if the parent thread does not wait for its children before committing the transaction. Even more interesting, what happens if the parent conditionally executes pthread_join() based on the values of variables participating in the transaction? The answers to these questions are reasonably straightforward in the case of locking. The answers for TM are left as an exercise for the reader.
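
The option of converting the pthread_create()s to function calls can be illustrated with a small sketch. No actual TM runtime is involved here: the as_function_calls flag merely stands in for the conversion that a TM implementation might perform, and child(), result[], and run_children() are hypothetical names chosen for illustration.

```c
#include <pthread.h>
#include <stdint.h>

#define NCHILDREN 4

static long result[NCHILDREN];

/* Independent work: no child waits on any other child. */
static void *child(void *arg)
{
    long id = (long)(intptr_t)arg;

    result[id] = id * id;
    return NULL;
}

/* If as_function_calls is nonzero, each would-be thread body is simply
 * invoked in the parent's context, one after another.  Otherwise real
 * threads are spawned and joined.  Returns the sum of the results. */
long run_children(int as_function_calls)
{
    pthread_t tid[NCHILDREN];
    int spawned[NCHILDREN] = { 0 };
    long sum = 0;
    int i;

    for (i = 0; i < NCHILDREN; i++) {
        void *arg = (void *)(intptr_t)i;

        if (as_function_calls) {
            child(arg);         /* runs to completion before the next child starts */
        } else if (pthread_create(&tid[i], NULL, child, arg) == 0) {
            spawned[i] = 1;
        } else {
            child(arg);         /* creation failed: fall back to a direct call */
        }
    }

    for (i = 0; i < NCHILDREN; i++) {
        if (spawned[i])
            pthread_join(tid[i], NULL);
        sum += result[i];
    }
    return sum;
}
```

Because these children are independent, both modes compute the same result, but the function-call mode runs them strictly one after another. If child 0 instead waited for a flag set by child 1, the function-call mode would never return, because child 1 cannot start until child 0 has finished.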

Given that parallel execution of transactions is commonplace in the database world, it is perhaps surprising that current TM proposals do not provide for it. On the other hand, the example above is a fairly sophisticated use of locking that is not normally found in simple textbook examples, so perhaps the omission of parallel execution within transactions is to be expected. That said, there are rumors that some TM researchers are investigating fork/join parallelism within transactions, so perhaps this topic will soon be addressed more thoroughly.