I/O Operations Within Transactions
One can execute I/O operations within a lock-based critical section,
and, at least in principle, from within an RCU read-side critical
section.
What happens when you attempt to execute an I/O operation from within
a transaction?
The underlying problem is that transactions may be rolled back, for example,
due to conflicts.
Roughly speaking, this requires that all operations within any given
transaction be idempotent, so that executing the operation twice has
the same effect as executing it once.
Unfortunately, I/O is in general the prototypical non-idempotent operation,
making it difficult to include general I/O operations in transactions.
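The rollback problem can be illustrated with a toy sketch. This is not a real TM implementation: the names `toy_txn_state`, `txn_body()`, `run_txn()`, and `count_records()` are hypothetical, and "conflicts" are simulated by simply re-running the body a fixed number of times. It only shows that undoing the memory update does not undo the output.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical toy state; this is not a real TM API. */
struct toy_txn_state {
    int counter;   /* memory location covered by the "transaction" */
    FILE *log;     /* output side effect that cannot be rolled back */
};

/* Transaction body: one memory update plus one output operation. */
static void txn_body(struct toy_txn_state *s)
{
    s->counter++;
    fputs("record\n", s->log);
}

/*
 * Run the body, pretending that the first `aborts` attempts hit a
 * conflict. Each simulated abort restores the memory state, but the
 * output that was already issued stays issued.
 */
static void run_txn(struct toy_txn_state *s, int aborts)
{
    assert(aborts >= 0);
    for (int attempt = 0; ; attempt++) {
        int saved = s->counter;   /* checkpoint memory state */

        txn_body(s);
        if (attempt >= aborts)
            return;               /* commit */
        s->counter = saved;       /* roll back memory only */
    }
}

/* Count the newline-terminated records written to the log so far. */
static int count_records(FILE *log)
{
    int c, n = 0;

    rewind(log);
    while ((c = getc(log)) != EOF)
        if (c == '\n')
            n++;
    fseek(log, 0, SEEK_END);      /* reposition for further writes */
    return n;
}
```

With two simulated aborts, the counter ends up incremented once, but the log contains one record per attempt rather than one record per committed transaction.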
Here are some options available to TM:
- Restrict I/O within transactions to buffered I/O with in-memory
buffers. These buffers may then be included in the transaction
in the same way that any other memory location might be included.
This seems to be the mechanism of choice, and it does work well
in many common situations, such as stream I/O and
mass-storage I/O.
However, special handling is required in cases where multiple
record-oriented output streams are merged onto a single file
from multiple processes, as might be done using the “a+”
option to fopen() or the O_APPEND flag to open().
In addition, as will be seen in a subsequent posting, common networking
operations cannot be handled via buffering.
- Prohibit I/O within transactions, so that any attempt to execute
an I/O operation aborts the enclosing transaction (and perhaps
multiple nested transactions).
This approach seems to be the conventional TM approach for
unbuffered I/O.
- Prohibit I/O within transactions, but enlist the compiler's aid
in enforcing this prohibition.
- Permit only one special “inevitable” transaction
to proceed at any given time, thus allowing inevitable
transactions to contain I/O operations.
This works in general, but severely limits the scalability and
performance of I/O operations.
Given that scalability and performance are first-class goals
of parallelism, this approach's generality seems a bit self-limiting.
- Create new hardware and protocols such that I/O operations can
be pulled into the transactional substrate.
In the case of input operations, the hardware would need to
correctly predict the result of the operation, and to abort the
transaction if the prediction failed.
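The first option above, buffered I/O with in-memory buffers, might be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular TM system does it: `txn_outbuf`, `txn_puts()`, and `run_buffered_txn()` are hypothetical names, aborts are simulated by a retry count, and the fixed buffer size is arbitrary. The point is that the body touches only memory, so rollback just discards the buffered bytes, and the real output is issued once at commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TXN_BUF_SIZE 256   /* arbitrary size chosen for this sketch */

/* Hypothetical in-memory output buffer, treated as transactional state. */
struct txn_outbuf {
    char data[TXN_BUF_SIZE];
    size_t len;
};

/* Buffered "output": only modifies memory, so it can be rolled back. */
static int txn_puts(struct txn_outbuf *b, const char *s)
{
    size_t n = strlen(s);

    if (n > TXN_BUF_SIZE - b->len)
        return -1;   /* buffer full; a real system would need a policy */
    memcpy(b->data + b->len, s, n);
    b->len += n;
    return 0;
}

/*
 * Run a one-record transaction body, pretending that the first
 * `aborts` attempts hit a conflict. Returns the number of bytes
 * actually written to `out` at commit time.
 */
static size_t run_buffered_txn(struct txn_outbuf *b, FILE *out, int aborts)
{
    assert(aborts >= 0);
    for (int attempt = 0; ; attempt++) {
        size_t saved_len = b->len;      /* checkpoint */
        size_t written;

        txn_puts(b, "record\n");        /* transaction body */
        if (attempt < aborts) {
            b->len = saved_len;         /* roll back: drop buffered output */
            continue;
        }
        /* Commit: the real I/O happens once, outside the retried region. */
        written = fwrite(b->data, 1, b->len, out);
        b->len = 0;
        return written;
    }
}
```

However many attempts are aborted, exactly one record reaches the output stream. The sketch does nothing for the multi-process append case or for networking, both of which the text notes need separate handling.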
I/O operations are of course a known weakness of TM, and it is not clear
that the problem of supporting I/O in transactions has a reasonable
general solution, at least if “reasonable” is to include
any requirement of meeting the performance and scalability goals of
parallel programming.
Nevertheless, the proponents of TM must either solve this problem, or
resign themselves to a world where TM is but one tool of several in
the parallel programmer's toolbox.