Transactions

An RDBMS must be able to group SQL statements so that they are either all committed, which means they are applied to the database, or all rolled back, which means they are undone. A transaction is a logical, atomic unit of work that contains one or more SQL statements.

An illustration of the need for transactions is a funds transfer from a savings account to a checking account. The transfer consists of the following separate operations:

Decrease the savings account.
Increase the checking account.
Record the transaction in the transaction journal.

PostgreSQL Database guarantees that all three operations succeed or fail as a unit. For example, if a hardware failure prevents a statement in the transaction from executing, then the other statements must be rolled back.

Transactions are one of the features that sets PostgreSQL Database apart from a file system. If you perform an atomic operation that updates several files, and if the system fails halfway through, then the files will not be consistent. In contrast, a transaction moves an Postgres database from one consistent state to another. The basic principle of a transaction is “all or nothing”: an atomic operation succeeds or fails as a whole.

Introduction to Transactions

A transaction is a logical, atomic unit of work that contains one or more SQL statements. A transaction groups SQL statements so that they are either all committed, which means they are applied to the database, or all rolled back, which means they are undone from the database. PostgreSQL Database assigns every transaction a unique identifier called a transaction ID.

All PostgreSQL transactions comply with the basic properties of a database transaction, known as ACID properties. ACID is an acronym for the following:

Atomicity

All tasks of a transaction are performed or none of them are. There are no partial transactions. For example, if a transaction starts updating 100 rows, but the system fails after 20 updates, then the database rolls back the changes to these 20 rows.
Consistency

The transaction takes the database from one consistent state to another consistent state. For example, in a banking transaction that debits a savings account and credits a checking account, a failure must not cause the database to credit only one account, which would lead to inconsistent data.
Isolation

The effect of a transaction is not visible to other transactions until the transaction is committed. For example, one user updating the hr.employees table does not see the uncommitted changes to employees made concurrently by another user. Thus, it appears to users as if transactions are executing serially.
Durability

Changes made by committed transactions are permanent. After a transaction completes, the database ensures through its recovery mechanisms that changes from the transaction are not lost.

The use of transactions is one of the most important ways that a database management system differs from a file system.

Sample Transaction: Account Debit and Credit

To illustrate the concept of a transaction, consider a banking database. When a customer transfers money from a savings account to a checking account, the transaction must consist of three separate operations:

Decrement the savings account
Increment the checking account
Record the transaction in the transaction journal

PostgreSQL must allow for two situations. If all three SQL statements maintain the accounts in proper balance, then the effects of the transaction can be applied to the database. However, if a problem such as insufficient funds, invalid account number, or a hardware failure prevents one or two of the statements in the transaction from completing, then the database must roll back the entire transaction so that the balance of all accounts is correct.

Figure 1 illustrates a banking transaction. The first statement subtracts $500 from savings account 3209. The second statement adds $500 to checking account 3208. The third statement inserts a record of the transfer into the journal table. The final statement commits the transaction.

Structure of a Transaction

A database transaction consists of one or more statements that together constitute an atomic change to the database. The statements in it can be either Data Manipulation Language (DML) statements or Data Definition Language (DDL) statements.

A transaction has a beginning and an end.

Beginning of a Transaction

In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands. So our banking transaction would actually look like:

BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- etc etc
COMMIT;

If, partway through the transaction, we decide we do not want to commit (perhaps we just noticed that Alice’s balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our updates so far will be canceled.

PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.

Some client libraries issue BEGIN and COMMIT commands automatically, so that you might get the effect of transaction blocks without asking. Check the documentation for the interface you are using.

When a transaction begins, Redrock Postgres assigns the transaction to an available undo segment to record the undo entries for the new transaction. A transaction ID is not allocated until an undo segment and transaction table slot are allocated, which occurs during the first DML/DDL statement. A transaction ID is unique to a transaction and represents the undo segment number, transaction slot number, and sequence number.

The following example execute an UPDATE statement to begin a transaction and queries for details about the transaction:

BEGIN;

UPDATE hr.employees SET salary = salary;

SELECT xid, pg_xact_status(xid) AS status
  FROM pg_current_xact_id_if_assigned() AS xid;
   xid    |   status
----------+-------------
 (5,16,1) | in progress

End of a Transaction

A transaction ends when any of the following actions occurs:

A user issues a COMMIT or ROLLBACK statement without a SAVEPOINT clause.

In a commit, a user explicitly or implicitly requested that the changes in the transaction be made permanent. Changes made by the transaction are permanent and visible to other users only after a transaction commits. The transaction shown in Figure 1 ends with a commit.
A user exits normally from most PostgreSQL Database utilities and tools, causing the current transaction to be implicitly committed. The commit behavior when a user disconnects is application-dependent and configurable.

Note: Applications should always explicitly commit or rollback transactions before program termination.
A client process terminates abnormally, causing the transaction to be implicitly rolled back using metadata stored in the transaction table and the undo.

After one transaction ends, the next executable SQL statement automatically starts the following transaction. The following example executes an UPDATE to start a transaction, ends the transaction with a ROLLBACK statement, and then executes an UPDATE to start a new transaction (note that the transaction IDs are different):

BEGIN;

UPDATE hr.employees SET salary = salary;

SELECT xid, pg_xact_status(xid) AS status
  FROM pg_current_xact_id_if_assigned() AS xid;
   xid    |   status
----------+-------------
 (6,16,1) | in progress

ROLLBACK;

SELECT pg_current_xact_id_if_assigned() AS xid;
 xid
-----


BEGIN;

UPDATE hr.employees SET last_name = last_name;

SELECT xid, pg_xact_status(xid) AS status
  FROM pg_current_xact_id_if_assigned() AS xid;
   xid    |   status
----------+-------------
 (7,16,1) | in progress

Statement Level Atomicity

Redrock Postgres supports statement-level atomicity, which means that a SQL statement is an atomic unit of work and either completely succeeds or completely fails.

A successful statement is different from a committed transaction. A single SQL statement executes successfully if the database parses and runs it without error as an atomic unit, as when all rows are changed in a multirow update.

If a SQL statement causes an error during execution, then it is not successful and so all effects of the statement are rolled back. This operation is a statement-level rollback. This operation has the following characteristics:

A SQL statement that does not succeed causes the loss only of work it would have performed itself.

The unsuccessful statement does not cause the loss of any work that preceded it in the current transaction. For example, if the execution of the second UPDATE statement in Figure 1 causes an error and is rolled back, then the work performed by the first UPDATE statement is not rolled back. The first UPDATE statement can be committed or rolled back explicitly by the user.
The effect of the rollback is as if the statement had never been run.

Any side effects of an atomic statement, for example, triggers invoked upon execution of the statement, are considered part of the atomic statement. Either all work generated as part of the atomic statement succeeds or none does.

An example of an error causing a statement-level rollback is an attempt to insert a duplicate primary key. Single SQL statements involved in a deadlock, which is competition for the same data, can also cause a statement-level rollback. However, errors discovered during SQL statement parsing, such as a syntax error, have not yet been run and so do not cause a statement-level rollback.

Logical Timestamp

A logical timestamp (logicaltime) is a logical, internal time stamp used by Redrock Postgres Database. logicaltimes order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction. Redrock Postgres uses logicaltimes to mark the logicaltime before which all changes are known to be on disk so that recovery avoids applying unnecessary redo. The database also uses logicaltimes to mark the point at which no redo exists for a set of data so that recovery can stop.

logicaltimes occur in a monotonically increasing sequence. Redrock Postgres can use an logicaltime like a clock because an observed logicaltime indicates a logical point in time and repeated observations return equal or greater values. If one event has a lower logicaltime than another event, then it occurred at an earlier time with respect to the database. Several events may share the same logicaltime, which means that they occurred at the same time with respect to the database.

Every transaction has an logicaltime. For example, if a transaction updates a row, then the database records the logicaltime at which this update occurred. Other modifications in this transaction have the same logicaltime. When a transaction commits, the database records an logicaltime for this commit.

Redrock Postgres increments logicaltimes in the shared memory area. When a transaction modifies data, the database writes a new logicaltime to the undo assigned to the transaction. The log writer process then writes the commit record of the transaction immediately to the online redo log. The commit record has the unique logicaltime of the transaction. Redrock Postgres also uses logicaltimes as part of its instance recovery and media recovery mechanisms.

Overview of Transaction Control

Transaction control is the management of changes made by SQL statements and the grouping of SQL statements into transactions. In general, application designers are concerned with transaction control so that work is accomplished in logical units and data is kept consistent.

Transaction control involves using the following statements:

BEGIN initiates a transaction block, that is, all statements after a BEGIN command will be executed in a single transaction until an explicit COMMIT or ROLLBACK is given.
The COMMIT statement ends the current transaction and makes all changes performed in the transaction permanent. COMMIT also erases all savepoints in the transaction and releases transaction locks.
The ROLLBACK statement reverses the work done in the current transaction; it causes all data changes since the last COMMIT or ROLLBACK to be discarded. The ROLLBACK TO SAVEPOINT statement undoes the changes since the last savepoint but does not end the entire transaction.
The SAVEPOINT statement identifies a point in a transaction to which you can later roll back.

The session in Table 1 illustrates the basic concepts of transaction control.

Table 1 Transaction Control

Time	Session	Explanation
t0	BEGIN;	This statement begins a transaction block.
t1	UPDATE employees SET salary = 7000 WHERE last_name = ‘Banda’;	This statement updates the salary for Banda to 7000.
t2	SAVEPOINT after_banda_sal;	This statement creates a savepoint named `after_banda_sal`, enabling changes in this transaction to be rolled back to this point.
t3	UPDATE employees SET salary = 12000 WHERE last_name = ‘Greene’;	This statement updates the salary for Greene to 12000.
t4	SAVEPOINT after_greene_sal;	This statement creates a savepoint named `after_greene_sal`, enabling changes in this transaction to be rolled back to this point.
t5	ROLLBACK TO SAVEPOINT after_banda_sal;	This statement rolls back the transaction to t2, undoing the update to Greene’s salary at t3. The transaction has not ended.
t6	UPDATE employees SET salary = 11000 WHERE last_name = ‘Greene’;	This statement updates the salary for Greene to 11000 in transaction.
t7	ROLLBACK;	This statement rolls back all changes in transaction, ending the transaction.
t8	BEGIN;	This statement begins a new transaction block.
t9	UPDATE employees SET salary = 7050 WHERE last_name = ‘Banda’;	This statement updates the salary for Banda to 7050.
t10	UPDATE employees SET salary = 10950 WHERE last_name = ‘Greene’;	This statement updates the salary for Greene to 10950.
t11	COMMIT;	This statement commits all changes made in transaction, ending the transaction. The commit guarantees that the changes are saved in the online redo log files.

Active Transactions

An active transaction has started but not yet committed or rolled back. In Table 1, the first statement to modify data in the transaction is the update to Banda’s salary. From the successful execution of this update until the ROLLBACK statement ends the transaction, the transaction is active.

Data changes made by a transaction are temporary until the transaction is committed or rolled back. Before the transaction ends, the state of the data is as follows:

Redrock Postgres has generated undo data information in the shared buffers.

The undo data contains the old data values changed by the SQL statements of the transaction.
Redrock Postgres has generated redo in the shared WAL buffers.

The redo log record contains the change to the data block and the change to the undo block.
Changes have been made to the database pages in shared buffers.

The data changes for a committed transaction, stored in the database pages in shared buffers, are not necessarily written immediately to the data files by the database writer (bgwriter). The disk write can happen before or after the commit.
The rows affected by the data change are locked.

Other users cannot change the data in the affected rows, nor can they see the uncommitted changes.

Savepoints

A savepoint is a user-declared intermediate marker within the context of a transaction. Internally, this marker resolves to an undo record position. Savepoints divide a long transaction into smaller parts.

If you use savepoints in a long transaction, then you have the option later of rolling back work performed before the current point in the transaction but after a declared savepoint within the transaction. Thus, if you make an error, you do not need to resubmit every statement. Table 1 creates savepoint after_banda_sal so that the update to the Greene salary can be rolled back to this savepoint.

Rollback to Savepoint

A rollback to a savepoint in an uncommitted transaction means undoing any changes made after the specified savepoint, but it does not mean a rollback of the transaction itself. When a transaction is rolled back to a savepoint, as when the ROLLBACK TO SAVEPOINT after_banda_sal is run in Table 1, the following occurs:

Redrock Postgres rolls back only the statements run after the savepoint.

In Table 1, the ROLLBACK TO SAVEPOINT causes the UPDATE for Greene to be rolled back, but not the UPDATE for Banda.
Redrock Postgres preserves the savepoint specified in the ROLLBACK TO SAVEPOINT statement, but all subsequent savepoints are lost.

In Table 1, the ROLLBACK TO SAVEPOINT causes the after_greene_sal savepoint to be lost.
Redrock Postgres releases all table and row locks acquired after the specified savepoint but retains all data locks acquired previous to the savepoint.

The transaction remains active and can be continued.

Rollback of Transactions

A rollback of an uncommitted transaction undoes any changes to data that have been performed by SQL statements within the transaction. After a transaction has been rolled back, the effects of the work done in the transaction no longer exist.

In rolling back an entire transaction, without referencing any savepoints, Redrock Postgres performs the following actions:

Undoes all changes made by all the SQL statements in the transaction by using the corresponding undo segments

The transaction table entry for every active transaction contains a pointer to all the undo data (in reverse order of application) for the transaction. The database reads the data from the undo segment, reverses the operation, and then marks the undo entry as applied. Thus, if a transaction inserts a row, then a rollback deletes it. If a transaction updates a row, then a rollback reverses the update. If a transaction deletes a row, then a rollback reinserts it. In Table 1, the ROLLBACK reverses the updates to the salaries of Greene and Banda.
Releases all the locks of data held by the transaction
Erases all savepoints in the transaction

In Table 1, the ROLLBACK deletes the savepoint after_banda_sal. The after_greene_sal savepoint was removed by the ROLLBACK TO SAVEPOINT statement.
Ends the transaction

In Table 1, the ROLLBACK leaves the database in the same state as it was after the initial COMMIT was executed.

The duration of a rollback is a function of the amount of data modified.

Committing Transactions

A commit ends the current transaction and makes permanent all changes performed in the transaction. In Table 1, a second transaction begins with BEGIN statement and ends with an explicit COMMIT statement. The changes that resulted from the two UPDATE statements are now made permanent.

When a transaction commits, the following actions occur:

A logical timestamp (logicaltime) is generated for the COMMIT.

The internal transaction table for the associated undo tablespace records that the transaction has committed. The corresponding unique logicaltime of the transaction is assigned and recorded in the transaction table.
The session process writes remaining WAL records in the WAL buffers to the WAL log and writes the transaction logicaltime to the WAL log. This atomic event constitutes the commit of the transaction.
Redrock Postgres releases locks held on rows and tables.

Users who were enqueued waiting on locks held by the uncommitted transaction are allowed to proceed with their work.
Redrock Postgres deletes savepoints.

In Table 1, no savepoints existed in the transaction so no savepoints were erased.
Redrock Postgres performs a commit cleanout.

If modified blocks containing data from the committed transaction are still in the shared buffers, and if no other session is modifying them, then the database removes lock-related transaction information from the blocks. Ideally, the COMMIT cleans out the blocks so that a subsequent SELECT does not have to perform this task.

Note: Because a block cleanout generates redo, a query may generate redo and thus cause blocks to be written during the next checkpoint.
Redrock Postgres marks the transaction complete.

After a transaction commits, users can view the changes.

Typically, a commit is a fast operation, regardless of the transaction size. The speed of a commit does not increase with the size of the data modified in the transaction. The lengthiest part of the commit is the physical disk I/O occured in WAL writes. However, the amount of time spent in transaction commit is reduced because walwriter has been incrementally writing the contents of the WAL buffer in the background.

The default behavior is for session to write redo to the WAL log synchronously and for transactions to wait for the buffered redo to be on disk before returning a commit to the user. However, for lower transaction commit latency, application developers can specify that redo be written asynchronously so that transactions need not wait for the redo to be on disk and can return from the COMMIT call immediately.