An RDBMS must be able to group SQL statements so that they are either all committed, which means they are applied to the database, or all rolled back, which means they are undone. A transaction is a logical, atomic unit of work that contains one or more SQL statements.
An illustration of the need for transactions is a funds transfer from a savings account to a checking account. The transfer consists of the following separate operations:
- Decrease the savings account.
- Increase the checking account.
- Record the transaction in the transaction journal.
PostgreSQL Database guarantees that all three operations succeed or fail as a unit. For example, if a hardware failure prevents a statement in the transaction from executing, then the other statements must be rolled back.
Transactions are one of the features that sets PostgreSQL Database apart from a file system. If you perform an atomic operation that updates several files, and if the system fails halfway through, then the files will not be consistent. In contrast, a transaction moves an Postgres database from one consistent state to another. The basic principle of a transaction is “all or nothing”: an atomic operation succeeds or fails as a whole.
A transaction is a logical, atomic unit of work that contains one or more SQL statements. A transaction groups SQL statements so that they are either all committed, which means they are applied to the database, or all rolled back, which means they are undone from the database. PostgreSQL Database assigns every transaction a unique identifier called a transaction ID.
All PostgreSQL transactions comply with the basic properties of a database transaction, known as ACID properties. ACID is an acronym for the following:
All tasks of a transaction are performed or none of them are. There are no partial transactions. For example, if a transaction starts updating 100 rows, but the system fails after 20 updates, then the database rolls back the changes to these 20 rows.
The transaction takes the database from one consistent state to another consistent state. For example, in a banking transaction that debits a savings account and credits a checking account, a failure must not cause the database to credit only one account, which would lead to inconsistent data.
The effect of a transaction is not visible to other transactions until the transaction is committed. For example, one user updating the hr.employees table does not see the uncommitted changes to employees made concurrently by another user. Thus, it appears to users as if transactions are executing serially.
Changes made by committed transactions are permanent. After a transaction completes, the database ensures through its recovery mechanisms that changes from the transaction are not lost.
The use of transactions is one of the most important ways that a database management system differs from a file system.
To illustrate the concept of a transaction, consider a banking database. When a customer transfers money from a savings account to a checking account, the transaction must consist of three separate operations:
Decrement the savings account
Increment the checking account
Record the transaction in the transaction journal
PostgreSQL must allow for two situations. If all three SQL statements maintain the accounts in proper balance, then the effects of the transaction can be applied to the database. However, if a problem such as insufficient funds, invalid account number, or a hardware failure prevents one or two of the statements in the transaction from completing, then the database must roll back the entire transaction so that the balance of all accounts is correct.
Figure 1 illustrates a banking transaction. The first statement subtracts $500 from savings account 3209. The second statement adds $500 to checking account 3208. The third statement inserts a record of the transfer into the journal table. The final statement commits the transaction.
A database transaction consists of one or more statements that together constitute an atomic change to the database. The statements in it can be either Data Manipulation Language (DML) statements or Data Definition Language (DDL) statements.
A transaction has a beginning and an end.
In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with
COMMIT commands. So our banking transaction would actually look like:
BEGIN; UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice'; -- etc etc COMMIT;
If, partway through the transaction, we decide we do not want to commit (perhaps we just noticed that Alice’s balance went negative), we can issue the command
ROLLBACK instead of
COMMIT, and all our updates so far will be canceled.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a
BEGIN command, then each individual statement has an implicit
BEGIN and (if successful)
COMMIT wrapped around it. A group of statements surrounded by
COMMIT is sometimes called a transaction block.
Some client libraries issue
COMMITcommands automatically, so that you might get the effect of transaction blocks without asking. Check the documentation for the interface you are using.
When a transaction begins, Redrock Postgres assigns the transaction to an available undo segment to record the undo entries for the new transaction. A transaction ID is not allocated until an undo segment and transaction table slot are allocated, which occurs during the first DML/DDL statement. A transaction ID is unique to a transaction and represents the undo segment number, transaction slot number, and sequence number.
The following example execute an
UPDATE statement to begin a transaction and queries for details about the transaction:
BEGIN; UPDATE hr.employees SET salary = salary; SELECT xid, pg_xact_status(xid) AS status FROM pg_current_xact_id_if_assigned() AS xid; xid | status ----------+------------- (5,16,1) | in progress
A transaction ends when any of the following actions occurs:
A user issues a
ROLLBACKstatement without a
In a commit, a user explicitly or implicitly requested that the changes in the transaction be made permanent. Changes made by the transaction are permanent and visible to other users only after a transaction commits. The transaction shown in Figure 1 ends with a commit.
A user exits normally from most PostgreSQL Database utilities and tools, causing the current transaction to be implicitly committed. The commit behavior when a user disconnects is application-dependent and configurable.
Note: Applications should always explicitly commit or rollback transactions before program termination.
A client process terminates abnormally, causing the transaction to be implicitly rolled back using metadata stored in the transaction table and the undo.
After one transaction ends, the next executable SQL statement automatically starts the following transaction. The following example executes an
UPDATE to start a transaction, ends the transaction with a
ROLLBACK statement, and then executes an
UPDATE to start a new transaction (note that the transaction IDs are different):
BEGIN; UPDATE hr.employees SET salary = salary; SELECT xid, pg_xact_status(xid) AS status FROM pg_current_xact_id_if_assigned() AS xid; xid | status ----------+------------- (6,16,1) | in progress ROLLBACK; SELECT pg_current_xact_id_if_assigned() AS xid; xid ----- BEGIN; UPDATE hr.employees SET last_name = last_name; SELECT xid, pg_xact_status(xid) AS status FROM pg_current_xact_id_if_assigned() AS xid; xid | status ----------+------------- (7,16,1) | in progress
Redrock Postgres supports statement-level atomicity, which means that a SQL statement is an atomic unit of work and either completely succeeds or completely fails.
A successful statement is different from a committed transaction. A single SQL statement executes successfully if the database parses and runs it without error as an atomic unit, as when all rows are changed in a multirow update.
If a SQL statement causes an error during execution, then it is not successful and so all effects of the statement are rolled back. This operation is a statement-level rollback. This operation has the following characteristics:
A SQL statement that does not succeed causes the loss only of work it would have performed itself.
The unsuccessful statement does not cause the loss of any work that preceded it in the current transaction. For example, if the execution of the second UPDATE statement in Figure 1 causes an error and is rolled back, then the work performed by the first UPDATE statement is not rolled back. The first UPDATE statement can be committed or rolled back explicitly by the user.
The effect of the rollback is as if the statement had never been run.
Any side effects of an atomic statement, for example, triggers invoked upon execution of the statement, are considered part of the atomic statement. Either all work generated as part of the atomic statement succeeds or none does.
An example of an error causing a statement-level rollback is an attempt to insert a duplicate primary key. Single SQL statements involved in a deadlock, which is competition for the same data, can also cause a statement-level rollback. However, errors discovered during SQL statement parsing, such as a syntax error, have not yet been run and so do not cause a statement-level rollback.
A logical timestamp (
logicaltime) is a logical, internal time stamp used by Redrock Postgres Database. logicaltimes order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction. Redrock Postgres uses logicaltimes to mark the
logicaltime before which all changes are known to be on disk so that recovery avoids applying unnecessary redo. The database also uses logicaltimes to mark the point at which no redo exists for a set of data so that recovery can stop.
logicaltimes occur in a monotonically increasing sequence. Redrock Postgres can use an
logicaltime like a clock because an observed
logicaltime indicates a logical point in time and repeated observations return equal or greater values. If one event has a lower
logicaltime than another event, then it occurred at an earlier time with respect to the database. Several events may share the same
logicaltime, which means that they occurred at the same time with respect to the database.
Every transaction has an
logicaltime. For example, if a transaction updates a row, then the database records the logicaltime at which this update occurred. Other modifications in this transaction have the same
logicaltime. When a transaction commits, the database records an
logicaltime for this commit.
Redrock Postgres increments logicaltimes in the shared memory area. When a transaction modifies data, the database writes a new
logicaltime to the undo assigned to the transaction. The log writer process then writes the commit record of the transaction immediately to the online redo log. The commit record has the unique
logicaltime of the transaction. Redrock Postgres also uses logicaltimes as part of its instance recovery and media recovery mechanisms.
Transaction control is the management of changes made by SQL statements and the grouping of SQL statements into transactions. In general, application designers are concerned with transaction control so that work is accomplished in logical units and data is kept consistent.
Transaction control involves using the following statements:
BEGINinitiates a transaction block, that is, all statements after a
BEGINcommand will be executed in a single transaction until an explicit
COMMITstatement ends the current transaction and makes all changes performed in the transaction permanent.
COMMITalso erases all savepoints in the transaction and releases transaction locks.
ROLLBACKstatement reverses the work done in the current transaction; it causes all data changes since the last
ROLLBACKto be discarded. The
ROLLBACK TO SAVEPOINTstatement undoes the changes since the last savepoint but does not end the entire transaction.
SAVEPOINTstatement identifies a point in a transaction to which you can later roll back.
The session in Table 1 illustrates the basic concepts of transaction control.
Table 1 Transaction Control
|t0||BEGIN;||This statement begins a transaction block.|
|t1||UPDATE employees SET salary = 7000 WHERE last_name = ‘Banda’;||This statement updates the salary for Banda to 7000.|
|t2||SAVEPOINT after_banda_sal;||This statement creates a savepoint named
|t3||UPDATE employees SET salary = 12000 WHERE last_name = ‘Greene’;||This statement updates the salary for Greene to 12000.|
|t4||SAVEPOINT after_greene_sal;||This statement creates a savepoint named
|t5||ROLLBACK TO SAVEPOINT after_banda_sal;||This statement rolls back the transaction to t2, undoing the update to Greene’s salary at t3. The transaction has not ended.|
|t6||UPDATE employees SET salary = 11000 WHERE last_name = ‘Greene’;||This statement updates the salary for Greene to 11000 in transaction.|
|t7||ROLLBACK;||This statement rolls back all changes in transaction, ending the transaction.|
|t8||BEGIN;||This statement begins a new transaction block.|
|t9||UPDATE employees SET salary = 7050 WHERE last_name = ‘Banda’;||This statement updates the salary for Banda to 7050.|
|t10||UPDATE employees SET salary = 10950 WHERE last_name = ‘Greene’;||This statement updates the salary for Greene to 10950.|
|t11||COMMIT;||This statement commits all changes made in transaction, ending the transaction. The commit guarantees that the changes are saved in the online redo log files.|
An active transaction has started but not yet committed or rolled back. In Table 1, the first statement to modify data in the transaction is the update to Banda’s salary. From the successful execution of this update until the
ROLLBACK statement ends the transaction, the transaction is active.
Data changes made by a transaction are temporary until the transaction is committed or rolled back. Before the transaction ends, the state of the data is as follows:
Redrock Postgres has generated undo data information in the shared buffers.
The undo data contains the old data values changed by the SQL statements of the transaction.
Redrock Postgres has generated redo in the shared WAL buffers.
The redo log record contains the change to the data block and the change to the undo block.
Changes have been made to the database pages in shared buffers.
The data changes for a committed transaction, stored in the database pages in shared buffers, are not necessarily written immediately to the data files by the database writer (bgwriter). The disk write can happen before or after the commit.
The rows affected by the data change are locked.
Other users cannot change the data in the affected rows, nor can they see the uncommitted changes.
A savepoint is a user-declared intermediate marker within the context of a transaction. Internally, this marker resolves to an undo record position. Savepoints divide a long transaction into smaller parts.
If you use savepoints in a long transaction, then you have the option later of rolling back work performed before the current point in the transaction but after a declared savepoint within the transaction. Thus, if you make an error, you do not need to resubmit every statement. Table 1 creates savepoint
after_banda_sal so that the update to the Greene salary can be rolled back to this savepoint.
A rollback to a savepoint in an uncommitted transaction means undoing any changes made after the specified savepoint, but it does not mean a rollback of the transaction itself. When a transaction is rolled back to a savepoint, as when the
ROLLBACK TO SAVEPOINT after_banda_sal is run in Table 1, the following occurs:
Redrock Postgres rolls back only the statements run after the savepoint.
In Table 1, the
ROLLBACK TO SAVEPOINTcauses the
UPDATEfor Greene to be rolled back, but not the
Redrock Postgres preserves the savepoint specified in the
ROLLBACK TO SAVEPOINTstatement, but all subsequent savepoints are lost.
In Table 1, the
ROLLBACK TO SAVEPOINTcauses the
after_greene_salsavepoint to be lost.
Redrock Postgres releases all table and row locks acquired after the specified savepoint but retains all data locks acquired previous to the savepoint.
The transaction remains active and can be continued.
A rollback of an uncommitted transaction undoes any changes to data that have been performed by SQL statements within the transaction. After a transaction has been rolled back, the effects of the work done in the transaction no longer exist.
In rolling back an entire transaction, without referencing any savepoints, Redrock Postgres performs the following actions:
Undoes all changes made by all the SQL statements in the transaction by using the corresponding undo segments
The transaction table entry for every active transaction contains a pointer to all the undo data (in reverse order of application) for the transaction. The database reads the data from the undo segment, reverses the operation, and then marks the undo entry as applied. Thus, if a transaction inserts a row, then a rollback deletes it. If a transaction updates a row, then a rollback reverses the update. If a transaction deletes a row, then a rollback reinserts it. In Table 1, the
ROLLBACKreverses the updates to the salaries of Greene and Banda.
Releases all the locks of data held by the transaction
Erases all savepoints in the transaction
In Table 1, the
ROLLBACKdeletes the savepoint
after_greene_salsavepoint was removed by the
ROLLBACK TO SAVEPOINTstatement.
Ends the transaction
In Table 1, the
ROLLBACKleaves the database in the same state as it was after the initial
The duration of a rollback is a function of the amount of data modified.
A commit ends the current transaction and makes permanent all changes performed in the transaction. In Table 1, a second transaction begins with
BEGIN statement and ends with an explicit
COMMIT statement. The changes that resulted from the two
UPDATE statements are now made permanent.
When a transaction commits, the following actions occur:
A logical timestamp (
logicaltime) is generated for the
The internal transaction table for the associated undo tablespace records that the transaction has committed. The corresponding unique
logicaltimeof the transaction is assigned and recorded in the transaction table.
The session process writes remaining WAL records in the WAL buffers to the WAL log and writes the transaction
logicaltimeto the WAL log. This atomic event constitutes the commit of the transaction.
Redrock Postgres releases locks held on rows and tables.
Users who were enqueued waiting on locks held by the uncommitted transaction are allowed to proceed with their work.
Redrock Postgres deletes savepoints.
In Table 1, no savepoints existed in the transaction so no savepoints were erased.
Redrock Postgres performs a commit cleanout.
If modified blocks containing data from the committed transaction are still in the shared buffers, and if no other session is modifying them, then the database removes lock-related transaction information from the blocks. Ideally, the
COMMITcleans out the blocks so that a subsequent
SELECTdoes not have to perform this task.
Note: Because a block cleanout generates redo, a query may generate redo and thus cause blocks to be written during the next checkpoint.
Redrock Postgres marks the transaction complete.
After a transaction commits, users can view the changes.
Typically, a commit is a fast operation, regardless of the transaction size. The speed of a commit does not increase with the size of the data modified in the transaction. The lengthiest part of the commit is the physical disk I/O occured in WAL writes. However, the amount of time spent in transaction commit is reduced because
walwriter has been incrementally writing the contents of the WAL buffer in the background.
The default behavior is for session to write redo to the WAL log synchronously and for transactions to wait for the buffered redo to be on disk before returning a commit to the user. However, for lower transaction commit latency, application developers can specify that redo be written asynchronously so that transactions need not wait for the redo to be on disk and can return from the
COMMIT call immediately.