Physical Storage Structures
This section describes the storage format at the level of files and directories.
Traditionally, the configuration and data files used by a database instance are stored together within the instance’s data directory, commonly referred to as PGDATA
(after the name of the environment variable that can be used to define it). A common location for PGDATA
is /var/lib/pgsql/data
. Multiple database instances, managed by different server instances, can exist on the same machine.
The PGDATA
directory contains several subdirectories and control files, as shown in Table 1. In addition to these required items, the instance configuration files postgresql.conf
, pg_hba.conf
, and pg_ident.conf
are traditionally stored in PGDATA
, although it is possible to place them elsewhere.
Table 1. Contents of PGDATA
Item | Description |
---|---|
PG_VERSION |
A file containing the major version number of PostgreSQL |
base |
Subdirectory containing per-database subdirectories |
global |
Subdirectory containing instance-wide tables, such as pg_database |
pg_dynshmem |
Subdirectory containing files used by the dynamic shared memory subsystem |
pg_logical |
Subdirectory containing status data for logical decoding |
pg_notify |
Subdirectory containing LISTEN/NOTIFY status data |
pg_replslot |
Subdirectory containing replication slot data |
pg_snapshots |
Subdirectory containing exported snapshots |
pg_stat |
Subdirectory containing permanent files for the statistics subsystem |
pg_tblspc |
Subdirectory containing configuration files of tablespace storage locations |
pg_tmp |
Subdirectory that contains temporary files for instance runtime. It contains: Subdirectory caches contains temporary files for the relation cache data for process startup initialization;Subdirectory stats contains temporary files for the statistics subsystem;File current_logfiles records the log file(s) currently written to by the logging collector. |
pg_wal |
Subdirectory containing WAL (Write Ahead Log) files |
postgresql.auto.conf |
A file used for storing configuration parameters that are set by ALTER SYSTEM |
postmaster.opts |
A file recording the command-line options the server was last started with |
postmaster.pid |
A lock file recording the current postmaster process ID (PID), cluster data directory path, postmaster start timestamp, port number, Unix-domain socket directory path (could be empty), first valid listen_address (IP address or * , or empty if not listening on TCP), and shared memory segment ID (this file is not present after server shutdown) |
For each database in the instance there is a subdirectory within PGDATA/base
, named after the database’s OID in pg_database
. This subdirectory is the default location for the database’s files; in particular, its system catalogs are stored there.
Each table and index is stored in a separate file. For ordinary relations, these files are named after the table or index’s filenode number, which can be found in pg_class.relfilenode
. But for temporary relations, the file name is of the form tBBB_FFF, where BBB is the backend ID of the backend which created the file, and FFF is the filenode number. In either case, in addition to the main file (a/k/a main fork), each table and index has a free space map (see Section 68.3), which stores information about free space available in the relation. The free space map is stored in a file named with the filenode number plus the suffix _fsm
. Unlogged tables and indexes have a second fork, known as the initialization fork, which is stored in a fork with the suffix _init
(see Section 68.5).
Note that while a table’s filenode often matches its OID, this is not necessarily the case; some operations, likeTRUNCATE
,REINDEX
,CLUSTER
and some forms ofALTER TABLE
, can change the filenode while preserving the OID. Avoid assuming that filenode and table OID are the same. Also, for certain system catalogs includingpg_class
itself,pg_class.relfilenode
contains zero. The actual filenode number of these catalogs is stored in a lower-level data structure, and can be obtained using thepg_relation_filenode()
function.
When a table or index exceeds 1 GB, it is divided into gigabyte-sized segments. The first segment’s file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, etc. This arrangement avoids problems on platforms that have file size limitations. In principle, free space map forks could require multiple segments as well, though this is unlikely to happen in practice.
A table that has columns with potentially large entries will have an associated TOAST table, which is used for out-of-line storage of field values that are too large to keep in the table rows proper. pg_class.reltoastrelid
links from a table to its TOAST table, if any. See Section 68.2 for more information.
The contents of tables and indexes are discussed further in Section 68.6.
Tablespaces make the scenario more complicated. Each database has a configuration file inside the PGDATA/pg_tblspc
directory, which records the physical tablespace storage location (i.e., the location specified in the tablespace’s CREATE TABLESPACE
command). This configuration file is named after the database’s OID plus the suffix .tsm
. Inside the physical tablespace directory, there is a subdirectory for each database that has elements in the tablespace, named after the database’s OID. Tables and indexes are stored within that directory, using the filenode naming scheme.
The pg_relation_filepath()
function shows the entire path (relative to PGDATA
) of any relation. It is often useful as a substitute for remembering many of the above rules. But keep in mind that this function just gives the name of the first segment of the main fork of the relation — you may need to append a segment number and/or _fsm
, or _init
to find all the files associated with the relation.
Temporary files (for operations such as sorting more data than can fit in memory) are created within PGDATA/base/pg_tmp
, or within a pg_tmp
subdirectory of a tablespace directory if a tablespace other than pg_default
is specified for them. The name of a temporary file has the form pgtmp_PPP.NNN, where PPP is the PID of the owning backend and NNN distinguishes different temporary files of that backend.