Memory Architecture
This section discusses the memory architecture of an Redrock Postgres database instance.
The shared memory area is a read/write memory area that, along with the server background processes, make up a database instance. All server processes can read information in the instance shared memory area, and write to the shared memory area during database operation.
The shared buffer area is the memory area that stores copies of data blocks read from data files. A buffer is a main memory address in which the buffer manager temporarily caches a currently or recently used data block. The shared buffer area can be sized by the parameter shared_buffers. Redrock Postgres loads pages within tables, indexes and undo segments from a persistent storage to here, and operates them directly.
The database uses internal algorithms to manage buffers in the shared buffer area. A buffer can be in any of the following mutually exclusive states:
-
Unused
The buffer is available for use because it has never been used or is currently unused. This type of buffer is the easiest for the database to use.
-
Clean
This buffer was used earlier and now contains a read-consistent version of a block as of a point in time. The block contains data but is “clean” so it does not need to be checkpointed. The database can pin the block and reuse it.
-
Dirty
The buffer contain modified data that has not yet been written to disk. The database must checkpoint the block before reusing it.
Every buffer has an access mode: pinned or free (unpinned). A buffer is “pinned” in the shared buffer area so that it does not age out of memory while a user session accesses it. Multiple sessions cannot modify a pinned buffer at the same time.
When a client requests data, Redrock Postgres retrieves buffers from the shared buffer area in either of the following modes:
-
Current mode
A current mode get is a retrieval of a block as it currently appears in the shared buffer area. For example, if an uncommitted transaction has updated two rows in a block, then a current mode get retrieves the block with these uncommitted rows. The database uses current mode gets most frequently during modification statements, which must update only the current version of the block.
-
Consistent mode
A consistent read get is a retrieval of a read-consistent version of a block. This retrieval may use undo data. For example, if an uncommitted transaction has updated two rows in a block, and if a query in a separate session requests the block, then the database uses undo data to create a read-consistent version of this block (called a consistent read clone) that does not include the uncommitted updates. Typically, a query retrieves blocks in consistent mode.
A logical I/O, also known as a buffer I/O, refers to reads and writes of buffers in the shared buffer area. When a requested buffer is not found in memory, the database performs a physical I/O to copy the buffer from storage device into memory, and then a logical I/O to read the cached buffer.
The WAL buffer area used to cache WAL data that has not yet been written to disk, it can be sized by the parameter wal_buffers.
The contents of the WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once.
The local memory area is memory specific to an operating process or thread that is not shared by other processes or threads on the system. Because the local memory area is process-specific, it is never allocated in the shared memory area.
The local buffer area within each database session. These are session-local buffers used only for access to temporary tables. A session will allocate local buffers as needed up to the limit given by temp_buffers.
The work memory area used for a query operation (such as a sort or hash table) before writing to temporary disk files, it can be limited by the parameter work_mem.
Note that for a complex query, several sort or hash operations might be running in parallel; each operation will generally be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem
; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY
, DISTINCT
, and merge joins. Hash tables are used in hash joins, hash-based aggregation, result cache nodes and hash-based processing of IN
subqueries.
The maintenance memory area used for maintenance operations, such as VACUUM
, CREATE INDEX
, and ALTER TABLE ADD FOREIGN KEY
, it can be limited by the parameter maintenance_work_mem.
When a database executes an SQL statement, it needs to parse the SQL to generate the corresponding execution plan, executes the plan and return the result. When accessing a table, it needs schema information about the table, such as the table’s column properties, object identifiers, statistics, and so on. PostgreSQL stores the schema information of the table in the system catalogs, so to access the table, you need to first obtain the schema information of the table in the system catalogs. For a PostgreSQL instance, access to system catalogs and normal table schemas is very frequent. To improve access efficiency, PostgreSQL introduced local cache area. The main parts of the local cache area are as follows:
-
Plan Cache Area
Used to cache execution plans for recently executed SQL statements. During the running of a stored procedure or function, the execution plan of the executed SQL statement is cached. Most drivers provide an interface to prepare a statement for execution, where you can create a prepared statement with parameters and bind parameter values to execute the query. These SQL statements that are prepared for queries, their execution plans are also cached. PostgreSQL provides the SQL command
DISCARD PLANS
to release all cached query plans in a session. -
Relation Cache Area
Used to cache schema information for recently accessed tables. Schema information for tables that you have accessed in a session is cached. If the accessed table is a partitioned table, the schema information of the partitions will also be cached.
-
Catalog Cache Area
Used to cache recently used tuples of system catalogs.