Tablespaces

Tablespaces in Redrock Postgres allow database administrators to define locations where the files representing database objects can be stored. Once created, a tablespace can be referred to by name when creating database objects.

By using tablespaces, an administrator can control the storage layout of a Redrock Postgres instance. This is useful in at least two ways. First, if the partition or volume on which the instance was initialized runs out of space and cannot be extended, a tablespace can be created on a different partition and used until the system can be reconfigured.

Second, tablespaces allow an administrator to use knowledge of the usage pattern of database objects to optimize performance. For example, an index which is very heavily used can be placed on a very fast, highly available disk, such as an expensive solid state device. At the same time a table storing archived data which is rarely used or not performance critical could be stored on a less expensive, slower disk system.

Even though located outside the main PostgreSQL data directory, tablespaces are an integral part of the database instance and cannot be treated as an autonomous collection of data files. They are dependent on metadata contained in the main data directory, and therefore cannot be attached to a different database instance or backed up individually. Similarly, if you lose a tablespace (file deletion, disk failure, etc), the database instance might become unreadable or unable to start. Placing a tablespace on a temporary file system like a RAM disk risks the reliability of the entire instance.

Local Tablespace

PostgreSQL requires symbolic links provided by the operating system to use local tablespaces. Redrock Postgres does not have this restriction. This means that you can use local tablespaces on any operating system. To define a local tablespace, use the CREATE TABLESPACE command, for example:

CREATE TABLESPACE fastspace LOCATION '/ssd1/postgresql/data';

The directory must belong to the PostgreSQL operating system user. In addition, it is recommended that the directory be an existing empty directory. All objects subsequently created within the tablespace will be stored in files underneath this directory. The location must not be on removable or transient storage, as the instance might fail to function if the tablespace is missing or lost.

There is usually not much point in making more than one tablespace per logical file system, since you cannot control the location of individual files within a logical file system. However, PostgreSQL does not enforce any such limitation, and indeed it is not directly aware of the file system boundaries on your system. It just stores files in the directories you tell it to use.

Creation of the tablespace itself must be done as a database superuser, but after that you can allow ordinary database users to use it. To do that, grant them the CREATE privilege on it.

Network Attached Tablespace

Redrock Postgres supports database servers running in storage server mode and can provide storage access services to other database servers connected to the network. We call these servers used to store and access data as database storage servers, and those servers that are mainly used for computing and processing data are called database computing servers. The computing server communicates with the storage server using the PostgreSQL frontend/backend protocol.

We can create a network tablespace on the storage server and then refer to it by tablespace name when creating database objects. If you have finished deploying the storage server, you can create a network tablespace on the storage server. To define a network attached tablespace, use the CREATE TABLESPACE command, for example:

CREATE TABLESPACE netspace STORAGE remote LOCATION
  'host=192.168.1.50 port=5432 user=postgres password=pgpass';

All objects created in this tablespace are stored in files in the data directory of the storage server. During database service running, you need to ensure that the storage server is always accessible, as the instance might fail to function if the tablespace is missing or lost.

Using Tablespaces

Tables, indexes, undos and entire databases can be assigned to particular tablespaces. To do so, a user with the CREATE privilege on a given tablespace must pass the tablespace name as a parameter to the relevant command. For example, the following creates a table in the tablespace space1:

CREATE TABLE foo(i int) TABLESPACE space1;

Alternatively, use the default_tablespace parameter:

SET default_tablespace = space1;
CREATE TABLE foo(i int);

When default_tablespace is set to anything but an empty string, it supplies an implicit TABLESPACE clause for CREATE TABLE and CREATE INDEX commands that do not have an explicit one.

There is also a temp_tablespaces parameter, which determines the placement of temporary tables and indexes, as well as temporary files that are used for purposes such as sorting large data sets. This can be a list of tablespace names, rather than only one, so that the load associated with temporary objects can be spread over multiple tablespaces. A random member of the list is picked each time a temporary object is to be created.

In Redrock Postgres, when a database is created, a default tablespace is created in the database. The name of the tablespace is pg_default. This tablespace will be used to store the system catalogs of the database. Furthermore, it is the default tablespace used for tables, indexes, undos, and temporary files created within the database, if no TABLESPACE clause is given and no other selection is specified by default_tablespace or temp_tablespaces (as appropriate). If a database is created without specifying tablespace or location information, it will create a new tablespace with the same location information as its template database.

When the database instance is initialized, two tablespaces are automatically created for each database. The pg_global tablespace is used for instance shared system catalogs, and the initial location of this tablespace is in the $PGDATA/global directory. The pg_default tablespace is used for database-level system catalogs and is initially located in the $PGDATA/base directory.

In PostgreSQL, tablespaces are instance-level objects, while in Redrock Postgres, tablespaces are database-level objects. This means that when you access different database objects in the same instance, although you may see tablespaces with the same name or location information, in fact, they are different tablespace objects in their respective databases. When Redrock Postgres creates a tablespace, it will create a directory named after the database object identifier under the specified location, so that different databases can also have their own tablespaces under the same location.

Managing Tablespaces

Once created, a tablespace can be used by any user in the database, provided the requesting user has sufficient privilege. This means that a tablespace cannot be dropped until all objects using the tablespace have been removed.

To remove an empty tablespace, use the DROP TABLESPACE command.

To determine the set of existing tablespaces, examine the pg_tablespace system catalog, for example

SELECT spcname FROM pg_tablespace;

The psql program’s \db meta-command is also useful for listing the existing tablespaces.

The $PGDATA/pg_tblspc directory stores the storage location configuration files of tablespaces in each database. These files are named in the format of [db_oid].tsm. Although not recommended, it is possible to adjust the tablespace layout by manually modifying these storage location profiles. Under no circumstances perform this operation while the server is running.