Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system.
Locale support is automatically initialized when a database cluster is created using initdb. By default, initdb initializes the database cluster with the locale setting of its execution environment, so if your system is already set to use the locale that you want in your database cluster, there is nothing else you need to do. If you want to use a different locale (or you are not sure which locale your system is set to), you can tell initdb exactly which locale to use by specifying the --locale option. For example:
$ initdb --locale=zh_CN
This example for Unix systems sets the locale to Simplified Chinese (zh) as spoken in China (CN). Other possibilities might include en_US (U.S. English) and fr_CA (French Canadian). If more than one character set can be used for a locale, then the specification can take the form language_territory.codeset. For example, fr_BE.UTF-8 represents the French language (fr) as spoken in Belgium (BE), with a UTF-8 character set encoding.
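The language_territory.codeset form can be taken apart with plain POSIX shell parameter expansion. The following is a minimal sketch, assuming the name contains all three parts; the variable names are illustrative and are not part of PostgreSQL or of any locale tool.

```shell
# Split a locale name of the form language_territory.codeset.
# Assumes all three parts are present; variable names are illustrative only.
name="fr_BE.UTF-8"

language=${name%%_*}     # text before the first "_"
rest=${name#*_}          # text after the first "_"
territory=${rest%%.*}    # text before the first "."
codeset=${name#*.}       # text after the first "."

echo "language=$language territory=$territory codeset=$codeset"
```

For a name without a codeset, such as fr_CA, the last expansion would return the whole name unchanged, so a real script would need to check for the "." first.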
Occasionally it is useful to mix rules from several locales, e.g., use English collation rules but Spanish messages. To support that, a set of locale subcategories exists that control only certain aspects of the localization rules:
|LC_COLLATE|String sort order|
|LC_CTYPE|Character classification (What is a letter? Its upper-case equivalent?)|
|LC_MESSAGES|Language of messages|
|LC_MONETARY|Formatting of currency amounts|
|LC_NUMERIC|Formatting of numbers|
|LC_TIME|Formatting of dates and times|
The category names translate into names of initdb options to override the locale choice for a specific category. For instance, to set the locale to French Canadian, but use U.S. rules for formatting currency, use initdb --locale=fr_CA --lc-monetary=en_US.
If you want the system to behave as if it had no locale support, use the special locale name C, or equivalently POSIX.
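To get a feel for what the C locale means for ordering, the operating system's own sort utility can be run under it. This is only a sketch using the standard sort command rather than PostgreSQL itself; the four sample strings are arbitrary.

```shell
# Sort four sample strings under the C locale, where comparison is by byte value.
sorted=$(printf '%s\n' b A a B | LC_ALL=C sort)
echo "$sorted"
# Prints A, B, a, b (one per line): uppercase ASCII letters have
# lower byte values than lowercase ones.
```

Under a language-specific locale, the same input would typically come out with upper- and lower-case forms of each letter grouped together, though the exact order depends on the locale definitions installed on the system.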
Some locale categories must have their values fixed when the database is created. You can use different settings for different databases, but once a database is created, you cannot change them for that database.
These categories are LC_COLLATE and LC_CTYPE. They affect the sort order of indexes, so they must be kept fixed, or indexes on text columns would become corrupt. (But you can alleviate this restriction using collations, as discussed in Section 23.2.) The default values for these categories are determined when initdb is run, and those values are used when new databases are created, unless specified otherwise in the CREATE DATABASE command.
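As a sketch of overriding the cluster defaults for a single database, the snippet below builds a CREATE DATABASE statement and prints it. The database name and locale are hypothetical, and the statement is only printed so that the snippet runs without a server.

```shell
# Hypothetical database name and locale, chosen only for illustration.
dbname="sales_fr"
db_locale="fr_CA.UTF-8"

# template0 is named explicitly because a new database must match its
# template's encoding and locale settings unless the template is template0.
sql="CREATE DATABASE $dbname TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE '$db_locale' LC_CTYPE '$db_locale';"

echo "$sql"
# To apply it against a running server (requires psql and suitable privileges):
#   psql -d postgres -c "$sql"
```

The chosen locale must be installed on the server's operating system and must be compatible with the chosen encoding, otherwise the command is rejected.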
The other locale categories can be changed whenever desired by setting the server configuration parameters that have the same name as the locale categories (see Section 19.11.2 for details). The values that are chosen by initdb are actually only written into the configuration file postgresql.conf to serve as defaults when the server is started. If you remove these assignments from postgresql.conf then the server will inherit the settings from its execution environment.
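The assignments in question can be located with a simple search. The sketch below writes a small stand-in file so that it runs without a real cluster; the file contents and locale values are assumptions. On an actual installation you would point grep at postgresql.conf in the data directory instead.

```shell
# Create a small sample file standing in for postgresql.conf (contents are illustrative).
conf=$(mktemp)
cat > "$conf" <<'EOF'
max_connections = 100
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
EOF

# Show only the locale-related assignments.
lc_lines=$(grep '^lc_' "$conf")
echo "$lc_lines"

rm -f "$conf"
```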
Note that the locale behavior of the server is determined by the environment variables seen by the server, not by the environment of any client. Therefore, be careful to configure the correct locale settings before starting the server. A consequence of this is that if client and server are set up in different locales, messages might appear in different languages depending on where they originated.
Which locales are available on your system, and under what names, depends on what was provided by the operating system vendor and what was installed. On most Unix systems, you can list all available locales with the following command:
$ locale -a
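The list is often long, so it can help to filter it. A minimal sketch, assuming a Unix system where the locale command is present; the fr_ prefix is just an example.

```shell
# List installed locales whose names start with "fr_" (the prefix is only an example).
# grep exits non-zero when nothing matches, so "|| true" keeps the script going.
french_locales=$(locale -a 2>/dev/null | grep '^fr_' || true)

if [ -n "$french_locales" ]; then
    echo "$french_locales"
else
    echo "no fr_* locales installed"
fi
```

Any name printed this way can be passed to the --locale option of initdb.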
The locale can be set by using the locale names, languages, country/region codes, and code pages that are supported by the Windows NLS API. The locale value takes one of the following forms: “locale-name”, “language[_country-region[.code-page]]”, “.code-page”, C, or an empty string.
The “locale-name” form is a short, IETF-standardized string; for example, en-US for English (United States) or zh-CN for Simplified Chinese (People’s Republic of China). These forms are preferred. For a list of supported locale names by Windows operating system version, see the Language tag column of the table in Appendix A: Product Behavior in [MS-LCID]: Windows Language Code Identifier (LCID) Reference. This resource lists the supported language, script, and region parts of the locale names. For information about the supported locale names that have non-default sort orders, see the Locale name column in Sort order identifiers. Under Windows 10 or later, locale names that correspond to valid BCP-47 language tags are allowed. For example, jp-US is a valid BCP-47 tag, but it’s effectively only US for locale functionality.
The “language[_country-region[.code-page]]” form is stored in the locale setting for a category when a language string, or language string and country or region string, is used to create the locale. The supported language strings are described in Language strings, and the supported country and region strings are listed in Country/Region strings. If the specified language isn’t associated with the specified country or region, the default language for the specified country or region is stored in the locale setting. We don’t recommend this form for locale strings embedded in code or serialized to storage, because these strings are more likely to be changed by an operating system update than the locale name form.
The “code-page” is the ANSI/OEM code page that’s associated with the locale. The code page is determined for you when you specify a locale by language or by language and country/region alone. The special value .ACP specifies the ANSI code page for the country/region. The special value .OCP specifies the OEM code page for the country/region. For example, if you specify "Greek_Greece.ACP" as the locale, the locale is stored as Greek_Greece.1253 (the ANSI code page for Greek), and if you specify "Greek_Greece.OCP" as the locale, it’s stored as Greek_Greece.737 (the OEM code page for Greek). For more information about code pages, see Code pages. For a list of supported code pages on Windows, see Code page identifiers.
If you use only the code page to specify the locale, the user’s default language and country/region are used. For example, if you specify ".1254" (ANSI Turkish) as the locale for a user that’s configured for English (United States), the locale that’s stored is English_United States.1254. We don’t recommend this form, because it could lead to inconsistent behavior.
A locale value of C specifies the minimal ANSI conforming environment for C translation. The C locale assumes that every char data type is 1 byte and its value is always less than 256. If locale points to an empty string, the locale is the implementation-defined native environment.