69.1. Database File Layout
This section describes the storage format at the level of files and directories.
  Traditionally, the configuration and data files used by a database
cluster are stored together within the cluster's data
directory, commonly referred to as
  
   PGDATA
  
  (after the name of the
environment variable that can be used to define it).  A common location for
  
   PGDATA
  
  is
  
   /var/lib/pgsql/data
  
  .  Multiple clusters,
managed by different server instances, can exist on the same machine.
 
  The
  
   PGDATA
  
  directory contains several subdirectories and control
files, as shown in
  
   Table 69.1
  
  .  In addition to
these required items, the cluster configuration files
  
   postgresql.conf
  
  ,
  
   pg_hba.conf
  
  , and
  
   pg_ident.conf
  
  are traditionally stored in
  
   PGDATA
  
  , although it is possible to place them elsewhere.
 
   
    Table 69.1. Contents of
    
     PGDATA
    
   
  
| Item | Description | 
|---|---|
| 
        PG_VERSION
        | A file containing the major version number of PostgreSQL | 
| 
        base
        | Subdirectory containing per-database subdirectories | 
| 
        current_logfiles
        | File recording the log file(s) currently written to by the logging collector | 
| 
        global
        | Subdirectory containing cluster-wide tables, such as 
        pg_database
        | 
| 
        pg_commit_ts
        | Subdirectory containing transaction commit timestamp data | 
| 
        pg_dynshmem
        | Subdirectory containing files used by the dynamic shared memory subsystem | 
| 
        pg_logical
        | Subdirectory containing status data for logical decoding | 
| 
        pg_multixact
        | Subdirectory containing multitransaction status data (used for shared row locks) | 
| 
        pg_notify
        | Subdirectory containing LISTEN/NOTIFY status data | 
| 
        pg_replslot
        | Subdirectory containing replication slot data | 
| 
        pg_serial
        | Subdirectory containing information about committed serializable transactions | 
| 
        pg_snapshots
        | Subdirectory containing exported snapshots | 
| 
        pg_stat
        | Subdirectory containing permanent files for the statistics subsystem | 
| 
        pg_stat_tmp
        | Subdirectory containing temporary files for the statistics subsystem | 
| 
        pg_subtrans
        | Subdirectory containing subtransaction status data | 
| 
        pg_tblspc
        | Subdirectory containing symbolic links to tablespaces | 
| 
        pg_twophase
        | Subdirectory containing state files for prepared transactions | 
| 
        pg_wal
        | Subdirectory containing WAL (Write Ahead Log) files | 
| 
        pg_xact
        | Subdirectory containing transaction commit status data | 
| 
        postgresql.auto.conf
        | A file used for storing configuration parameters that are set by 
        ALTER SYSTEM
        | 
| 
        postmaster.opts
        | A file recording the command-line options the server was last started with | 
| 
        postmaster.pid
        | A lock file recording the current postmaster process ID (PID),
  cluster data directory path,
  postmaster start timestamp,
  port number,
  Unix-domain socket directory path (empty on Windows),
  first valid listen_address (IP address or 
        *
       , or empty if
  not listening on TCP),
  and shared memory segment ID
  (this file is not present after server shutdown) | 
  For each database in the cluster there is a subdirectory within
  
   PGDATA
  
  
   /base
  
  , named after the database's OID in
  
   pg_database
  
  .  This subdirectory is the default location
for the database's files; in particular, its system catalogs are stored
there.
 
  Note that the following sections describe the behavior of the builtin
  
   heap
  
  
   table access method
  
  ,
 and the builtin
  
   index access methods
  
  . Due
 to the extensible nature of
  
   PostgreSQL
  
  , other
 access methods might work differently.
 
  Each table and index is stored in a separate file.  For ordinary relations,
these files are named after the table or index's
  
   filenode
  
  number,
which can be found in
  
   pg_class
  
  .
  
   relfilenode
  
  . But
for temporary relations, the file name is of the form
  
   t
   
    
  , where
  
   
     BBB
    
   
   _
   
    
     FFF
    
   
  
    BBB
   
  
  is the backend ID of the backend which created the file, and
  
   
    FFF
   
  
  is the filenode number.  In either case, in addition to the main file (a/k/a
main fork), each table and index has a
  
   free space map
  
  (see
  
   Section 69.3
  
  ), which stores information about free space available in
the relation.  The free space map is stored in a file named with the filenode
number plus the suffix
  
   _fsm
  
  .  Tables also have a
  
   visibility map
  
  , stored in a fork with the suffix
  
   _vm
  
  ,
to track which pages are known to have no dead tuples.  The visibility map is
described further in
  
   Section 69.4
  
  .  Unlogged tables and indexes
have a third fork, known as the initialization fork, which is stored in a fork
with the suffix
  
   _init
  
  (see
  
   Section 69.5
  
  ).
 
Caution
   Note that while a table's filenode often matches its OID, this is
   
    
     not
    
   
   necessarily the case; some operations, like
   
    TRUNCATE
   
   ,
   
    REINDEX
   
   ,
   
    CLUSTER
   
   and some forms
of
   
    ALTER TABLE
   
   , can change the filenode while preserving the OID.
Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including
   
    pg_class
   
   itself,
   
    pg_class
   
   .
   
    relfilenode
   
   contains zero.  The
actual filenode number of these catalogs is stored in a lower-level data
structure, and can be obtained using the
   
    pg_relation_filenode()
   
   function.
  
  When a table or index exceeds 1 GB, it is divided into gigabyte-sized
  
   segments
  
  .  The first segment's file name is the same as the
filenode; subsequent segments are named filenode.1, filenode.2, etc.
This arrangement avoids problems on platforms that have file size limitations.
(Actually, 1 GB is just the default segment size.  The segment size can be
adjusted using the configuration option
  
   --with-segsize
  
  when building
  
   PostgreSQL
  
  .)
In principle, free space map and visibility map forks could require multiple
segments as well, though this is unlikely to happen in practice.
 
  A table that has columns with potentially large entries will have an
associated
  
   TOAST
  
  table, which is used for out-of-line storage of
field values that are too large to keep in the table rows proper.
  
   pg_class
  
  .
  
   reltoastrelid
  
  links from a table to
its
  
   TOAST
  
  table, if any.
See
  
   Section 69.2
  
  for more information.
 
The contents of tables and indexes are discussed further in Section 69.6 .
  Tablespaces make the scenario more complicated.  Each user-defined tablespace
has a symbolic link inside the
  
   PGDATA
  
  
   /pg_tblspc
  
  directory, which points to the physical tablespace directory (i.e., the
location specified in the tablespace's
  
   CREATE TABLESPACE
  
  command).
This symbolic link is named after
the tablespace's OID.  Inside the physical tablespace directory there is
a subdirectory with a name that depends on the
  
   PostgreSQL
  
  server version, such as
  
   PG_9.0_201008051
  
  .  (The reason for using
this subdirectory is so that successive versions of the database can use
the same
  
   CREATE TABLESPACE
  
  location value without conflicts.)
Within the version-specific subdirectory, there is
a subdirectory for each database that has elements in the tablespace, named
after the database's OID.  Tables and indexes are stored within that
directory, using the filenode naming scheme.
The
  
   pg_default
  
  tablespace is not accessed through
  
   pg_tblspc
  
  , but corresponds to
  
   PGDATA
  
  
   /base
  
  .  Similarly, the
  
   pg_global
  
  tablespace is not accessed through
  
   pg_tblspc
  
  , but corresponds to
  
   PGDATA
  
  
   /global
  
  .
 
  The
  
   pg_relation_filepath()
  
  function shows the entire path
(relative to
  
   PGDATA
  
  ) of any relation.  It is often useful
as a substitute for remembering many of the above rules.  But keep in
mind that this function just gives the name of the first segment of the
main fork of the relation - you may need to append a segment number
and/or
  
   _fsm
  
  ,
  
   _vm
  
  , or
  
   _init
  
  to find all
the files associated with the relation.
 
  Temporary files (for operations such as sorting more data than can fit in
memory) are created within
  
   PGDATA
  
  
   /base/pgsql_tmp
  
  ,
or within a
  
   pgsql_tmp
  
  subdirectory of a tablespace directory
if a tablespace other than
  
   pg_default
  
  is specified for them.
The name of a temporary file has the form
  
   pgsql_tmp
   
    
  ,
where
  
   
     PPP
    
   
   .
   
    
     NNN
    
   
  
    PPP
   
  
  is the PID of the owning backend and
  
   
    NNN
   
  
  distinguishes different temporary files of that backend.