61.1. Basic API Structure for Indexes
Each index access method is described by a row in the
pg_am
system catalog. The
pg_am
entry
specifies a name and a
handler function
for the access
method. These entries can be created and deleted using the
CREATE ACCESS METHOD
and
DROP ACCESS METHOD
SQL commands.
An index access method handler function must be declared to accept a
single argument of type
internal
and to return the
pseudo-type
index_am_handler
. The argument is a dummy value that
simply serves to prevent handler functions from being called directly from
SQL commands. The result of the function must be a palloc'd struct of
type
IndexAmRoutine
, which contains everything
that the core code needs to know to make use of the index access method.
The
IndexAmRoutine
struct, also called the access
method's
API struct
, includes fields specifying assorted
fixed properties of the access method, such as whether it can support
multicolumn indexes. More importantly, it contains pointers to support
functions for the access method, which do all of the real work to access
indexes. These support functions are plain C functions and are not
visible or callable at the SQL level. The support functions are described
in
Section 61.2
.
The structure
IndexAmRoutine
is defined thus:
typedef struct IndexAmRoutine { NodeTag type; /* * Total number of strategies (operators) by which we can traverse/search * this AM. Zero if AM does not have a fixed set of strategy assignments. */ uint16 amstrategies; /* total number of support functions that this AM uses */ uint16 amsupport; /* does AM support ORDER BY indexed column's value? */ bool amcanorder; /* does AM support ORDER BY result of an operator on indexed column? */ bool amcanorderbyop; /* does AM support backward scanning? */ bool amcanbackward; /* does AM support UNIQUE indexes? */ bool amcanunique; /* does AM support multi-column indexes? */ bool amcanmulticol; /* does AM require scans to have a constraint on the first index column? */ bool amoptionalkey; /* does AM handle ScalarArrayOpExpr quals? */ bool amsearcharray; /* does AM handle IS NULL/IS NOT NULL quals? */ bool amsearchnulls; /* can index storage data type differ from column data type? */ bool amstorage; /* can an index of this type be clustered on? */ bool amclusterable; /* does AM handle predicate locks? */ bool ampredlocks; /* does AM support parallel scan? */ bool amcanparallel; /* does AM support columns included with clause INCLUDE? */ bool amcaninclude; /* type of data stored in index, or InvalidOid if variable */ Oid amkeytype; /* interface functions */ ambuild_function ambuild; ambuildempty_function ambuildempty; aminsert_function aminsert; ambulkdelete_function ambulkdelete; amvacuumcleanup_function amvacuumcleanup; amcanreturn_function amcanreturn; /* can be NULL */ amcostestimate_function amcostestimate; amoptions_function amoptions; amproperty_function amproperty; /* can be NULL */ amvalidate_function amvalidate; ambeginscan_function ambeginscan; amrescan_function amrescan; amgettuple_function amgettuple; /* can be NULL */ amgetbitmap_function amgetbitmap; /* can be NULL */ amendscan_function amendscan; ammarkpos_function ammarkpos; /* can be NULL */ amrestrpos_function amrestrpos; /* can be NULL */ /* interface functions to support parallel index scans */ amestimateparallelscan_function amestimateparallelscan; /* can be NULL */ aminitparallelscan_function aminitparallelscan; /* can be NULL */ amparallelrescan_function amparallelrescan; /* can be NULL */ } IndexAmRoutine;
To be useful, an index access method must also have one or more
operator families
and
operator classes
defined in
pg_opfamily
,
pg_opclass
,
pg_amop
, and
pg_amproc
.
These entries allow the planner
to determine what kinds of query qualifications can be used with
indexes of this access method. Operator families and classes are described
in
Section 38.15
, which is prerequisite material for reading
this chapter.
An individual index is defined by a
pg_class
entry that describes it as a physical relation, plus a
pg_index
entry that shows the logical content of the index - that is, the set
of index columns it has and the semantics of those columns, as captured by
the associated operator classes. The index columns (key values) can be
either simple columns of the underlying table or expressions over the table
rows. The index access method normally has no interest in where the index
key values come from (it is always handed precomputed key values) but it
will be very interested in the operator class information in
pg_index
. Both of these catalog entries can be
accessed as part of the
Relation
data structure that is
passed to all operations on the index.
Some of the flag fields of
IndexAmRoutine
have nonobvious
implications. The requirements of
amcanunique
are discussed in
Section 61.5
.
The
amcanmulticol
flag asserts that the
access method supports multicolumn indexes, while
amoptionalkey
asserts that it allows scans
where no indexable restriction clause is given for the first index column.
When
amcanmulticol
is false,
amoptionalkey
essentially says whether the
access method supports full-index scans without any restriction clause.
Access methods that support multiple index columns
must
support scans that omit restrictions on any or all of the columns after
the first; however they are permitted to require some restriction to
appear for the first index column, and this is signaled by setting
amoptionalkey
false.
One reason that an index AM might set
amoptionalkey
false is if it doesn't index
null values. Since most indexable operators are
strict and hence cannot return true for null inputs,
it is at first sight attractive to not store index entries for null values:
they could never be returned by an index scan anyway. However, this
argument fails when an index scan has no restriction clause for a given
index column. In practice this means that
indexes that have
amoptionalkey
true must
index nulls, since the planner might decide to use such an index
with no scan keys at all. A related restriction is that an index
access method that supports multiple index columns
must
support indexing null values in columns after the first, because the planner
will assume the index can be used for queries that do not restrict
these columns. For example, consider an index on (a,b) and a query with
WHERE a = 4
. The system will assume the index can be
used to scan for rows with
a = 4
, which is wrong if the
index omits rows where
b
is null.
It is, however, OK to omit rows where the first indexed column is null.
An index access method that does index nulls may also set
amsearchnulls
, indicating that it supports
IS NULL
and
IS NOT NULL
clauses as search
conditions.