62.2. Index Access Method Functions
The index construction and maintenance functions that an index access
method must provide in
IndexAmRoutine
are:
IndexBuildResult * ambuild (Relation heapRelation, Relation indexRelation, IndexInfo *indexInfo);
Build a new index. The index relation has been physically created,
but is empty. It must be filled in with whatever fixed data the
access method requires, plus entries for all tuples already existing
in the table. Ordinarily the
ambuild
function will call
table_index_build_scan()
to scan the table for existing tuples
and compute the keys that need to be inserted into the index.
The function must return a palloc'd struct containing statistics about
the new index.
The
amcanbuildparallel
flag indicates whether
the access method supports parallel index builds. When set to
true
,
the system will attempt to allocate parallel workers for the build.
Access methods supporting only non-parallel index builds should leave
this flag set to
false
.
void ambuildempty (Relation indexRelation);
Build an empty index, and write it to the initialization fork (
INIT_FORKNUM
)
of the given relation. This method is called only for unlogged indexes; the
empty index written to the initialization fork will be copied over the main
relation fork on each server restart.
bool aminsert (Relation indexRelation, Datum *values, bool *isnull, ItemPointer heap_tid, Relation heapRelation, IndexUniqueCheck checkUnique, bool indexUnchanged, IndexInfo *indexInfo);
Insert a new tuple into an existing index. The
values
and
isnull
arrays give the key values to be indexed, and
heap_tid
is the TID to be indexed.
If the access method supports unique indexes (its
amcanunique
flag is true) then
checkUnique
indicates the type of uniqueness check to
perform. This varies depending on whether the unique constraint is
deferrable; see
Section 62.5
for details.
Normally the access method only needs the
heapRelation
parameter when performing uniqueness checking (since then it will have to
look into the heap to verify tuple liveness).
The
indexUnchanged
Boolean value gives a hint
about the nature of the tuple to be indexed. When it is true,
the tuple is a duplicate of some existing tuple in the index. The
new tuple is a logically unchanged successor MVCC tuple version. This
happens when an
UPDATE
takes place that does not
modify any columns covered by the index, but nevertheless requires a
new version in the index. The index AM may use this hint to decide
to apply bottom-up index deletion in parts of the index where many
versions of the same logical row accumulate. Note that updating a non-key
column or a column that only appears in a partial index predicate does not
affect the value of
indexUnchanged
. The core code
determines each tuple's
indexUnchanged
value using a low
overhead approach that allows both false positives and false negatives.
Index AMs must not treat
indexUnchanged
as an
authoritative source of information about tuple visibility or versioning.
The function's Boolean result value is significant only when
checkUnique
is
UNIQUE_CHECK_PARTIAL
.
In this case a true result means the new entry is known unique, whereas
false means it might be non-unique (and a deferred uniqueness check must
be scheduled). For other cases a constant false result is recommended.
Some indexes might not index all tuples. If the tuple is not to be
indexed,
aminsert
should just return without doing anything.
If the index AM wishes to cache data across successive index insertions
within an SQL statement, it can allocate space
in
indexInfo->ii_Context
and store a pointer to the
data in
indexInfo->ii_AmCache
(which will be NULL
initially). If resources other than memory have to be released after
index insertions,
aminsertcleanup
may be provided,
which will be called before the memory is released.
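The caching pattern described above can be sketched as follows. This is a minimal illustration only: ToyIndexInfo, ToyInsertCache, toy_pins_held, and both functions are hypothetical stand-ins rather than PostgreSQL definitions, and calloc stands in for an allocation in indexInfo->ii_Context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the one IndexInfo field used here; not the real struct. */
typedef struct ToyIndexInfo
{
    void *ii_AmCache;           /* NULL until the AM stores a pointer */
} ToyIndexInfo;

/* Hypothetical per-statement state kept by the access method. */
typedef struct ToyInsertCache
{
    int  inserts_seen;
    bool holds_pin;             /* models a non-memory resource */
} ToyInsertCache;

int toy_pins_held = 0;          /* models an externally tracked pin count */

bool
toy_aminsert(ToyIndexInfo *indexInfo)
{
    ToyInsertCache *cache;

    assert(indexInfo != NULL);
    cache = indexInfo->ii_AmCache;
    if (cache == NULL)
    {
        /* First insertion of the statement: set up the cache lazily. */
        cache = calloc(1, sizeof(ToyInsertCache));
        if (cache == NULL)
            abort();
        cache->holds_pin = true;
        toy_pins_held++;
        indexInfo->ii_AmCache = cache;
    }
    cache->inserts_seen++;
    return false;               /* no partial uniqueness check in this toy */
}

void
toy_aminsertcleanup(ToyIndexInfo *indexInfo)
{
    ToyInsertCache *cache = indexInfo->ii_AmCache;

    /* Release only the non-memory resource; memory belongs to its owner. */
    if (cache != NULL && cache->holds_pin)
    {
        toy_pins_held--;
        cache->holds_pin = false;
    }
}
```

The cleanup function deliberately does not free the cache: in the real interface the memory context owns that memory, and the cleanup hook exists for everything else.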
void aminsertcleanup (Relation indexRelation, IndexInfo *indexInfo);
Clean up state that was maintained across successive inserts in
indexInfo->ii_AmCache
. This is useful if the data
requires additional cleanup steps (e.g., releasing pinned buffers), and
simply releasing the memory is not sufficient.
IndexBulkDeleteResult * ambulkdelete (IndexVacuumInfo *info, IndexBulkDeleteResult *stats, IndexBulkDeleteCallback callback, void *callback_state);
Delete tuple(s) from the index. This is a
"
bulk delete
"
operation
that is intended to be implemented by scanning the whole index and checking
each entry to see if it should be deleted.
The passed-in
callback
function must be called, in the style
callback(TID, callback_state) returns bool
,
to determine whether any particular index entry, as identified by its
referenced TID, is to be deleted. The function must return either NULL or a
palloc'd struct containing statistics about the effects of the deletion
operation. It is OK to return NULL if no information needs to be passed on to
amvacuumcleanup
.
Because of limited
maintenance_work_mem
,
ambulkdelete
might need to be called more than once when many
tuples are to be deleted. The
stats
argument is the result
of the previous call for this index (it is NULL for the first call within a
VACUUM
operation). This allows the AM to accumulate statistics
across the whole operation. Typically,
ambulkdelete
will
modify and return the same struct if the passed
stats
is not
null.
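As a rough sketch of the callback protocol and of statistics accumulating across repeated calls, consider the toy model below. All Toy* types and toy_* functions are hypothetical simplifications (a flat array instead of index pages, calloc instead of palloc); only the calling convention follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyTid { unsigned block; unsigned offset; } ToyTid;

/* Same shape as the documented callback: (TID, callback_state) returns bool. */
typedef bool (*ToyBulkDeleteCallback) (const ToyTid *tid, void *state);

typedef struct ToyBulkDeleteResult
{
    double num_index_tuples;    /* entries remaining after this call */
    double tuples_removed;      /* removed across all calls so far */
} ToyBulkDeleteResult;

typedef struct ToyIndex { ToyTid *entries; size_t nentries; } ToyIndex;

ToyBulkDeleteResult *
toy_ambulkdelete(ToyIndex *index, ToyBulkDeleteResult *stats,
                 ToyBulkDeleteCallback callback, void *callback_state)
{
    size_t kept = 0;

    /* First call of the operation: stats is NULL, so allocate it. */
    if (stats == NULL)
    {
        stats = calloc(1, sizeof(ToyBulkDeleteResult));
        if (stats == NULL)
            abort();
    }

    /* Scan the whole index, asking the callback about every entry. */
    for (size_t i = 0; i < index->nentries; i++)
    {
        if (callback(&index->entries[i], callback_state))
            stats->tuples_removed += 1;
        else
            index->entries[kept++] = index->entries[i];
    }
    index->nentries = kept;
    stats->num_index_tuples = (double) kept;
    return stats;               /* same struct is modified and returned */
}

/* Example callback: an entry is dead if it points into the given block. */
bool
toy_dead_block(const ToyTid *tid, void *state)
{
    return tid->block == *(unsigned *) state;
}
```

Passing the previous result back in as stats is what lets the removal count cover the whole operation rather than only the last pass.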
IndexBulkDeleteResult * amvacuumcleanup (IndexVacuumInfo *info, IndexBulkDeleteResult *stats);
Clean up after a
VACUUM
operation (zero or more
ambulkdelete
calls). This does not have to do anything
beyond returning index statistics, but it might perform bulk cleanup
such as reclaiming empty index pages.
stats
is whatever the
last
ambulkdelete
call returned, or NULL if
ambulkdelete
was not called because no tuples needed to be
deleted. If the result is not NULL it must be a palloc'd struct.
The statistics it contains will be used to update
pg_class
,
and will be reported by
VACUUM
if
VERBOSE
is given.
It is OK to return NULL if the index was not changed at all during the
VACUUM
operation, but otherwise correct stats should
be returned.
amvacuumcleanup
will also be called at completion of an
ANALYZE
operation. In this case
stats
is always
NULL and any return value will be ignored. This case can be distinguished
by checking
info->analyze_only
. It is recommended
that the access method do nothing except post-insert cleanup in such a
call, and do even that only when running in an autovacuum worker process.
bool amcanreturn (Relation indexRelation, int attno);
Check whether the index can support
index-only scans
on
the given column, by returning the column's original indexed value.
The attribute number is 1-based, i.e., the first column's attno is 1.
Returns true if supported, else false.
This function should always return true for included columns
(if those are supported), since there's little point in an included
column that can't be retrieved.
If the access method does not support index-only scans at all,
the
amcanreturn
field in its
IndexAmRoutine
struct can be set to NULL.
void amcostestimate (PlannerInfo *root, IndexPath *path, double loop_count, Cost *indexStartupCost, Cost *indexTotalCost, Selectivity *indexSelectivity, double *indexCorrelation, double *indexPages);
Estimate the costs of an index scan. This function is described fully in Section 62.6, below.
bytea * amoptions (ArrayType *reloptions, bool validate);
Parse and validate the reloptions array for an index. This is called only
when a non-null reloptions array exists for the index.
reloptions
is a
text
array containing entries of the
form
name
=
value
.
The function should construct a
bytea
value, which will be copied
into the
rd_options
field of the index's relcache entry.
The data contents of the
bytea
value are open for the access
method to define; most of the standard access methods use struct
StdRdOptions
.
When
validate
is true, the function should report a suitable
error message if any of the options are unrecognized or have invalid
values; when
validate
is false, invalid entries should be
silently ignored. (
validate
is false when loading options
already stored in
pg_catalog
; an invalid entry could only
be found if the access method has changed its rules for options, and in
that case ignoring obsolete entries is appropriate.)
It is OK to return NULL if default behavior is wanted.
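The effect of the validate flag can be illustrated with a simplified parser. This sketch assumes a single hypothetical integer option named fillfactor with a made-up range of 10 to 100 and default of 90; it takes plain C strings instead of an ArrayType, fills a small struct instead of building a bytea, and signals failure with a return code where real code would report an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyOptions { int fillfactor; } ToyOptions;

/*
 * Parse "name=value" entries.  Returns 0 on success, or -1 if validate is
 * true and an entry is unrecognized or out of range.
 */
int
toy_parse_reloptions(const char *const *reloptions, size_t n,
                     bool validate, ToyOptions *out)
{
    static const char name[] = "fillfactor";
    const size_t namelen = sizeof(name) - 1;

    out->fillfactor = 90;       /* hypothetical default */

    for (size_t i = 0; i < n; i++)
    {
        const char *entry = reloptions[i];
        const char *eq = strchr(entry, '=');
        bool        ok = false;

        if (eq != NULL && (size_t) (eq - entry) == namelen &&
            strncmp(entry, name, namelen) == 0)
        {
            char *end;
            long  v = strtol(eq + 1, &end, 10);

            if (end != eq + 1 && *end == '\0' && v >= 10 && v <= 100)
            {
                out->fillfactor = (int) v;
                ok = true;
            }
        }

        if (!ok && validate)
            return -1;          /* new options: reject the bad entry */
        /* Stored options (validate false): silently ignore the entry. */
    }
    return 0;
}
```

With validate false, an obsolete entry left over from an older version of the access method does not prevent the remaining options from being loaded.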
bool amproperty (Oid index_oid, int attno, IndexAMProperty prop, const char *propname, bool *res, bool *isnull);
The
amproperty
method allows index access methods to override
the default behavior of
pg_index_column_has_property
and related functions.
If the access method does not have any special behavior for index property
inquiries, the
amproperty
field in
its
IndexAmRoutine
struct can be set to NULL.
Otherwise, the
amproperty
method will be called with
index_oid
and
attno
both zero for
pg_indexam_has_property
calls,
or with
index_oid
valid and
attno
zero for
pg_index_has_property
calls,
or with
index_oid
valid and
attno
greater than
zero for
pg_index_column_has_property
calls.
prop
is an enum value identifying the property being tested,
while
propname
is the original property name string.
If the core code does not recognize the property name
then
prop
is
AMPROP_UNKNOWN
.
Access methods can define custom property names by
checking
propname
for a match (use
pg_strcasecmp
to match, for consistency with the core code); for names known to the core
code, it's better to inspect
prop
.
If the
amproperty
method returns
true
then
it has determined the property test result: it must set
*res
to the Boolean value to return, or set
*isnull
to
true
to return a NULL. (Both of the referenced variables
are initialized to
false
before the call.)
If the
amproperty
method returns
false
then
the core code will proceed with its normal logic for determining the
property test result.
Access methods that support ordering operators should
implement
AMPROP_DISTANCE_ORDERABLE
property testing, as the
core code does not know how to do that and will return NULL. It may
also be advantageous to implement
AMPROP_RETURNABLE
testing,
if that can be done more cheaply than by opening the index and calling
amcanreturn
, which is the core code's default behavior.
The default behavior should be satisfactory for all other standard
properties.
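The return-value protocol can be sketched with stand-in types. In the example below the enum, the custom property name toy_lossy, the rule that only the first column is distance-orderable, and the local case-insensitive comparison helper (standing in for pg_strcasecmp) are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyAMProperty
{
    TOY_AMPROP_UNKNOWN = 0,     /* core code did not recognize the name */
    TOY_AMPROP_DISTANCE_ORDERABLE,
    TOY_AMPROP_RETURNABLE
} ToyAMProperty;

/* Local case-insensitive comparison, standing in for pg_strcasecmp. */
static int
toy_strcasecmp(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        a++;
        b++;
    }
    return (unsigned char) *a - (unsigned char) *b;
}

bool
toy_amproperty(unsigned index_oid, int attno, ToyAMProperty prop,
               const char *propname, bool *res, bool *isnull)
{
    (void) index_oid;
    (void) isnull;              /* left false: this toy never returns NULL */

    /* Known property: inspect prop rather than the name. */
    if (prop == TOY_AMPROP_DISTANCE_ORDERABLE && attno > 0)
    {
        *res = (attno == 1);    /* hypothetical rule for this toy AM */
        return true;
    }

    /* Custom property: match on the original name string. */
    if (prop == TOY_AMPROP_UNKNOWN &&
        toy_strcasecmp(propname, "toy_lossy") == 0)
    {
        *res = true;
        return true;
    }

    return false;               /* let the core code apply its defaults */
}
```

Returning false for everything the access method has no opinion about keeps the default behavior for all other standard properties.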
char * ambuildphasename (int64 phasenum);
Return the textual name of the given build phase number.
The phase numbers are those reported during an index build via the
pgstat_progress_update_param
interface.
The phase names are then exposed in the
pg_stat_progress_create_index
view.
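A phase-name function is usually little more than a lookup. The sketch below uses invented phase numbers and names, and returning NULL for an unrecognized number is this example's own choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical phase numbers that a toy AM might report during a build. */
enum
{
    TOY_PHASE_INITIALIZE = 1,
    TOY_PHASE_SCAN_TABLE = 2,
    TOY_PHASE_WRITE_PAGES = 3
};

const char *
toy_ambuildphasename(int64_t phasenum)
{
    switch (phasenum)
    {
        case TOY_PHASE_INITIALIZE:
            return "initializing";
        case TOY_PHASE_SCAN_TABLE:
            return "scanning table";
        case TOY_PHASE_WRITE_PAGES:
            return "writing index pages";
        default:
            return NULL;        /* unknown phase number */
    }
}
```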
bool amvalidate (Oid opclassoid);
Validate the catalog entries for the specified operator class, so far as
the access method can reasonably do that. For example, this might include
testing that all required support functions are provided.
The
amvalidate
function must return false if the opclass is
invalid. Problems should be reported with
ereport
messages, typically at
INFO
level.
void amadjustmembers (Oid opfamilyoid, Oid opclassoid, List *operators, List *functions);
Validate proposed new operator and function members of an operator family,
so far as the access method can reasonably do that, and set their
dependency types if the default is not satisfactory. This is called
during
CREATE OPERATOR CLASS
and during
ALTER OPERATOR FAMILY ADD
; in the latter
case
opclassoid
is
InvalidOid
.
The
List
arguments are lists
of
OpFamilyMember
structs, as defined
in
amapi.h
.
Tests done by this function will typically be a subset of those
performed by
amvalidate
,
since
amadjustmembers
cannot assume that it is
seeing a complete set of members. For example, it would be reasonable
to check the signature of a support function, but not to check whether
all required support functions are provided. Any problems can be
reported by throwing an error.
The dependency-related fields of
the
OpFamilyMember
structs are initialized by
the core code to create hard dependencies on the opclass if this
is
CREATE OPERATOR CLASS
, or soft dependencies on the
opfamily if this is
ALTER OPERATOR FAMILY ADD
.
amadjustmembers
can adjust these fields if some other
behavior is more appropriate. For example, GIN, GiST, and SP-GiST
always set operator members to have soft dependencies on the opfamily,
since the connection between an operator and an opclass is relatively
weak in these index types; so it is reasonable to allow operator members
to be added and removed freely. Optional support functions are typically
also given soft dependencies, so that they can be removed if necessary.
The purpose of an index, of course, is to support scans for tuples matching
an indexable
WHERE
condition, often called a
qualifier
or
scan key
. The semantics of
index scanning are described more fully in
Section 62.3
,
below. An index access method can support
"
plain
"
index scans,
"
bitmap
"
index scans, or both. The scan-related functions that an
index access method must or may provide are:
IndexScanDesc ambeginscan (Relation indexRelation, int nkeys, int norderbys);
Prepare for an index scan. The
nkeys
and
norderbys
parameters indicate the number of quals and ordering operators that will be
used in the scan; these may be useful for space allocation purposes.
Note that the actual values of the scan keys aren't provided yet.
The result must be a palloc'd struct.
For implementation reasons the index access method
must
create this struct by calling
RelationGetIndexScan()
. In most cases
ambeginscan
does little beyond making that call and perhaps
acquiring locks;
the interesting parts of index-scan startup are in
amrescan
.
void amrescan (IndexScanDesc scan, ScanKey keys, int nkeys, ScanKey orderbys, int norderbys);
Start or restart an index scan, possibly with new scan keys. (To restart
using previously-passed keys, NULL is passed for
keys
and/or
orderbys
.) Note that it is not allowed for
the number of keys or order-by operators to be larger than
what was passed to
ambeginscan
. In practice the restart
feature is used when a new outer tuple is selected by a nested-loop join
and so a new key comparison value is needed, but the scan key structure
remains the same.
bool amgettuple (IndexScanDesc scan, ScanDirection direction);
Fetch the next tuple in the given scan, moving in the given
direction (forward or backward in the index). Returns true if a tuple was
obtained, false if no matching tuples remain. In the true case the tuple
TID is stored into the
scan
structure. Note that
"
success
"
means only that the index contains an entry that matches
the scan keys, not that the tuple necessarily still exists in the heap or
will pass the caller's snapshot test. On success,
amgettuple
must also set
scan->xs_recheck
to true or false.
False means it is certain that the index entry matches the scan keys.
True means this is not certain, and the conditions represented by the
scan keys must be rechecked against the heap tuple after fetching it.
This provision supports
"
lossy
"
index operators.
Note that rechecking will extend only to the scan conditions; a partial
index predicate (if any) is never rechecked by
amgettuple
callers.
If the index supports
index-only
scans
(i.e.,
amcanreturn
returns true for any
of its columns),
then on success the AM must also check
scan->xs_want_itup
,
and if that is true it must return the originally indexed data for the
index entry. Columns for which
amcanreturn
returns
false can be returned as nulls.
The data can be returned in the form of an
IndexTuple
pointer stored at
scan->xs_itup
,
with tuple descriptor
scan->xs_itupdesc
; or in the form of
a
HeapTuple
pointer stored at
scan->xs_hitup
,
with tuple descriptor
scan->xs_hitupdesc
. (The latter
format should be used when reconstructing data that might possibly not fit
into an
IndexTuple
.) In either case,
management of the data referenced by the pointer is the access method's
responsibility. The data must remain good at least until the next
amgettuple
,
amrescan
, or
amendscan
call for the scan.
The
amgettuple
function need only be provided if the access
method supports
"
plain
"
index scans. If it doesn't, the
amgettuple
field in its
IndexAmRoutine
struct must be set to NULL.
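The following toy model illustrates the success, direction, and recheck conventions. It assumes a hypothetical lossy index that stores only key / 10 for non-negative integer keys, so a match on the stored bucket cannot prove that the heap tuple satisfies an equality scan key. ToyScan and its fields are stand-ins, not the real IndexScanDesc.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyEntry { int bucket; unsigned heap_tid; } ToyEntry;

typedef struct ToyScan
{
    const ToyEntry *entries;
    int       nentries;
    int       pos;              /* last position visited; -1 before start */
    int       key;              /* equality scan key, assumed non-negative */
    unsigned  xs_heaptid;       /* output: TID of the matching entry */
    bool      xs_recheck;       /* output: must the caller recheck? */
} ToyScan;

/* direction is +1 for forward, -1 for backward. */
bool
toy_amgettuple(ToyScan *scan, int direction)
{
    int i;

    assert(direction == 1 || direction == -1);

    for (i = scan->pos + direction; i >= 0 && i < scan->nentries; i += direction)
    {
        if (scan->entries[i].bucket == scan->key / 10)
        {
            scan->pos = i;
            scan->xs_heaptid = scan->entries[i].heap_tid;
            /* The bucket matched, but the exact key was not stored. */
            scan->xs_recheck = true;
            return true;
        }
    }
    scan->pos = i;              /* parked just past the end reached */
    return false;               /* no matching entries remain */
}
```

A true result here says nothing about whether the heap tuple is still visible; the caller remains responsible for both the snapshot test and the recheck.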
int64 amgetbitmap (IndexScanDesc scan, TIDBitmap *tbm);
Fetch all tuples in the given scan and add them to the caller-supplied
TIDBitmap
(that is, OR the set of tuple IDs into whatever set is already
in the bitmap). The number of tuples fetched is returned (this might be
just an approximate count, for instance some AMs do not detect duplicates).
While inserting tuple IDs into the bitmap,
amgetbitmap
can
indicate that rechecking of the scan conditions is required for specific
tuple IDs. This is analogous to the
xs_recheck
output parameter
of
amgettuple
. Note: in the current implementation, support
for this feature is conflated with support for lossy storage of the bitmap
itself, and therefore callers recheck both the scan conditions and the
partial index predicate (if any) for recheckable tuples. That might not
always be true, however.
amgetbitmap
and
amgettuple
cannot be used in the same index scan; there
are other restrictions too when using
amgetbitmap
, as explained
in
Section 62.3
.
The
amgetbitmap
function need only be provided if the access
method supports
"
bitmap
"
index scans. If it doesn't, the
amgetbitmap
field in its
IndexAmRoutine
struct must be set to NULL.
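The OR-into-the-bitmap behavior and the approximate return count can be shown with a deliberately tiny model. Here a 64-bit word stands in for the TIDBitmap and small integers stand in for tuple IDs; both are simplifications assumed for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add every matching toy TID (each assumed to be < 64) to the caller's
 * bitmap, preserving whatever bits are already set.  The returned count is
 * approximate: duplicates among the matches are counted more than once.
 */
int64_t
toy_amgetbitmap(const unsigned *matching_tids, size_t n, uint64_t *tbm)
{
    int64_t ntids = 0;

    for (size_t i = 0; i < n; i++)
    {
        assert(matching_tids[i] < 64);
        *tbm |= UINT64_C(1) << matching_tids[i];
        ntids++;
    }
    return ntids;
}
```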
void amendscan (IndexScanDesc scan);
End a scan and release resources. The
scan
struct itself
should not be freed, but any locks or pins taken internally by the
access method must be released, as well as any other memory allocated
by
ambeginscan
and other scan-related functions.
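The division of labor among ambeginscan, amrescan, and amendscan can be sketched as below. ToyScanKey and ToyScanDesc are hypothetical stand-ins, ordering operators are omitted, and calloc replaces both palloc and the required RelationGetIndexScan() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyScanKey { int attno; int argument; } ToyScanKey;

typedef struct ToyScanDesc
{
    int         numberOfKeys;   /* fixed when the scan is begun */
    ToyScanKey *keyData;        /* space reserved by toy_ambeginscan */
    int         pos;            /* -1 means "not started" */
} ToyScanDesc;

/* Only the number of keys is known here, so just reserve space. */
ToyScanDesc *
toy_ambeginscan(int nkeys)
{
    ToyScanDesc *scan = calloc(1, sizeof(ToyScanDesc));

    if (scan == NULL)
        abort();
    scan->numberOfKeys = nkeys;
    scan->keyData = calloc(nkeys > 0 ? (size_t) nkeys : 1, sizeof(ToyScanKey));
    if (scan->keyData == NULL)
        abort();
    scan->pos = -1;
    return scan;
}

/* Install new key values if given; NULL means reuse the previous keys. */
void
toy_amrescan(ToyScanDesc *scan, const ToyScanKey *keys, int nkeys)
{
    assert(nkeys <= scan->numberOfKeys);
    if (keys != NULL && nkeys > 0)
        memcpy(scan->keyData, keys, (size_t) nkeys * sizeof(ToyScanKey));
    scan->pos = -1;             /* restart from the beginning */
}

/* Release what the AM allocated, but not the scan struct itself. */
void
toy_amendscan(ToyScanDesc *scan)
{
    free(scan->keyData);
    scan->keyData = NULL;
}
```

In a nested-loop join the typical sequence is one toy_ambeginscan, then one toy_amrescan per outer tuple with a new comparison value, then one toy_amendscan.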
void ammarkpos (IndexScanDesc scan);
Mark current scan position. The access method need only support one remembered scan position per scan.
The
ammarkpos
function need only be provided if the access
method supports ordered scans. If it doesn't,
the
ammarkpos
field in its
IndexAmRoutine
struct may be set to NULL.
void amrestrpos (IndexScanDesc scan);
Restore the scan to the most recently marked position.
The
amrestrpos
function need only be provided if the access
method supports ordered scans. If it doesn't,
the
amrestrpos
field in its
IndexAmRoutine
struct may be set to NULL.
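Because only one remembered position per scan is required, mark and restore can be as small as the following sketch. ToyCursor is a hypothetical stand-in for the access method's private scan state.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyCursor
{
    int  pos;                   /* current scan position */
    int  marked_pos;            /* the single remembered position */
    bool has_mark;
} ToyCursor;

/* Remember the current position, replacing any earlier mark. */
void
toy_ammarkpos(ToyCursor *cursor)
{
    cursor->marked_pos = cursor->pos;
    cursor->has_mark = true;
}

/* Return to the most recently marked position. */
void
toy_amrestrpos(ToyCursor *cursor)
{
    assert(cursor->has_mark);
    cursor->pos = cursor->marked_pos;
}
```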
In addition to supporting ordinary index scans, some types of index may wish to support parallel index scans, which allow multiple backends to cooperate in performing an index scan. The index access method should arrange things so that each cooperating process returns a subset of the tuples that would be returned by an ordinary, non-parallel index scan, but in such a way that the union of those subsets is equal to the set of tuples that would be returned by an ordinary, non-parallel index scan. Furthermore, while there need not be any global ordering of tuples returned by a parallel scan, the ordering of the subset of tuples returned within each cooperating backend must match the requested ordering. The following functions may be implemented to support parallel index scans:
Size amestimateparallelscan (int nkeys, int norderbys);
Estimate and return the number of bytes of dynamic shared memory that
the access method will need to perform a parallel scan. (This number
is in addition to, not in lieu of, the amount of space needed for
AM-independent data in
ParallelIndexScanDescData
.)
The
nkeys
and
norderbys
parameters indicate the number of quals and ordering operators that will be
used in the scan; the same values will be passed to
amrescan
.
Note that the actual values of the scan keys aren't provided yet.
It is not necessary to implement this function for access methods which do not support parallel scans or for which the number of additional bytes of storage required is zero.
void aminitparallelscan (void *target);
This function will be called to initialize dynamic shared memory at the
beginning of a parallel scan.
target
will point to at least
the number of bytes previously returned by
amestimateparallelscan
, and this function may use that
amount of space to store whatever data it wishes.
It is not necessary to implement this function for access methods which do not support parallel scans or in cases where the shared memory space required needs no initialization.
void amparallelrescan (IndexScanDesc scan);
This function, if implemented, will be called when a parallel index scan
must be restarted. It should reset any shared state set up by
aminitparallelscan
such that the scan will be restarted from
the beginning.
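One way to picture how these three functions fit together is a shared cursor from which each cooperating process claims the next unit of work. The sketch below is a single-process simulation under stated assumptions: ToyParallelScan, toy_claim_block, and the block-at-a-time scheme are hypothetical, and the unsynchronized counter would need a lock or atomic operation in real shared memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* AM-specific state that would live in dynamic shared memory. */
typedef struct ToyParallelScan
{
    unsigned next_block;        /* next unclaimed block of the index */
} ToyParallelScan;

/* Space needed in addition to the AM-independent descriptor. */
size_t
toy_amestimateparallelscan(int nkeys, int norderbys)
{
    (void) nkeys;
    (void) norderbys;
    return sizeof(ToyParallelScan);
}

/* target points to at least the number of bytes estimated above. */
void
toy_aminitparallelscan(void *target)
{
    ToyParallelScan *shared = target;

    shared->next_block = 0;
}

/*
 * Called by each cooperating process to claim work.  Every block is handed
 * out exactly once, so the per-process subsets are disjoint and their union
 * covers the whole index; each process sees its blocks in ascending order.
 */
bool
toy_claim_block(ToyParallelScan *shared, unsigned nblocks, unsigned *block)
{
    if (shared->next_block >= nblocks)
        return false;
    *block = shared->next_block++;
    return true;
}

/* Reset shared state so that the scan restarts from the beginning. */
void
toy_amparallelrescan(ToyParallelScan *shared)
{
    shared->next_block = 0;
}
```

Handing out blocks in index order is what keeps the per-process ordering intact even though no global ordering across processes is guaranteed.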