F.22. ltree
This module implements a data type
ltree
for representing
labels of data stored in a hierarchical tree-like structure.
Extensive facilities for searching through label trees are provided.
F.22.1. Definitions
A
label
is a sequence of alphanumeric characters
and underscores (for example, in C locale the characters
A-Za-z0-9_
are allowed). Labels must be less than 256 bytes
long.
Examples:
42
,
Personal_Services
A
label path
is a sequence of zero or more
labels separated by dots, for example
L1.L2.L3
, representing
a path from the root of a hierarchical tree to a particular node. The
length of a label path must be less than 65kB, but keeping it under 2kB is
preferable. In practice this is not a major limitation; for example,
the longest label path in the DMOZ catalog (
http://www.dmoz.org
) is about 240 bytes.
Example:
Top.Countries.Europe.Russia
The
ltree
module provides several data types:
-
ltree
stores a label path. -
lquery
represents a regular-expression-like pattern for matchingltree
values. A simple word matches that label within a path. A star symbol (*
) matches zero or more labels. For example:foo Match the exact label path
foo
*.foo.* Match any label path containing the labelfoo
*.foo Match any label path whose last label isfoo
Star symbols can also be quantified to restrict how many labels they can match:
*{
n
} Match exactlyn
labels *{n
,} Match at leastn
labels *{n
,m
} Match at leastn
but not more thanm
labels *{,m
} Match at mostm
labels - same as *{0,m
}There are several modifiers that can be put at the end of a non-star label in
lquery
to make it match more than just the exact match:@ Match case-insensitively, for example
a@
matchesA
* Match any label with this prefix, for examplefoo*
matchesfoobar
% Match initial underscore-separated wordsThe behavior of
%
is a bit complicated. It tries to match words rather than the entire label. For examplefoo_bar%
matchesfoo_bar_baz
but notfoo_barbaz
. If combined with*
, prefix matching applies to each word separately, for examplefoo_bar%*
matchesfoo1_bar2_baz
but notfoo1_br2_baz
.Also, you can write several possibly-modified labels separated with
|
(OR) to match any of those labels, and you can put!
(NOT) at the start to match any label that doesn't match any of the alternatives.Here's an annotated example of
lquery
:Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain a. b. c. d. e.
This query will match any label path that:
-
begins with the label
Top
-
and next has zero to two labels before
-
a label beginning with the case-insensitive prefix
sport
-
then a label not matching
football
nortennis
-
and then ends with a label beginning with
Russ
or exactly matchingSpain
.
-
-
ltxtquery
represents a full-text-search-like pattern for matchingltree
values. Anltxtquery
value contains words, possibly with the modifiers@
,*
,%
at the end; the modifiers have the same meanings as inlquery
. Words can be combined with&
(AND),|
(OR),!
(NOT), and parentheses. The key difference fromlquery
is thatltxtquery
matches words without regard to their position in the label path.Here's an example
ltxtquery
:Europe & Russia*@ & !Transportation
This will match paths that contain the label
Europe
and any label beginning withRussia
(case-insensitive), but not paths containing the labelTransportation
. The location of these words within the path is not important. Also, when%
is used, the word can be matched to any underscore-separated word within a label, regardless of position.
Note:
ltxtquery
allows whitespace between symbols, but
ltree
and
lquery
do not.
F.22.2. Operators and Functions
Type
ltree
has the usual comparison operators
=
,
<>
,
<
,
>
,
<=
,
>=
.
Comparison sorts in the order of a tree traversal, with the children
of a node sorted by label text. In addition, the specialized
operators shown in
Table F.14
are available.
Table F.14.
ltree
Operators
Operator | Returns | Description |
---|---|---|
ltree
@>
ltree
|
boolean
|
is left argument an ancestor of right (or equal)? |
ltree
<@
ltree
|
boolean
|
is left argument a descendant of right (or equal)? |
ltree
~
lquery
|
boolean
|
does
ltree
match
lquery
?
|
lquery
~
ltree
|
boolean
|
does
ltree
match
lquery
?
|
ltree
?
lquery[]
|
boolean
|
does
ltree
match any
lquery
in array?
|
lquery[]
?
ltree
|
boolean
|
does
ltree
match any
lquery
in array?
|
ltree
@
ltxtquery
|
boolean
|
does
ltree
match
ltxtquery
?
|
ltxtquery
@
ltree
|
boolean
|
does
ltree
match
ltxtquery
?
|
ltree
||
ltree
|
ltree
|
concatenate
ltree
paths
|
ltree
||
text
|
ltree
|
convert text to
ltree
and concatenate
|
text
||
ltree
|
ltree
|
convert text to
ltree
and concatenate
|
ltree[]
@>
ltree
|
boolean
|
does array contain an ancestor of
ltree
?
|
ltree
<@
ltree[]
|
boolean
|
does array contain an ancestor of
ltree
?
|
ltree[]
<@
ltree
|
boolean
|
does array contain a descendant of
ltree
?
|
ltree
@>
ltree[]
|
boolean
|
does array contain a descendant of
ltree
?
|
ltree[]
~
lquery
|
boolean
|
does array contain any path matching
lquery
?
|
lquery
~
ltree[]
|
boolean
|
does array contain any path matching
lquery
?
|
ltree[]
?
lquery[]
|
boolean
|
does
ltree
array contain any path matching any
lquery
?
|
lquery[]
?
ltree[]
|
boolean
|
does
ltree
array contain any path matching any
lquery
?
|
ltree[]
@
ltxtquery
|
boolean
|
does array contain any path matching
ltxtquery
?
|
ltxtquery
@
ltree[]
|
boolean
|
does array contain any path matching
ltxtquery
?
|
ltree[]
?@>
ltree
|
ltree
|
first array entry that is an ancestor of
ltree
; NULL if none
|
ltree[]
?<@
ltree
|
ltree
|
first array entry that is a descendant of
ltree
; NULL if none
|
ltree[]
?~
lquery
|
ltree
|
first array entry that matches
lquery
; NULL if none
|
ltree[]
?@
ltxtquery
|
ltree
|
first array entry that matches
ltxtquery
; NULL if none
|
The operators
<@
,
@>
,
@
and
~
have analogues
^<@
,
^@>
,
^@
,
^~
, which are the same except they do not use
indexes. These are useful only for testing purposes.
The available functions are shown in Table F.15 .
Table F.15.
ltree
Functions
F.22.3. Indexes
ltree
supports several types of indexes that can speed
up the indicated operators:
-
B-tree index over
ltree
:<
,<=
,=
,>=
,>
-
GiST index over
ltree
:<
,<=
,=
,>=
,>
,@>
,<@
,@
,~
,?
Example of creating such an index:
CREATE INDEX path_gist_idx ON test USING GIST (path);
-
GiST index over
ltree[]
:ltree[] <@ ltree
,ltree @> ltree[]
,@
,~
,?
Example of creating such an index:
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
Note: This index type is lossy.
F.22.4. Example
This example uses the following data (also available in file
contrib/ltree/ltreetest.sql
in the source distribution):
CREATE TABLE test (path ltree); INSERT INTO test VALUES ('Top'); INSERT INTO test VALUES ('Top.Science'); INSERT INTO test VALUES ('Top.Science.Astronomy'); INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics'); INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology'); INSERT INTO test VALUES ('Top.Hobbies'); INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy'); INSERT INTO test VALUES ('Top.Collections'); INSERT INTO test VALUES ('Top.Collections.Pictures'); INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy'); INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars'); INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies'); INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts'); CREATE INDEX path_gist_idx ON test USING GIST (path); CREATE INDEX path_idx ON test USING BTREE (path);
Now, we have a table
test
populated with data describing
the hierarchy shown below:
Top / | \ Science Hobbies Collections / | \ Astronomy Amateurs_Astronomy Pictures / \ | Astrophysics Cosmology Astronomy / | \ Galaxies Stars Astronauts
We can do inheritance:
ltreetest=> SELECT path FROM test WHERE path <@ 'Top.Science'; path ------------------------------------ Top.Science Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology (4 rows)
Here are some examples of path matching:
ltreetest=> SELECT path FROM test WHERE path ~ '*.Astronomy.*'; path ----------------------------------------------- Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology Top.Collections.Pictures.Astronomy Top.Collections.Pictures.Astronomy.Stars Top.Collections.Pictures.Astronomy.Galaxies Top.Collections.Pictures.Astronomy.Astronauts (7 rows) ltreetest=> SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*'; path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology (3 rows)
Here are some examples of full text search:
ltreetest=> SELECT path FROM test WHERE path @ 'Astro*% & !pictures@'; path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology Top.Hobbies.Amateurs_Astronomy (4 rows) ltreetest=> SELECT path FROM test WHERE path @ 'Astro* & !pictures@'; path ------------------------------------ Top.Science.Astronomy Top.Science.Astronomy.Astrophysics Top.Science.Astronomy.Cosmology (3 rows)
Path construction using functions:
ltreetest=> SELECT subpath(path,0,2)||'Space'||subpath(path,2) FROM test WHERE path <@ 'Top.Science.Astronomy'; ?column? ------------------------------------------ Top.Science.Space.Astronomy Top.Science.Space.Astronomy.Astrophysics Top.Science.Space.Astronomy.Cosmology (3 rows)
We could simplify this by creating a SQL function that inserts a label at a specified position in a path:
CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);' LANGUAGE SQL IMMUTABLE; ltreetest=> SELECT ins_label(path,2,'Space') FROM test WHERE path <@ 'Top.Science.Astronomy'; ins_label ------------------------------------------ Top.Science.Space.Astronomy Top.Science.Space.Astronomy.Astrophysics Top.Science.Space.Astronomy.Cosmology (3 rows)
F.22.5. Transforms
Additional extensions are available that implement transforms for
the
ltree
type for PL/Python. The extensions are
called
ltree_plpythonu
,
ltree_plpython2u
,
and
ltree_plpython3u
(see
Section 45.1
for the PL/Python naming
convention). If you install these transforms and specify them when
creating a function,
ltree
values are mapped to Python lists.
(The reverse is currently not supported, however.)
F.22.6. Authors
All work was done by Teodor Sigaev (
<
teodor@stack.net
>
) and
Oleg Bartunov (
<
oleg@sai.msu.su
>
). See
http://www.sai.msu.su/~megera/postgres/gist/
for
additional information. Authors would like to thank Eugeny Rodichev for
helpful discussions. Comments and bug reports are welcome.