7.2. Table Expressions
  A
  
   table expression
  
  computes a table.  The
   table expression contains a
  
   FROM
  
  clause that is
   optionally followed by
  
   WHERE
  
  ,
  
   GROUP BY
  
  , and
  
   HAVING
  
  clauses.  Trivial table expressions simply refer
   to a table on disk, a so-called base table, but more complex
   expressions can be used to modify or combine base tables in various
   ways.
 
  The optional
  
   WHERE
  
  ,
  
   GROUP BY
  
  , and
  
   HAVING
  
  clauses in the table expression specify a
   pipeline of successive transformations performed on the table
   derived in the
  
   FROM
  
  clause.  All these transformations
   produce a virtual table that provides the rows that are passed to
   the select list to compute the output rows of the query.
 
      7.2.1. The
      
       FROM
      
      Clause
     
    
   The
   
    
     FROM
    
   
   clause derives a
    table from one or more other tables given in a comma-separated
    table reference list.
  
FROMtable_reference[,table_reference[, ...]]
   A table reference can be a table name (possibly schema-qualified),
    or a derived table such as a subquery, a
   
    JOIN
   
   construct, or
    complex combinations of these.  If more than one table reference is
    listed in the
   
    FROM
   
   clause, the tables are cross-joined
    (that is, the Cartesian product of their rows is formed; see below).
    The result of the
   
    FROM
   
   list is an intermediate virtual
    table that can then be subject to
    transformations by the
   
    WHERE
   
   ,
   
    GROUP BY
   
   ,
    and
   
    HAVING
   
   clauses and is finally the result of the
    overall table expression.
  
   When a table reference names a table that is the parent of a
    table inheritance hierarchy, the table reference produces rows of
    not only that table but all of its descendant tables, unless the
    key word
   
    ONLY
   
   precedes the table name.  However, the
    reference produces only the columns that appear in the named table
    - any columns added in subtables are ignored.
  
   Instead of writing
   
    ONLY
   
   before the table name, you can write
   
    *
   
   after the table name to explicitly specify that descendant
    tables are included.  There is no real reason to use this syntax any more,
    because searching descendant tables is now always the default behavior.
    However, it is supported for compatibility with older releases.
  
7.2.1.1. Joined Tables
A joined table is a table derived from two other (real or derived) tables according to the rules of the particular join type. Inner, outer, and cross-joins are available. The general syntax of a joined table is
T1join_typeT2[join_condition]
    Joins of all types can be chained together, or nested: either or
     both
    
     
      T1
     
    
    and
    
     
      T2
     
    
    can be joined tables.  Parentheses
     can be used around
    
     JOIN
    
    clauses to control the join
     order.  In the absence of parentheses,
    
     JOIN
    
    clauses
     nest left-to-right.
   
Join Types
- Cross join
- 
      T1CROSS JOINT2For every possible combination of rows from T1andT2(i.e., a Cartesian product), the joined table will contain a row consisting of all columns inT1followed by all columns inT2. If the tables have N and M rows respectively, the joined table will have N * M rows.FROMis equivalent toT1CROSS JOINT2FROM(see below). It is also equivalent toT1INNER JOINT2ON TRUEFROM.T1,T2NoteThis latter equivalence does not hold exactly when more than two tables appear, because JOINbinds more tightly than comma. For exampleFROMis not the same asT1CROSS JOINT2INNER JOINT3ONconditionFROMbecause theT1,T2INNER JOINT3ONconditionconditioncan referenceT1in the first case but not the second.
- Qualified joins
- 
      T1{ [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOINT2ONboolean_expressionT1{ [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOINT2USING (join column list)T1NATURAL { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOINT2The words INNERandOUTERare optional in all forms.INNERis the default;LEFT,RIGHT, andFULLimply an outer join.The join condition is specified in the ONorUSINGclause, or implicitly by the wordNATURAL. The join condition determines which rows from the two source tables are considered to " match " , as explained in detail below.The possible types of qualified join are: - 
         
          INNER JOIN
- 
         For each row R1 of T1, the joined table has a row for each row in T2 that satisfies the join condition with R1. 
- 
         
          LEFT OUTER JOIN
- 
         First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Thus, the joined table always has at least one row for each row in T1. 
- 
         
          RIGHT OUTER JOIN
- 
         First, an inner join is performed. Then, for each row in T2 that does not satisfy the join condition with any row in T1, a joined row is added with null values in columns of T1. This is the converse of a left join: the result table will always have a row for each row in T2. 
- 
         
          FULL OUTER JOIN
- 
         First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Also, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added. 
 The ONclause is the most general kind of join condition: it takes a Boolean value expression of the same kind as is used in aWHEREclause. A pair of rows fromT1andT2match if theONexpression evaluates to true.The USINGclause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joiningT1andT2withUSING (a, b)produces the join conditionON.T1.a =T2.a ANDT1.b =T2.bFurthermore, the output of JOIN USINGsuppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. WhileJOIN ONproduces all columns fromT1followed by all columns fromT2,JOIN USINGproduces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns fromT1, followed by any remaining columns fromT2.Finally, NATURALis a shorthand form ofUSING: it forms aUSINGlist consisting of all column names that appear in both input tables. As withUSING, these columns appear only once in the output table. If there are no common column names,NATURAL JOINbehaves likeJOIN ... ON TRUE, producing a cross-product join.NoteUSINGis reasonably safe from column changes in the joined relations since only the listed columns are combined.NATURALis considerably more risky since any schema changes to either relation that cause a new matching column name to be present will cause the join to combine that new column as well.
- 
         
          
    To put this together, assume we have tables
    
     t1
    
    :
   
num | name -----+------ 1 | a 2 | b 3 | c
    and
    
     t2
    
    :
   
num | value -----+------- 1 | xxx 3 | yyy 5 | zzz
then we get the following results for the various joins:
=>SELECT * FROM t1 CROSS JOIN t2;num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 1 | a | 3 | yyy 1 | a | 5 | zzz 2 | b | 1 | xxx 2 | b | 3 | yyy 2 | b | 5 | zzz 3 | c | 1 | xxx 3 | c | 3 | yyy 3 | c | 5 | zzz (9 rows)=>SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 3 | c | 3 | yyy (2 rows)=>SELECT * FROM t1 INNER JOIN t2 USING (num);num | name | value -----+------+------- 1 | a | xxx 3 | c | yyy (2 rows)=>SELECT * FROM t1 NATURAL INNER JOIN t2;num | name | value -----+------+------- 1 | a | xxx 3 | c | yyy (2 rows)=>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 2 | b | | 3 | c | 3 | yyy (3 rows)=>SELECT * FROM t1 LEFT JOIN t2 USING (num);num | name | value -----+------+------- 1 | a | xxx 2 | b | 3 | c | yyy (3 rows)=>SELECT * FROM t1 RIGHT JOIN t2 ON t1.num = t2.num;num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 3 | c | 3 | yyy | | 5 | zzz (3 rows)=>SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num;num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 2 | b | | 3 | c | 3 | yyy | | 5 | zzz (4 rows)
    The join condition specified with
    
     ON
    
    can also contain
     conditions that do not relate directly to the join.  This can
     prove useful for some queries but needs to be thought out
     carefully.  For example:
   
=>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx 2 | b | | 3 | c | | (3 rows)
    Notice that placing the restriction in the
    
     WHERE
    
    clause
     produces a different result:
   
=>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num WHERE t2.value = 'xxx';num | name | num | value -----+------+-----+------- 1 | a | 1 | xxx (1 row)
    This is because a restriction placed in the
    
     ON
    
    clause is processed
    
     
      before
     
    
    the join, while
     a restriction placed in the
    
     WHERE
    
    clause is processed
    
     
      after
     
    
    the join.
     That does not matter with inner joins, but it matters a lot with outer
     joins.
   
7.2.1.2. Table and Column Aliases
A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias .
To create a table alias, write
FROMtable_referenceASalias
or
FROMtable_referencealias
    The
    
     AS
    
    key word is optional noise.
    
     
      alias
     
    
    can be any identifier.
   
A typical application of table aliases is to assign short identifiers to long table names to keep the join clauses readable. For example:
SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.id = a.num;
The alias becomes the new name of the table reference so far as the current query is concerned - it is not allowed to refer to the table by the original name elsewhere in the query. Thus, this is not valid:
SELECT * FROM my_table AS m WHERE my_table.a > 5; -- wrong
Table aliases are mainly for notational convenience, but it is necessary to use them when joining a table to itself, e.g.:
SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
Additionally, an alias is required if the table reference is a subquery (see Section 7.2.1.3 ).
    Parentheses are used to resolve ambiguities.  In the following example,
     the first statement assigns the alias
    
     b
    
    to the second
     instance of
    
     my_table
    
    , but the second statement assigns the
     alias to the result of the join:
   
SELECT * FROM my_table AS a CROSS JOIN my_table AS b ... SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
Another form of table aliasing gives temporary names to the columns of the table, as well as the table itself:
FROMtable_reference[AS]alias(column1[,column2[, ...]] )
If fewer column aliases are specified than the actual table has columns, the remaining columns are not renamed. This syntax is especially useful for self-joins or subqueries.
    When an alias is applied to the output of a
    
     JOIN
    
    clause, the alias hides the original
     name(s) within the
    
     JOIN
    
    .  For example:
   
SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
is valid SQL, but:
SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
    is not valid; the table alias
    
     a
    
    is not visible
     outside the alias
    
     c
    
    .
   
7.2.1.3. Subqueries
Subqueries specifying a derived table must be enclosed in parentheses and must be assigned a table alias name (as in Section 7.2.1.2 ). For example:
FROM (SELECT * FROM table1) AS alias_name
    This example is equivalent to
    
     FROM table1 AS
     alias_name
    
    .  More interesting cases, which cannot be
     reduced to a plain join, arise when the subquery involves
     grouping or aggregation.
   
    A subquery can also be a
    
     VALUES
    
    list:
   
FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow'))
     AS names(first, last)
   
    Again, a table alias is required.  Assigning alias names to the columns
     of the
    
     VALUES
    
    list is optional, but is good practice.
     For more information see
    
     Section 7.7
    
    .
   
7.2.1.4. Table Functions
    Table functions are functions that produce a set of rows, made up
     of either base data types (scalar types) or composite data types
     (table rows).  They are used like a table, view, or subquery in
     the
    
     FROM
    
    clause of a query. Columns returned by table
     functions can be included in
    
     SELECT
    
    ,
    
     JOIN
    
    , or
    
     WHERE
    
    clauses in the same manner
     as columns of a table, view, or subquery.
   
    Table functions may also be combined using the
    
     ROWS FROM
    
    syntax, with the results returned in parallel columns; the number of
     result rows in this case is that of the largest function result, with
     smaller results padded with null values to match.
   
function_call[WITH ORDINALITY] [[AS]table_alias[(column_alias[, ... ])]] ROWS FROM(function_call[, ... ] ) [WITH ORDINALITY] [[AS]table_alias[(column_alias[, ... ])]]
    If the
    
     WITH ORDINALITY
    
    clause is specified, an
     additional column of type
    
     bigint
    
    will be added to the
     function result columns.  This column numbers the rows of the function
     result set, starting from 1. (This is a generalization of the
     SQL-standard syntax for
    
     UNNEST ... WITH ORDINALITY
    
    .)
     By default, the ordinal column is called
    
     ordinality
    
    , but
     a different column name can be assigned to it using
     an
    
     AS
    
    clause.
   
    The special table function
    
     UNNEST
    
    may be called with
     any number of array parameters, and it returns a corresponding number of
     columns, as if
    
     UNNEST
    
    (
    
     Section 9.19
    
    ) had been called on each parameter
     separately and combined using the
    
     ROWS FROM
    
    construct.
   
UNNEST(array_expression[, ... ] ) [WITH ORDINALITY] [[AS]table_alias[(column_alias[, ... ])]]
    If no
    
     
      table_alias
     
    
    is specified, the function
     name is used as the table name; in the case of a
    
     ROWS FROM()
    
    construct, the first function's name is used.
   
If column aliases are not supplied, then for a function returning a base data type, the column name is also the same as the function name. For a function returning a composite type, the result columns get the names of the individual attributes of the type.
Some examples:
CREATE TABLE foo (fooid int, foosubid int, fooname text);
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT * FROM getfoo(1) AS t1;
SELECT * FROM foo
    WHERE foosubid IN (
                        SELECT foosubid
                        FROM getfoo(foo.fooid) z
                        WHERE z.fooid = foo.fooid
                      );
CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
SELECT * FROM vw_getfoo;
   
    In some cases it is useful to define table functions that can
     return different column sets depending on how they are invoked.
     To support this, the table function can be declared as returning
     the pseudo-type
    
     record
    
    with no
    
     OUT
    
    parameters.  When such a function is used in
     a query, the expected row structure must be specified in the
     query itself, so that the system can know how to parse and plan
     the query.  This syntax looks like:
   
function_call[AS]alias(column_definition[, ... ])function_callAS [alias] (column_definition[, ... ]) ROWS FROM( ...function_callAS (column_definition[, ... ]) [, ... ] )
    When not using the
    
     ROWS FROM()
    
    syntax,
     the
    
     
      column_definition
     
    
    list replaces the column
     alias list that could otherwise be attached to the
    
     FROM
    
    item; the names in the column definitions serve as column aliases.
     When using the
    
     ROWS FROM()
    
    syntax,
     a
    
     
      column_definition
     
    
    list can be attached to
     each member function separately; or if there is only one member function
     and no
    
     WITH ORDINALITY
    
    clause,
     a
    
     
      column_definition
     
    
    list can be written in
     place of a column alias list following
    
     ROWS FROM()
    
    .
   
Consider this example:
SELECT *
    FROM dblink('dbname=mydb', 'SELECT proname, prosrc FROM pg_proc')
      AS t1(proname name, prosrc text)
    WHERE proname LIKE 'bytea%';
   
    The
    
     
      dblink
     
    
    function
     (part of the
    
     dblink
    
    module) executes
     a remote query.  It is declared to return
    
     record
    
    since it might be used for any kind of query.
     The actual column set must be specified in the calling query so
     that the parser knows, for example, what
    
     *
    
    should
     expand to.
   
    This example uses
    
     ROWS FROM
    
    :
   
SELECT *
FROM ROWS FROM
    (
        json_to_recordset('[{"a":40,"b":"foo"},{"a":"100","b":"bar"}]')
            AS (a INTEGER, b TEXT),
        generate_series(1, 3)
    ) AS x (p, q, s)
ORDER BY p;
  p  |  q  | s
-----+-----+---
  40 | foo | 1
 100 | bar | 2
     |     | 3
   
    It joins two functions into a single
    
     FROM
    
    target.
    
     json_to_recordset()
    
    is instructed
     to return two columns, the first
    
     integer
    
    and the second
    
     text
    
    .  The result of
    
     generate_series()
    
    is used directly.
     The
    
     ORDER BY
    
    clause sorts the column values
     as integers.
   
       7.2.1.5. 
       
        LATERAL
       
       Subqueries
      
     
    Subqueries appearing in
    
     FROM
    
    can be
     preceded by the key word
    
     LATERAL
    
    .  This allows them to
     reference columns provided by preceding
    
     FROM
    
    items.
     (Without
    
     LATERAL
    
    , each subquery is
     evaluated independently and so cannot cross-reference any other
    
     FROM
    
    item.)
   
    Table functions appearing in
    
     FROM
    
    can also be
     preceded by the key word
    
     LATERAL
    
    , but for functions the
     key word is optional; the function's arguments can contain references
     to columns provided by preceding
    
     FROM
    
    items in any case.
   
    A
    
     LATERAL
    
    item can appear at top level in the
    
     FROM
    
    list, or within a
    
     JOIN
    
    tree.  In the latter
     case it can also refer to any items that are on the left-hand side of a
    
     JOIN
    
    that it is on the right-hand side of.
   
    When a
    
     FROM
    
    item contains
    
     LATERAL
    
    cross-references, evaluation proceeds as follows: for each row of the
    
     FROM
    
    item providing the cross-referenced column(s), or
     set of rows of multiple
    
     FROM
    
    items providing the
     columns, the
    
     LATERAL
    
    item is evaluated using that
     row or row set's values of the columns.  The resulting row(s) are
     joined as usual with the rows they were computed from.  This is
     repeated for each row or set of rows from the column source table(s).
   
    A trivial example of
    
     LATERAL
    
    is
   
SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
This is not especially useful since it has exactly the same result as the more conventional
SELECT * FROM foo, bar WHERE bar.id = foo.bar_id;
    
     LATERAL
    
    is primarily useful when the cross-referenced
     column is necessary for computing the row(s) to be joined.  A common
     application is providing an argument value for a set-returning function.
     For example, supposing that
    
     vertices(polygon)
    
    returns the
     set of vertices of a polygon, we could identify close-together vertices
     of polygons stored in a table with:
   
SELECT p1.id, p2.id, v1, v2
FROM polygons p1, polygons p2,
     LATERAL vertices(p1.poly) v1,
     LATERAL vertices(p2.poly) v2
WHERE (v1 <-> v2) < 10 AND p1.id != p2.id;
   This query could also be written
SELECT p1.id, p2.id, v1, v2
FROM polygons p1 CROSS JOIN LATERAL vertices(p1.poly) v1,
     polygons p2 CROSS JOIN LATERAL vertices(p2.poly) v2
WHERE (v1 <-> v2) < 10 AND p1.id != p2.id;
   
    or in several other equivalent formulations.  (As already mentioned,
     the
    
     LATERAL
    
    key word is unnecessary in this example, but
     we use it for clarity.)
   
    It is often particularly handy to
    
     LEFT JOIN
    
    to a
    
     LATERAL
    
    subquery, so that source rows will appear in
     the result even if the
    
     LATERAL
    
    subquery produces no
     rows for them.  For example, if
    
     get_product_names()
    
    returns
     the names of products made by a manufacturer, but some manufacturers in
     our table currently produce no products, we could find out which ones
     those are like this:
   
SELECT m.name FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true WHERE pname IS NULL;
      7.2.2. The
      
       WHERE
      
      Clause
     
    
   The syntax of the
   
    
     WHERE
    
   
   clause is
  
WHERE search_condition
  
   where
   
    
     search_condition
    
   
   is any value
    expression (see
   
    Section 4.2
   
   ) that
    returns a value of type
   
    boolean
   
   .
  
   After the processing of the
   
    FROM
   
   clause is done, each
    row of the derived virtual table is checked against the search
    condition.  If the result of the condition is true, the row is
    kept in the output table, otherwise (i.e., if the result is
    false or null) it is discarded.  The search condition typically
    references at least one column of the table generated in the
   
    FROM
   
   clause; this is not required, but otherwise the
   
    WHERE
   
   clause will be fairly useless.
  
Note
    The join condition of an inner join can be written either in
     the
    
     WHERE
    
    clause or in the
    
     JOIN
    
    clause.
     For example, these table expressions are equivalent:
   
FROM a, b WHERE a.id = b.id AND b.val > 5
and:
FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5
or perhaps even:
FROM a NATURAL JOIN b WHERE b.val > 5
    Which one of these you use is mainly a matter of style.  The
    
     JOIN
    
    syntax in the
    
     FROM
    
    clause is
     probably not as portable to other SQL database management systems,
     even though it is in the SQL standard.  For
     outer joins there is no choice:  they must be done in
     the
    
     FROM
    
    clause.  The
    
     ON
    
    or
    
     USING
    
    clause of an outer join is
    
     
      not
     
    
    equivalent to a
    
     WHERE
    
    condition, because it results in the addition
     of rows (for unmatched input rows) as well as the removal of rows
     in the final result.
   
   Here are some examples of
   
    WHERE
   
   clauses:
  
SELECT ... FROM fdt WHERE c1 > 5 SELECT ... FROM fdt WHERE c1 IN (1, 2, 3) SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2) SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100 SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)
   
    fdt
   
   is the table derived in the
   
    FROM
   
   clause. Rows that do not meet the search
    condition of the
   
    WHERE
   
   clause are eliminated from
   
    fdt
   
   . Notice the use of scalar subqueries as
    value expressions.  Just like any other query, the subqueries can
    employ complex table expressions.  Notice also how
   
    fdt
   
   is referenced in the subqueries.
    Qualifying
   
    c1
   
   as
   
    fdt.c1
   
   is only necessary
    if
   
    c1
   
   is also the name of a column in the derived
    input table of the subquery.  But qualifying the column name adds
    clarity even when it is not needed.  This example shows how the column
    naming scope of an outer query extends into its inner queries.
  
      7.2.3. The
      
       GROUP BY
      
      and
      
       HAVING
      
      Clauses
     
    
   After passing the
   
    WHERE
   
   filter, the derived input
    table might be subject to grouping, using the
   
    GROUP BY
   
   clause, and elimination of group rows using the
   
    HAVING
   
   clause.
  
SELECTselect_listFROM ... [WHERE ...] GROUP BYgrouping_column_reference[,grouping_column_reference]...
   The
   
    
     GROUP BY
    
   
   clause is
    used to group together those rows in a table that have the same
    values in all the columns listed. The order in which the columns
    are listed does not matter.  The effect is to combine each set
    of rows having common values into one group row that
    represents all rows in the group.  This is done to
    eliminate redundancy in the output and/or compute aggregates that
    apply to these groups.  For instance:
  
=>SELECT * FROM test1;x | y ---+--- a | 3 c | 2 b | 5 a | 1 (4 rows)=>SELECT x FROM test1 GROUP BY x;x --- a b c (3 rows)
   In the second query, we could not have written
   
    SELECT *
    FROM test1 GROUP BY x
   
   , because there is no single value
    for the column
   
    y
   
   that could be associated with each
    group.  The grouped-by columns can be referenced in the select list since
    they have a single value in each group.
  
   In general, if a table is grouped, columns that are not
    listed in
   
    GROUP BY
   
   cannot be referenced except in aggregate
    expressions.  An example with aggregate expressions is:
  
=>SELECT x, sum(y) FROM test1 GROUP BY x;x | sum ---+----- a | 4 b | 5 c | 2 (3 rows)
   Here
   
    sum
   
   is an aggregate function that
    computes a single value over the entire group.  More information
    about the available aggregate functions can be found in
   
    Section 9.21
   
   .
  
Tip
    Grouping without aggregate expressions effectively calculates the
     set of distinct values in a column.  This can also be achieved
     using the
    
     DISTINCT
    
    clause (see
    
     Section 7.3.3
    
    ).
   
Here is another example: it calculates the total sales for each product (rather than the total sales of all products):
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
    FROM products p LEFT JOIN sales s USING (product_id)
    GROUP BY product_id, p.name, p.price;
  
   In this example, the columns
   
    product_id
   
   ,
   
    p.name
   
   , and
   
    p.price
   
   must be
    in the
   
    GROUP BY
   
   clause since they are referenced in
    the query select list (but see below).  The column
   
    s.units
   
   does not have to be in the
   
    GROUP
    BY
   
   list since it is only used in an aggregate expression
    (
   
    sum(...)
   
   ), which represents the sales
    of a product.  For each product, the query returns a summary row about
    all sales of the product.
  
   If the products table is set up so that, say,
   
    product_id
   
   is the primary key, then it would be
    enough to group by
   
    product_id
   
   in the above example,
    since name and price would be
   
    functionally
    dependent
   
   on the product ID, and so there would be no
    ambiguity about which name and price value to return for each product
    ID group.
  
   In strict SQL,
   
    GROUP BY
   
   can only group by columns of
    the source table but
   
    PostgreSQL
   
   extends
    this to also allow
   
    GROUP BY
   
   to group by columns in the
    select list.  Grouping by value expressions instead of simple
    column names is also allowed.
  
   If a table has been grouped using
   
    GROUP BY
   
   ,
    but only certain groups are of interest, the
   
    HAVING
   
   clause can be used, much like a
   
    WHERE
   
   clause, to eliminate groups from the result.
    The syntax is:
  
SELECTselect_listFROM ... [WHERE ...] GROUP BY ... HAVINGboolean_expression
   Expressions in the
   
    HAVING
   
   clause can refer both to
    grouped expressions and to ungrouped expressions (which necessarily
    involve an aggregate function).
  
Example:
=>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;x | sum ---+----- a | 4 b | 5 (2 rows)=>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x < 'c';x | sum ---+----- a | 4 b | 5 (2 rows)
Again, a more realistic example:
SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
    FROM products p LEFT JOIN sales s USING (product_id)
    WHERE s.date > CURRENT_DATE - INTERVAL '4 weeks'
    GROUP BY product_id, p.name, p.price, p.cost
    HAVING sum(p.price * s.units) > 5000;
  
   In the example above, the
   
    WHERE
   
   clause is selecting
    rows by a column that is not grouped (the expression is only true for
    sales during the last four weeks), while the
   
    HAVING
   
   clause restricts the output to groups with total gross sales over
    5000.  Note that the aggregate expressions do not necessarily need
    to be the same in all parts of the query.
  
   If a query contains aggregate function calls, but no
   
    GROUP BY
   
   clause, grouping still occurs: the result is a single group row (or
    perhaps no rows at all, if the single row is then eliminated by
   
    HAVING
   
   ).
    The same is true if it contains a
   
    HAVING
   
   clause, even
    without any aggregate function calls or
   
    GROUP BY
   
   clause.
  
      7.2.4. 
      
       GROUPING SETS
      
      ,
      
       CUBE
      
      , and
      
       ROLLUP
      
     
    
   More complex grouping operations than those described above are possible
    using the concept of
   
    grouping sets
   
   .  The data selected by
    the
   
    FROM
   
   and
   
    WHERE
   
   clauses is grouped separately
    by each specified grouping set, aggregates computed for each group just as
    for simple
   
    GROUP BY
   
   clauses, and then the results returned.
    For example:
  
=>SELECT * FROM items_sold;brand | size | sales -------+------+------- Foo | L | 10 Foo | M | 20 Bar | M | 15 Bar | L | 5 (4 rows)=>SELECT brand, size, sum(sales) FROM items_sold GROUP BY GROUPING SETS ((brand), (size), ());brand | size | sum -------+------+----- Foo | | 30 Bar | | 20 | L | 15 | M | 35 | | 50 (5 rows)
   Each sublist of
   
    GROUPING SETS
   
   may specify zero or more columns
    or expressions and is interpreted the same way as though it were directly
    in the
   
    GROUP BY
   
   clause.  An empty grouping set means that all
    rows are aggregated down to a single group (which is output even if no
    input rows were present), as described above for the case of aggregate
    functions with no
   
    GROUP BY
   
   clause.
  
References to the grouping columns or expressions are replaced by null values in result rows for grouping sets in which those columns do not appear. To distinguish which grouping a particular output row resulted from, see Table 9.59 .
A shorthand notation is provided for specifying two common types of grouping set. A clause of the form
ROLLUP (e1,e2,e3, ... )
represents the given list of expressions and all prefixes of the list including the empty list; thus it is equivalent to
GROUPING SETS (
    ( e1, e2, e3, ... ),
    ...
    ( e1, e2 ),
    ( e1 ),
    ( )
)
  This is commonly used for analysis over hierarchical data; e.g., total salary by department, division, and company-wide total.
A clause of the form
CUBE (e1,e2, ... )
represents the given list and all of its possible subsets (i.e., the power set). Thus
CUBE ( a, b, c )
is equivalent to
GROUPING SETS (
    ( a, b, c ),
    ( a, b    ),
    ( a,    c ),
    ( a       ),
    (    b, c ),
    (    b    ),
    (       c ),
    (         )
)
  
   The individual elements of a
   
    CUBE
   
   or
   
    ROLLUP
   
   clause may be either individual expressions, or sublists of elements in
    parentheses.  In the latter case, the sublists are treated as single
    units for the purposes of generating the individual grouping sets.
    For example:
  
CUBE ( (a, b), (c, d) )
is equivalent to
GROUPING SETS (
    ( a, b, c, d ),
    ( a, b       ),
    (       c, d ),
    (            )
)
  and
ROLLUP ( a, (b, c), d )
is equivalent to
GROUPING SETS (
    ( a, b, c, d ),
    ( a, b, c    ),
    ( a          ),
    (            )
)
  
   The
   
    CUBE
   
   and
   
    ROLLUP
   
   constructs can be used either
    directly in the
   
    GROUP BY
   
   clause, or nested inside a
   
    GROUPING SETS
   
   clause.  If one
   
    GROUPING SETS
   
   clause
    is nested inside another, the effect is the same as if all the elements of
    the inner clause had been written directly in the outer clause.
  
   If multiple grouping items are specified in a single
   
    GROUP BY
   
   clause, then the final list of grouping sets is the cross product of the
    individual items.  For example:
  
GROUP BY a, CUBE (b, c), GROUPING SETS ((d), (e))
is equivalent to
GROUP BY GROUPING SETS (
    (a, b, c, d), (a, b, c, e),
    (a, b, d),    (a, b, e),
    (a, c, d),    (a, c, e),
    (a, d),       (a, e)
)
  
Note
    The construct
    
     (a, b)
    
    is normally recognized in expressions as
    a
    
     row constructor
    
    .
    Within the
    
     GROUP BY
    
    clause, this does not apply at the top
    levels of expressions, and
    
     (a, b)
    
    is parsed as a list of
    expressions as described above.  If for some reason you
    
     
      need
     
    
    a row constructor in a grouping expression, use
    
     ROW(a, b)
    
    .
   
7.2.5. Window Function Processing
   If the query contains any window functions (see
   
    Section 3.5
   
   ,
   
    Section 9.22
   
   and
   
    Section 4.2.8
   
   ), these functions are evaluated
    after any grouping, aggregation, and
   
    HAVING
   
   filtering is
    performed.  That is, if the query uses any aggregates,
   
    GROUP
    BY
   
   , or
   
    HAVING
   
   , then the rows seen by the window functions
    are the group rows instead of the original table rows from
   
    FROM
   
   /
   
    WHERE
   
   .
  
   When multiple window functions are used, all the window functions having
    syntactically equivalent
   
    PARTITION BY
   
   and
   
    ORDER BY
   
   clauses in their window definitions are guaranteed to be evaluated in a
    single pass over the data. Therefore they will see the same sort ordering,
    even if the
   
    ORDER BY
   
   does not uniquely determine an ordering.
    However, no guarantees are made about the evaluation of functions having
    different
   
    PARTITION BY
   
   or
   
    ORDER BY
   
   specifications.
    (In such cases a sort step is typically required between the passes of
    window function evaluations, and the sort is not guaranteed to preserve
    ordering of rows that its
   
    ORDER BY
   
   sees as equivalent.)
  
   Currently, window functions always require presorted data, and so the
    query output will be ordered according to one or another of the window
    functions'
   
    PARTITION BY
   
   /
   
    ORDER BY
   
   clauses.
    It is not recommended to rely on this, however.  Use an explicit
    top-level
   
    ORDER BY
   
   clause if you want to be sure the
    results are sorted in a particular way.