Home > APIs > YSQL > The SQL language > SQL statements >

SELECT

Synopsis

Use the SELECT statement to retrieve rows of specified columns that meet a given condition from a table. It specifies the columns to be retrieved, the name of the table, and the condition each selected row must satisfy.

The same syntax rules govern a subquery, wherever you might use one—like, for example, in an INSERT statement. Certain syntax spots, for example a WHERE clause predicate or the actual argument of a function like sqrt(), allow only a scalar subquery.

select ::= [ WITH [ RECURSIVE ] { common_table_expression [ , ... ] } ] 
            SELECT [ ALL | 
                     DISTINCT [ ON { ( expression [ , ... ] ) } ] ] 
           [ * | { { expression
                     | fn_over_window
                     | ordinary_aggregate_fn_invocation
                     | within_group_aggregate_fn_invocation } 
                 [ [ AS ] name ] } [ , ... ] ]  
           [ FROM { from_item [ , ... ] } ]  
           [ WHERE boolean_expression ]  
           [ GROUP BY { grouping_element [ , ... ] } ]  
           [ HAVING boolean_expression ]  
           [ WINDOW { { name AS window_definition } [ , ... ] } ]  
           [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ] 
            [ ORDER BY { order_expr [ , ... ] } ]  
           [ LIMIT { integer | ALL } ]  
           [ OFFSET integer [ ROW | ROWS ] ]  
           [ FETCH { FIRST | NEXT } integer { ROW | ROWS } ONLY ]

common_table_expression ::= name [ ( name [ , ... ] ) ] AS ( 
                            { select
                              | values
                              | insert
                              | update
                              | delete } )

fn_over_window ::= name  ( [ expression [ , ... ] | * ]  
                   [ FILTER ( WHERE boolean_expression ) ] OVER 
                   { window_definition | name }

ordinary_aggregate_fn_invocation ::= name  ( 
                                     { [ ALL | DISTINCT ] expression 
                                       [ , ... ]
                                       | * } 
                                     [ ORDER BY order_expr [ , ... ] ] 
                                     )  [ FILTER ( WHERE 
                                          boolean_expression ) ]

within_group_aggregate_fn_invocation ::= name  ( 
                                         { expression [ , ... ] } )  
                                         WITHIN GROUP ( ORDER BY 
                                         order_expr [ , ... ] )  
                                         [ FILTER ( WHERE 
                                           boolean_expression ) ]

grouping_element ::= ( ) | ( expression [ , ... ] )
                     | ROLLUP ( expression [ , ... ] )
                     | CUBE ( expression [ , ... ] )
                     | GROUPING SETS ( grouping_element [ , ... ] )

order_expr ::= expression [ ASC | DESC | USING operator_name ] 
               [ NULLS { FIRST | LAST } ]

select

common_table_expression

fn_over_window

ordinary_aggregate_fn_invocation

within_group_aggregate_fn_invocation

grouping_element

order_expr

See the section The WITH clause and common table expressions for more information about the semantics of the common_table_expression grammar rule.

Semantics

An error is raised if the specified table_name does not exist.
* represents all columns.

While the where clause allows a wide range of operators, the exact conditions used in the where clause have significant performance considerations (especially for large datasets).

For details on from_item and with_query see SELECT in the PostgreSQL documentation.

The fn_over_window rule denotes the special kind of SELECT list item that must be used to invoke a window function and that may be used to invoke an aggregate function. (Window functions are known as analytic functions in the terminology of some SQL database systems.) The dedicated diagram that follows the main diagram for the select rule shows the FILTER and the OVER keywords. You can see that you cannot invoke a function in this way without specifying an OVER clause—and that the OVER clause requires the specification of the so-called window that gives this invocation style its name. The FILTER clause is optional and may be used only when you invoke an aggregate function in this way. All of this is explained in the Window function invocation—SQL syntax and semantics section within the major section Window functions.

The ordinary_aggregate_fn_invocation rule and the within_group_aggregate_fn_invocation rule denote the special kinds of SELECT list item that are used to invoke an aggregate function (when it isn't invoked as a window function). When an aggregate function is invoked in either of these two ways, it's very common to do so in conjunction with the GROUP BY and HAVING clauses. All of this is explained in the Aggregate function invocation—SQL syntax and semantics section within the major section Aggregate functions.

When you understand the story of the invocation of these two kinds of functions from the accounts in the Window functions section and the Aggregate functions section, you can use the \df metacommand in ysqlsh to discover the status of a particular function, thus:

\df row_number

... |          Argument data types           |  Type
... +----------------------------------------+--------
... |                                        | window


\df rank

... |          Argument data types           |  Type
... +----------------------------------------+--------
... |                                        | window
... | VARIADIC "any" ORDER BY VARIADIC "any" | agg


\df avg

... |          Argument data types           |  Type
... +----------------------------------------+--------
... | bigint                                 | agg
... | <other data types>                     | agg

A function whose type is listed as "window" can be invoked only as a window function. See this account of row_number().
A function whose type is listed both as "window" and as agg can be invoked:
- either as a window function using the fn_over_window syntax—see this account of rank()
- or as a so-called within-group hypothetical-set aggregate function using the within_group_aggregate_fn_invocation syntax—see this account of rank().
A function whose type is listed only as "agg" can, in fact, be invoked either as an aggregate function using the ordinary_aggregate_fn_invocation syntax or as a window function using the fn_over_window syntax. The avg() function is described in the "Aggregate functions" major section in the subsection avg(), count(), max(), min(), sum(). See its subsections GROUP BY syntax and OVER syntax for, respectively, the ordinary_aggregate_fn_invocation and the fn_over_window invocation alternatives.
Notice that the three functions mode(), percentile_disc(), and percentile_cont() are exceptions to this general rule (and they are the only exceptions). These functions are referred to as within-group ordered-set aggregate functions. \df lists the type of these functions only as "agg". But these cannot be invoked as window functions. The attempt causes this error:
```
42809: WITHIN GROUP is required for ordered-set aggregate mode
```

Note: The documentation in the Aggregate functions major section usually refers to the syntax that the ordinary_aggregate_fn_invocation rule and the within_group_aggregate_fn_invocation rule jointly govern as the GROUP BY syntax because it's these two syntax variants (and only these two) can be used together with the GROUP BY clause (and therefore the HAVING clause). And it usually refers to the syntax that the fn_over_window rule governs as the OVER syntax because this syntax variant requires the use of the OVER clause. Moreover, the use of the GROUP BY clause (and therefore the HAVING clause) is illegal with this syntax variant.

Examples

Create two sample tables.

yugabyte=# CREATE TABLE sample1(k1 bigint, k2 float, v text, PRIMARY KEY (k1, k2));

yugabyte=# CREATE TABLE sample2(k1 bigint, k2 float, v text, PRIMARY KEY (k1, k2));

Insert some rows.

yugabyte=# INSERT INTO sample1(k1, k2, v) VALUES (1, 2.5, 'abc'), (1, 3.5, 'def'), (1, 4.5, 'xyz');

yugabyte=# INSERT INTO sample2(k1, k2, v) VALUES (1, 2.5, 'foo'), (1, 4.5, 'bar');

Select from both tables using join.

yugabyte=# SELECT a.k1, a.k2, a.v as av, b.v as bv FROM sample1 a LEFT JOIN sample2 b ON (a.k1 = b.k1 and a.k2 = b.k2) WHERE a.k1 = 1 AND a.k2 IN (2.5, 3.5) ORDER BY a.k2 DESC;

 k1 | k2  | av  | bv
----+-----+-----+-----
  1 | 3.5 | def |
  1 | 2.5 | abc | foo
(2 rows)