CREATE INDEX
Synopsis
Use the CREATE INDEX
statement to create an index on the specified columns of the specified table. Indexes are primarily used to improve query performance.
Syntax
create_index ::= CREATE [ UNIQUE ] INDEX [ NONCONCURRENTLY ]
[ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name
[ USING access_method_name ] ( index_elem [ , ... ] )
[ INCLUDE ( column_name [ , ... ] ) ]
[ WHERE boolean_expression ]
index_elem ::= { column_name | ( expression ) }
[ operator_class_name ] [ HASH | ASC | DESC ]
[ NULLS { FIRST | LAST } ]
create_index
index_elem
Semantics
CONCURRENTLY
, COLLATE
, and TABLESPACE
options are not yet supported.
When an index is created on a populated table, YugabyteDB automatically backfills the existing data into the index. In most cases, this uses an online schema migration. The following table explains some of the differences between creating an index online and not online.
Condition | Online | Not online |
---|---|---|
Safe to do other DMLs during CREATE INDEX ? |
yes | no |
Keeps other transactions alive during CREATE INDEX ? |
mostly | no |
Parallelizes index loading? | yes | no |
To disable online schema migration for YSQL CREATE INDEX
, set the flag ysql_disable_index_backfill=true
on all nodes and both master and tserver.
To disable online schema migration for one CREATE INDEX
, use CREATE INDEX NONCONCURRENTLY
.
Note
For details on how online index backfill works, refer to Online Index Backfill.Regarding colocation, indexes follow their table. If the table is colocated, its index is also colocated; if the table is not colocated, its index is also not colocated.
UNIQUE
Enforce that duplicate values in a table are not allowed.
NONCONCURRENTLY
Disable online schema migration (see Semantics for details).
Concurrent is the default, but the grammar does not yet support CONCURRENTLY
.
access_method_name
The name of the index access method.
By default, lsm
is used for YugabyteDB tables and btree
is used otherwise (for example, temporary tables).
GIN indexes can be created in YugabyteDB by using the ybgin
access method.
INCLUDE clause
Specify a list of columns which will be included in the index as non-key columns.
WHERE clause
A partial index is an index that is built on a subset of a table and includes only rows that satisfy the condition specified in the WHERE
clause.
It can be used to exclude NULL or common values from the index, or include just the rows of interest.
This will speed up any writes to the table since rows containing the common column values don't need to be indexed.
It will also reduce the size of the index, thereby improving the speed for read queries that use the index.
name
Specify the name of the index to be created.
table_name
Specify the name of the table to be indexed.
index_elem
column_name
Specify the name of a column of the table.
expression
Specify one or more columns of the table and must be surrounded by parentheses.
HASH
- Use hash of the column. This is the default option for the first column and is used to shard the index table.ASC
— Sort in ascending order. This is the default option for second and subsequent columns of the index.DESC
— Sort in descending order.NULLS FIRST
- Specifies that nulls sort before non-nulls. This is the default when DESC is specified.NULLS LAST
- Specifies that nulls sort after non-nulls. This is the default when DESC is not specified.
SPLIT INTO
For hash-sharded indexes, you can use the SPLIT INTO
clause to specify the number of tablets to be created for the index. The hash range is then evenly split across those tablets.
Presplitting indexes, using SPLIT INTO
, distributes index workloads on a production cluster. For example, if you have 3 servers, splitting the index into 30 tablets can provide higher write throughput on the index. For an example, see Create an index specifying the number of tablets.
Note
By default, YugabyteDB presplits an index intoysql_num_shards_per_tserver * num_of_tserver
tablets. The SPLIT INTO
clause can be used to override that setting on a per-index basis.
Examples
Unique index with HASH column ordering
Create a unique index with hash ordered columns.
yugabyte=# CREATE TABLE products(id int PRIMARY KEY,
name text,
code text);
yugabyte=# CREATE UNIQUE INDEX ON products(code);
yugabyte=# \d products
Table "public.products"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
name | text | | |
code | text | | |
Indexes:
"products_pkey" PRIMARY KEY, lsm (id HASH)
"products_code_idx" UNIQUE, lsm (code HASH)
ASC ordered index
Create an index with ascending ordered key.
yugabyte=# CREATE INDEX products_name ON products(name ASC);
yugabyte=# \d products_name
Index "public.products_name"
Column | Type | Key? | Definition
--------+------+------+------------
name | text | yes | name
lsm, for table "public.products
INCLUDE columns
Create an index with ascending ordered key and include other columns as non-key columns
yugabyte=# CREATE INDEX products_name_code ON products(name) INCLUDE (code);
yugabyte=# \d products_name_code;
Index "public.products_name_code"
Column | Type | Key? | Definition
--------+------+------+------------
name | text | yes | name
code | text | no | code
lsm, for table "public.products"
Create an index specifying the number of tablets
To specify the number of tablets for an index, you can use the CREATE INDEX
statement with the SPLIT INTO
clause.
CREATE TABLE employees (id int PRIMARY KEY, first_name TEXT, last_name TEXT) SPLIT INTO 10 TABLETS;
CREATE INDEX ON employees(first_name, last_name) SPLIT INTO 10 TABLETS;
Partial indexes
Consider an application maintaining shipments information. It has a shipments
table with a column for delivery_status
. If the application needs to access in-flight shipments frequently, then it can use a partial index to exclude rows whose shipment status is delivered
.
yugabyte=# create table shipments(id int, delivery_status text, address text, delivery_date date);
yugabyte=# create index shipment_delivery on shipments(delivery_status, address, delivery_date) where delivery_status != 'delivered';
Troubleshooting
If the following troubleshooting tips don't resolve your issue, please ask for help in our community Slack or file a GitHub issue.
If online CREATE INDEX
fails, it likely failed in the backfill step.
In that case, the index exists but is not usable.
Drop the index and try again.
If it still doesn't work, here are some troubleshooting steps:
- Did it time out? Try increasing timeout flags:
- master
ysql_index_backfill_rpc_timeout_ms
- tserver
backfill_index_client_rpc_timeout_ms
- master
- Did you get a "backfill failed to connect to DB" error? You may be hitting an issue with authentication. If you're on a stable version prior to 2.4 or a latest (2.3.x or 2.5.x) version prior to 2.5.2, online
CREATE INDEX
does not work with authentication enabled.- For version 2.5.1, you can use
CREATE INDEX NONCONCURRENTLY
as a workaround. - If the version is at least 2.3, you can set
ysql_disable_index_backfill=false
as a workaround. - In all supported versions, you can disable authentication (for example, by using
ysql_enable_auth
,ysql_hba_conf
, orysql_hba_conf_csv
) as a workaround.
- For version 2.5.1, you can use
- Did you get a "duplicate key value" error? Then, you have a unique constraint violation.
To prioritize keeping other transactions alive during the index backfill, bump up the following:
- master flag
index_backfill_wait_for_old_txns_ms
- YSQL GUC variable
yb_index_state_flags_update_delay
To speed up index creation by a few seconds when you know there will be no online writes, set the YSQL GUC variable yb_index_state_flags_update_delay
to zero.