CREATE TABLE
Synopsis
Use the CREATE TABLE
statement to create a new table in a keyspace. It defines the table name, column names and types, primary key, and table properties.
Syntax
Diagram
create_table
table_schema
table_properties
Grammar
create_table ::= CREATE TABLE [ IF NOT EXISTS ] table_name
'(' table_element [ ',' table_element ...] ')'
[WITH table_properties];
table_element ::= table_column | table_constraints
table_column ::= column_name column_type [ column_constraint ...]
column_constraint ::= PRIMARY KEY | STATIC
table_constraints ::= PRIMARY KEY '(' partition_key_column_list clustering_key_column_list ')'
partition_key_column_list ::= '(' column_name [ ',' column_name ...] ')' | column_name
clustering_key_column_list ::= [ ',' column_name ...]
table_properties = [table_options]
[[AND] CLUSTERING ORDER BY '(' column_ordering_property [ ',' column_ordering_property ...] ')']
[[AND] COMPACT STORAGE]
table_options = property_name '=' property_literal [AND property_name '=' property_literal ...]
column_ordering_property ::= column_name [ ASC | DESC ]
Where
table_name
,column_name
, andproperty_name
are identifiers (table_name
may be qualified with a keyspace name).property_literal
is a literal of either boolean, text, or map data type.
Semantics
- An error is raised if
table_name
already exists in the associated keyspace unless theIF NOT EXISTS
option is used.
PRIMARY KEY
- Primary key must be defined in either
column_constraint
ortable_constraint
but not in both of them. - Each row in a table is uniquely identified by its primary key.
- Primary key columns are either partitioning columns or clustering columns (described below).
- If primary key is set as a column constraint, then that column is the partition column and there are no clustering columns.
- If primary key is set as a table constraint then:
- The partition columns are given by the first entry in the primary key list: the nested column list (if given), otherwise the first column.
- The clustering columns are the rest of the columns in the primary key list (if any).
PARTITION KEY
- Partition key is required and defines a split of rows into partitions.
- Rows that share the same partition key form a partition and will be colocated on the same replica node.
CLUSTERING KEY
- Clustering key is optional and defines an ordering for rows within a partition.
- Default ordering is ascending (
ASC
) but can be set for each clustering column as ascending or descending using theCLUSTERING ORDER BY
table property.
STATIC COLUMNS
- Columns declared as
STATIC
will share the same value for all rows within a partition (i.e. rows having the same partition key). - Columns in the primary key cannot be static.
- A table without clustering columns cannot have static columns (without clustering columns the primary key and the partition key are identical so static columns would be the same as regular columns).
table_properties
- The
CLUSTERING ORDER BY
property can be used to set the ordering for each clustering column individually (default isASC
). - The
default_time_to_live
property sets the default expiration time (TTL) in seconds for a table. The expiration time can be overridden by setting TTL for individual rows. The default value is0
and means rows do not expire. - The
transactions
property specifies if distributed transactions are enabled in the table. To enable distributed transactions, usetransactions = { 'enabled' : true }
. - Use the
AND
operator to use multiple table properties. - The other YCQL table properties are allowed in the syntax but are currently ignored internally (have no effect).
- The
TABLETS = <num>
property specifies the number of tablets to be used for the specified YCQL table. Setting this property overrides the value from the--yb_num_shards_per_tserver
option. For an example, see Create a table specifying the number of tablets. COMPACT STORAGE
is only for syntax compatibility with Cassandra. It doesn't affect the underlying storage.
Examples
Use column constraint to define primary key
'user_id' is the partitioning column and there are no clustering columns.
ycqlsh:example> CREATE TABLE users(user_id INT PRIMARY KEY, full_name TEXT);
Use table constraint to define primary key
'supplier_id' and 'device_id' are the partitioning columns and 'model_year' is the clustering column.
ycqlsh:example> CREATE TABLE devices(supplier_id INT,
device_id INT,
model_year INT,
device_name TEXT,
PRIMARY KEY((supplier_id, device_id), model_year));
Use column constraint to define a static column
You can do this as shown below.
ycqlsh:example> CREATE TABLE items(supplier_id INT,
item_id INT,
supplier_name TEXT STATIC,
item_name TEXT,
PRIMARY KEY((supplier_id), item_id));
ycqlsh:example> INSERT INTO items(supplier_id, item_id, supplier_name, item_name)
VALUES (1, 1, 'Unknown', 'Wrought Anvil');
ycqlsh:example> INSERT INTO items(supplier_id, item_id, supplier_name, item_name)
VALUES (1, 2, 'Acme Corporation', 'Giant Rubber Band');
ycqlsh:example> SELECT * FROM devices;
supplier_id | item_id | supplier_name | item_name
-------------+---------+------------------+-------------------
1 | 1 | Acme Corporation | Wrought Anvil
1 | 2 | Acme Corporation | Giant Rubber Band
Use table property to define the order (ascending or descending) for clustering columns
Timestamp column 'ts' will be stored in descending order (latest values first).
ycqlsh:example> CREATE TABLE user_actions(user_id INT,
ts TIMESTAMP,
action TEXT,
PRIMARY KEY((user_id), ts))
WITH CLUSTERING ORDER BY (ts DESC);
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:15', 'log in');
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:25', 'change password');
ycqlsh:example> INSERT INTO user_actions(user_id, ts, action) VALUES (1, '2000-12-2 12:30:35', 'log out');
ycqlsh:example> SELECT * FROM user_actions;
user_id | ts | action
---------+---------------------------------+-----------------
1 | 2000-12-02 19:30:35.000000+0000 | log out
1 | 2000-12-02 19:30:25.000000+0000 | change password
1 | 2000-12-02 19:30:15.000000+0000 | log in
Use table property to define the default expiration time for rows
You can do this as shown below.
ycqlsh:example> CREATE TABLE sensor_data(sensor_id INT,
ts TIMESTAMP,
value DOUBLE,
PRIMARY KEY((sensor_id), ts))
WITH default_time_to_live = 5;
First insert at time T (row expires at T + 5).
ycqlsh:example> INSERT INTO sensor_data(sensor_id, ts, value) VALUES (1, '2017-10-1 11:22:31', 3.1);
Second insert 3 seconds later (row expires at T + 8).
ycqlsh:example> INSERT INTO sensor_data(sensor_id, ts, value) VALUES (2, '2017-10-1 11:22:34', 3.4);
First select 3 seconds later (at time T + 6).
ycqlsh:example> SELECT * FROM sensor_data;
sensor_id | ts | value
-----------+---------------------------------+-------
2 | 2017-10-01 18:22:34.000000+0000 | 3.4
Second select 3 seconds later (at time T + 9).
ycqlsh:example> SELECT * FROM sensor_data;
sensor_id | ts | value
-----------+----+-------
Create a table specifying the number of tablets
You can use the CREATE TABLE
statement with the WITH tablets = <num>
clause to specify the number of tablets for a table. This is useful to scale the table up or down based on requirements. For example, for smaller static tables, it may be wasteful to have a large number of shards (tablets). In that case, you can use this to reduce the number of tablets created for the table. Similarly, for a very large table, you can use this statement to presplit the table into a large number of shards to get improved performance.
Note that YugabyteDB, by default, presplits a table in yb_num_shards_per_tserver * num_of_tserver
shards. This clause can be used to override that setting on per-table basis.
ycqlsh:example> CREATE TABLE tracking (id int PRIMARY KEY) WITH tablets = 10;
If you create an index for these tables, you can also specify the number of tablets for the index.
You can also use AND
to add other table properties, like in this example.
ycqlsh:example> CREATE TABLE tracking (id int PRIMARY KEY) WITH tablets = 10 AND transactions = { 'enabled' : true };