Home > Explore > Multi-region deployments >

Row-Level Geo-Partitioning

YSQL

Row-level geo-partitioning allows fine-grained control over pinning data in a user table (at a per-row level) to geographic locations, thereby allowing the data residency to be managed at the table-row level. Use-cases requiring low latency multi-region deployments, transactional consistency semantics and transparent schema change propagation across the regions would benefit from this feature.

Moving data closer to users

Geo-partitioning makes it easy for developers to move data closer to users for:

Achieving lower latency and higher performance
Meeting data residency requirements to comply with regulations such as GDPR

Geo-partitioning of data enables fine-grained, row-level control over the placement of table data across different geographical locations. This is accomplished in three simple steps – first, creating local transaction status tables within each region; second, partitioning a table into user-defined table partitions; and finally, pinning these partitions to the desired geographic locations by configuring metadata for each partition.

The first step of creating local transaction tables within each region is done by creating a new transaction table and setting its placement.
The second step of creating user-defined table partitions is done by designating a column of the table as the partition column that will be used to geo-partition the data. The value of this column for a given row is used to determine the table partition that the row belongs to.
The third step involves creating partitions in the respective geographic locations using tablespaces. Note that the data in each partition can be configured to get replicated across multiple zones in a cloud provider region, or across multiple nearby regions / datacenters.

An entirely new geographic partition can be introduced dynamically by adding a new table partition and configuring it to keep the data resident in the desired geographic location. Data in one or more of the existing geographic locations can be purged efficiently simply by dropping the necessary partitions. Users of traditional RDBMS would recognize this scheme as being close to user-defined list-based table partitions, with the ability to control the geographic location of each partition.

In this deployment, users can access their data with low latencies because the data resides on servers that are geographically close by, and the queries do not need to access data in far away geographic locations.

This tutorial explains this feature in the context of an example scenario described in the next section.

Example scenario

Let us look at this feature in the context of a use case. Say that a large but imaginary bank, Yuga Bank, wants to offer an online banking service to users in many countries by processing their deposits, withdrawals, and transfers.

The following attributes would be required in order to build such a service.

Transactional semantics with high availability: Consistency of data is paramount in a banking application, hence the database should be ACID compliant. Additionally, users expect the service to always be available, making high availability and resilience to failures a critical requirement.
High performance: The online transactions need to be processed with a low latency in order to ensure a good end-user experience. This requires that the data for a particular user is located in a nearby geographic region. Putting all the data in a single location in an RDBMS would mean the requests for users residing far away from that location would have very high latencies, leading to a poor user experience.
Data residency requirements for compliance: Many countries have regulations around which geographic regions the personal data of their residents can be stored in, and bank transactions being personal data are subject to these requirements. For example, GDPR has a data residency stipulation which effectively requires that the personal data of individuals in the EU be stored in the EU. Similarly, India has a requirement issued by the Reserve Bank of India (or RBI for short) making it mandatory for all banks, intermediaries, and other third parties to store all information pertaining to payments data in India – though in case of international transactions, the data on the foreign leg of the transaction can be stored in foreign locations.

Note

While this scenario has regulatory compliance requirements where data needs to be resident in certain geographies, the exact same technique applies for the goal of moving data closer to users in order to achieve low latency and high performance. Hence, high performance is listed above as a requirement.

Step 1. Create tablespaces

First, we create tablespaces for each geographic region we wish to partition data into:

Create tablespaces for each region.

CREATE TABLESPACE eu_central_1_tablespace WITH (
  replica_placement='{"num_replicas": 3, "placement_blocks":
  [{"cloud":"aws","region":"eu-central-1","zone":"eu-central-1a","min_num_replicas":1},
  {"cloud":"aws","region":"eu-central-1","zone":"eu-central-1b","min_num_replicas":1},
  {"cloud":"aws","region":"eu-central-1","zone":"eu-central-1c","min_num_replicas":1}]}'
);

CREATE TABLESPACE us_west_2_tablespace WITH (
  replica_placement='{"num_replicas": 3, "placement_blocks":
  [{"cloud":"aws","region":"us-west-2","zone":"us-west-2a","min_num_replicas":1},
  {"cloud":"aws","region":"us-west-2","zone":"us-west-2b","min_num_replicas":1},
  {"cloud":"aws","region":"us-west-2","zone":"us-west-2c","min_num_replicas":1}]}'
);

CREATE TABLESPACE ap_south_1_tablespace WITH (
  replica_placement='{"num_replicas": 3, "placement_blocks":
  [{"cloud":"aws","region":"ap-south-1","zone":"ap-south-1a","min_num_replicas":1},
  {"cloud":"aws","region":"ap-south-1","zone":"ap-south-1b","min_num_replicas":1},
  {"cloud":"aws","region":"ap-south-1","zone":"ap-south-1c","min_num_replicas":1}]}'
);

Step 2. Create table with partitions

Next, we create the parent table that contains a geo_partition column which is used to create list-based partitions for each geographic region we want to partition data into as shown in the following diagram:

Row-level geo-partitioning

Create the parent table.

CREATE TABLE bank_transactions (
    user_id   INTEGER NOT NULL,
    account_id INTEGER NOT NULL,
    geo_partition VARCHAR,
    account_type VARCHAR NOT NULL,
    amount NUMERIC NOT NULL,
    txn_type VARCHAR NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
) PARTITION BY LIST (geo_partition);

Next, create one partition per desired geography under the parent table, and assign each to the applicable tablespace. Here, you create three table partitions: one for the EU region called bank_transactions_eu, another for the India region called bank_transactions_india, and a third partition for US region called bank_transactions_us.

CREATE TABLE bank_transactions_eu
    PARTITION OF bank_transactions
      (user_id, account_id, geo_partition, account_type,
      amount, txn_type, created_at,
      PRIMARY KEY (user_id HASH, account_id, geo_partition))
    FOR VALUES IN ('EU') TABLESPACE eu_central_1_tablespace;

CREATE TABLE bank_transactions_india
    PARTITION OF bank_transactions
      (user_id, account_id, geo_partition, account_type,
      amount, txn_type, created_at,
      PRIMARY KEY (user_id HASH, account_id, geo_partition))
    FOR VALUES IN ('India') TABLESPACE ap_south_1_tablespace;

CREATE TABLE bank_transactions_us
    PARTITION OF bank_transactions
      (user_id, account_id, geo_partition, account_type,
      amount, txn_type, created_at,
      PRIMARY KEY (user_id HASH, account_id, geo_partition))
    FOR VALUES IN ('US') TABLESPACE us_west_2_tablespace;

Use the \d command to view the table and partitions you've created so far.

yugabyte=# \d

                List of relations
 Schema |         Name                      | Type  |  Owner
--------+-----------------------------------+-------+----------
 public | bank_transactions         | table | yugabyte
 public | bank_transactions_eu      | table | yugabyte
 public | bank_transactions_india   | table | yugabyte
 public | bank_transactions_us      | table | yugabyte
(4 rows)

The data is now arranged as follows:

Row-level geo-partitioning

Region-local transaction table

When you create tables using a tablespace with a placement set, YugabyteDB automatically creates a transaction table under the tablespace if one doesn't yet exist, with a name like system.transactions_90141438-f42c-4a39-8a12-4072c1216d46.

Step 3. Pinning user partitions specific to geographic locations

Now, the setup should automatically be able to pin rows to the appropriate regions based on the value set in the geo_partition column. This is shown in the following diagram:

Row-level geo-partitioning

You can test this region pinning by inserting a few rows of data and verifying they are written to the correct partitions.

Expanded output display

The sample output includes expanded auto mode output formatting for better readability. You can enable this mode with the following statement:

yugabyte=# \x auto

Expanded display is used automatically.

Insert a row into the table with the geo_partition column value set to EU below.

INSERT INTO bank_transactions
    VALUES (100, 10001, 'EU', 'checking', 120.50, 'debit');

Verify that the row is present in the bank_transactions table.

yugabyte=# select * from bank_transactions;

-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | EU
account_type  | checking
amount        | 120.5
txn_type      | debit
created_at    | 2020-11-07 21:28:11.056236

Additionally, the row must be present only in the bank_transactions_eu partition, which can be easily verified by running the select statement directly against that partition. The other partitions should contain no rows.

yugabyte=# select * from bank_transactions_eu;

-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | EU
account_type  | checking
amount        | 120.5
txn_type      | debit
created_at    | 2020-11-07 21:28:11.056236

yugabyte=# select count(*) from bank_transactions_india;

 count
-------
     0

yugabyte=# select count(*) from bank_transactions_us;

 count
-------
     0

Now, let's insert data into the other partitions.

INSERT INTO bank_transactions
    VALUES (200, 20001, 'India', 'savings', 1000, 'credit');
INSERT INTO bank_transactions
    VALUES (300, 30001, 'US', 'checking', 105.25, 'debit');

These can be verified as follows:

yugabyte=# select * from bank_transactions_india;

-[ RECORD 1 ]-+---------------------------
user_id       | 200
account_id    | 20001
geo_partition | India
account_type  | savings
amount        | 1000
txn_type      | credit
created_at    | 2020-11-07 21:45:26.011636

yugabyte=# select * from bank_transactions_us;

-[ RECORD 1 ]-+---------------------------
user_id       | 300
account_id    | 30001
geo_partition | US
account_type  | checking
amount        | 105.25
txn_type      | debit
created_at    | 2020-11-07 21:45:26.067444

Step 4. Users travelling across geographic locations

In order to make things interesting, let us say user 100, whose first bank transaction was performed in the EU region travels to India and the US, and performs two other bank transactions. This can be simulated by using the following statements.

INSERT INTO bank_transactions
    VALUES (100, 10001, 'India', 'savings', 2000, 'credit');
INSERT INTO bank_transactions
    VALUES (100, 10001, 'US', 'checking', 105, 'debit');

Now, each of the bank transactions would be pinned to the appropriate geographic locations. This can be verified as follows.

yugabyte=# select * from bank_transactions_india where user_id=100;

-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | India
account_type  | savings
amount        | 2000
txn_type      | credit
created_at    | 2020-11-07 21:56:26.760253

yugabyte=# select * from bank_transactions_us where user_id=100;

-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | US
account_type  | checking
amount        | 105
txn_type      | debit
created_at    | 2020-11-07 21:56:26.794173

All the bank transactions made by the user can efficiently be retrieved using the following SQL statement.

yugabyte=# select * from bank_transactions where user_id=100 order by created_at desc;

-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | US
account_type  | checking
amount        | 105
txn_type      | debit
created_at    | 2020-11-07 21:56:26.794173
-[ RECORD 2 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | India
account_type  | savings
amount        | 2000
txn_type      | credit
created_at    | 2020-11-07 21:56:26.760253
-[ RECORD 3 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | EU
account_type  | checking
amount        | 120.5
txn_type      | debit
created_at    | 2020-11-07 21:28:11.056236

Step 5. Running distributed transactions

So far, we have only been running SELECT and single-row transactions. Geo-partitioning introduces a new complication for general distributed transactions.

Let's say we want to run the following transaction:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.00, 'debit');
INSERT INTO bank_transactions VALUES (100, 10003, 'EU', 'checking', 400.00, 'credit');
COMMIT;

If we attempt to run this while connected to a node in us-west-2, we get an error:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.00, 'debit');

ERROR:  Illegal state: Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64: . Errors from tablet servers: [Illegal state (yb/client/transaction.cc:284): Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64]

Because we have a tablespace with placement set to the us-west-2 region, YugabyteDB assumes by default that we want to run a transaction local to that region (using the automatically-generated transaction status table for the us_west_2_tablespace created in Step 1), but such a transaction cannot modify data outside of us-west-2.

However, if we instead connect to a node in eu-central-1 and run the exact same transaction, we are now able to finish and commit the transaction without error:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.0, 'debit');

INSERT 1 0

INSERT INTO bank_transactions VALUES (100, 10003, 'EU', 'checking', 400.0, 'credit');

INSERT 1 0

COMMIT;

COMMIT

Sometimes though, we might want to run a transaction that writes data to multiple regions, for example:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10004, 'US', 'checking', 400.00, 'debit');
INSERT INTO bank_transactions VALUES (200, 10005, 'EU', 'checking', 400.00, 'credit');
COMMIT;

Running this transaction will fail whether we run it from us-west-2 or eu-central-1. The solution is to mark the transaction as a global transaction:

SET force_global_transaction = TRUE;
BEGIN;
INSERT INTO bank_transactions VALUES (100, 10004, 'US', 'checking', 400.00, 'debit');

INSERT 1 0

INSERT INTO bank_transactions VALUES (200, 10005, 'EU', 'checking', 400.00, 'credit');

INSERT 1 0

COMMIT;

COMMIT

Setting force_global_transaction = TRUE tells YugabyteDB to use the system.transactions transaction table instead, which is presumed to be globally replicated, and lets us run distributed transactions that span multiple regions.

Global transaction latency

Only force global transactions when necessary. All distributed transactions can run without problems under force_global_transaction = TRUE, but you may have significantly higher latency when committing the transaction, because YugabyteDB must achieve consensus across multiple regions to write to system.transactions. Whenever possible, use the default setting of force_global_transaction = FALSE.

Finally, let's say we want to delete the row we just inserted. If we run the following query connected to eu-central-1 as a local transaction, the query once again errors out:

SET force_global_transaction = FALSE;
DELETE FROM bank_transactions WHERE user_id = 200 AND account_id = 10005;

ERROR:  Illegal state: Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64: . Errors from tablet servers: [Illegal state (yb/client/transaction.cc:284): Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64]

We are attempting to delete from the main table (bank_transactions rather than bank_transactions_eu_west_1) and not specifying the partition column (there's no geo_partition = 'EU' clause). This means that YugabyteDB is unable to tell that the row being deleted is in fact located in eu-central-1. To fix this, we could instead run:

DELETE FROM bank_transactions_eu_west_1 WHERE user_id = 200 AND account_id = 10005;

DELETE 1

Step 6. Adding a new geographic location

Assume that after a while, our fictitious Yuga Bank gets a lot of customers across the globe, and wants to offer the service to residents of Brazil, which also has data residency laws. Thanks to row-level geo-partitioning, this can be accomplished easily. We can simply add a new partition and pin it to the AWS South America (São Paulo) region sa-east-1 as shown below.

First, create the tablespace:

CREATE TABLESPACE sa_east_1_tablespace WITH (
    replica_placement='{"num_replicas": 3, "placement_blocks":
      [{"cloud":"aws","region":"sa-east-1","zone":"sa-east-1a","min_num_replicas":1},
      {"cloud":"aws","region":"sa-east-1","zone":"sa-east-1b","min_num_replicas":1},
      {"cloud":"aws","region":"sa-east-1","zone":"sa-east-1c","min_num_replicas":1}]}'
    );

Then, create the partition for Brazil:

CREATE TABLE bank_transactions_brazil
    PARTITION OF bank_transactions
      (user_id, account_id, geo_partition, account_type,
       amount, txn_type, created_at,
       PRIMARY KEY (user_id HASH, account_id, geo_partition))
    FOR VALUES IN ('Brazil') TABLESPACE sa_east_1_tablespace;

And with that, the new region is ready to store bank transactions of the residents of Brazil.

INSERT INTO bank_transactions
    VALUES (400, 40001, 'Brazil', 'savings', 1000, 'credit');

select * from bank_transactions_brazil;

-[ RECORD 1 ]-+-------------------------
user_id       | 400
account_id    | 40001
geo_partition | Brazil
account_type  | savings
amount        | 1000
txn_type      | credit
created_at    | 2020-11-07 22:09:04.8537