Join us on YugabyteDB Community Slack
Star us on
Get Started
Slack
GitHub
Get Started
v2.13 (latest) v2.12 (stable) v2.8 (earlier version) v2.6 (earlier version) v2.4 (earlier version) v2.2 (earlier version) v2.1 (earlier version) v2.0 (earlier version) v1.3 (earlier version)
  • YUGABYTEDB CORE
    • Quick start
      • 1. Install YugabyteDB
      • 2. Create a local cluster
      • 3. Explore distributed SQL
      • 4. Build an application
        • Java
        • Node.js
        • Go
        • Python
        • Ruby
        • C#
        • PHP
        • C++
        • C
        • Scala
    • Explore
      • SQL features
        • Schemas and Tables
        • Data Types
        • Data Manipulation
        • Queries and Joins
        • Expressions and Operators
        • Stored Procedures
        • Triggers
        • Advanced features
          • Cursors
          • Table Partitioning
          • Views
          • Savepoints
          • Collations
          • Extensions
        • Going beyond SQL
          • Follower reads
          • Tablespaces
      • Fault tolerance
      • Horizontal scalability
        • Scaling Transactions
        • Sharding Data
      • Transactions
        • Distributed Transactions
        • Isolation Levels
        • Explicit Locking
      • Indexes and Constraints
        • Overview
        • Unique Indexes
        • Partial Indexes
        • Expression Indexes
        • Generalized Inverted Indexes
        • Primary Key
        • Foreign Key
        • Other Constraints
      • JSON support
      • Multi-region deployments
        • Sync replication (3+ regions)
        • Async Replication (2+ regions)
        • Row-Level Geo-Partitioning
        • Read replicas
      • Query tuning
        • Introduction
        • Get query statistics using pg_stat_statements
        • Viewing live queries with pg_stat_activity
        • Analyzing queries with EXPLAIN
        • Optimizing YSQL queries using pg_hint_plan
      • Cluster management
        • Point-in-time recovery
      • Security
      • Observability
        • Prometheus Integration
        • Grafana Dashboard
    • Develop
      • Learn app development
        • 1. SQL vs NoSQL
        • 2. Data modeling
        • 3. Data types
        • 4. ACID transactions
        • 5. Aggregations
        • 6. Batch operations
        • 7. Date and time
        • 8. Strings and text
        • 9. TTL for data expiration
      • Real-world examples
        • E-Commerce app
        • IoT fleet management
      • Explore sample apps
      • Best practices
      • Cloud-native development
        • Codespaces
        • Gitpod
    • Migrate
      • Migration process overview
      • Migrate from PostgreSQL
        • Convert a PostgreSQL schema
        • Migrate a PostgreSQL application
        • Export PostgreSQL data
        • Prepare a cluster
        • Import PostgreSQL data
        • Verify Migration
    • Deploy
      • Deployment checklist
      • Manual deployment
        • 1. System configuration
        • 2. Install software
        • 3. Start YB-Masters
        • 4. Start YB-TServers
        • 5. Verify deployment
      • Kubernetes
        • Single-zone
          • Open Source
          • Amazon EKS
          • Google Kubernetes Engine
          • Azure Kubernetes Service
        • Multi-zone
          • Amazon EKS
          • Google Kubernetes Engine
        • Multi-cluster
          • Google Kubernetes Engine
        • Best practices
        • Connect Clients
      • Docker
      • Public clouds
        • Amazon Web Services
        • Google Cloud Platform
        • Microsoft Azure
      • Multi-DC deployments
        • Three+ data center (3DC)
        • Asynchronous Replication
        • Read replica clusters
    • Benchmark
      • TPC-C
      • sysbench
      • YCSB
      • Key-value workload
      • Large datasets
      • Scalability
        • Scaling queries
      • Resilience
        • Jepsen testing
      • Performance Troubleshooting
    • Secure
      • Security checklist
      • Enable Authentication
        • Enable User Authentication
        • Configure ysql_hba_conf_csv
      • Authentication Methods
        • Password Authentication
        • LDAP Authentication
        • Host-Based Authentication
        • Trust Authentication
      • Role-Based Access Control
        • Overview
        • Manage Users and Roles
        • Grant Privileges
        • Row-Level Security (RLS)
        • Column-Level Security
      • Encryption in Transit
        • Create server certificates
        • Enable server-to-server encryption
        • Enable client-to-server encryption
        • Connect to Clusters
        • TLS and authentication
      • Encryption at rest
      • Column-level encryption
      • Audit Logging
        • Configure Audit Logging
        • Session-Level Audit Logging
        • Object-Level Audit Logging
      • Vulnerability disclosure policy
    • Manage
      • Back up and restore
        • Back up data
        • Restore data
        • Point-in-time recovery
        • Snapshot and restore data
      • Migrate data
        • Bulk import
        • Bulk export
      • Change cluster configuration
      • Diagnostics reporting
      • Upgrade a deployment
      • Grow cluster
    • Troubleshoot
      • Troubleshooting
      • Cluster level issues
        • YCQL connection issues
        • YEDIS connection Issues
        • Recover tserver/master
        • Replace a failed YB-TServer
        • Replace a failed YB-Master
        • Manual remote bootstrap when a majority of peers fail
      • Node level issues
        • Check servers
        • Inspect logs
        • System statistics
        • Disk failure
        • Common error messages
    • Contribute
      • Core database
        • Contribution checklist
        • Build the source
        • Configure a CLion project
        • Run the tests
        • Coding style
  • YUGABYTE PLATFORM
    • Overview
      • Install
      • Configure
    • Install Yugabyte Platform
      • Prerequisites
      • Prepare the environment
      • Install software
      • Prepare nodes (on-prem)
      • Uninstall software
    • Configure Yugabyte Platform
      • Create admin user
      • Configure the cloud provider
      • Configure the backup target
      • Configure alerts
    • Create deployments
      • Multi-zone universe
      • Multi-region universe
      • Multi-cloud universe
      • Read replica cluster
      • Asynchronous replication
    • Manage deployments
      • Start and stop processes
      • Add a node
      • Eliminate an unresponsive node
      • Enable high availability
      • Edit configuration flags
      • Edit a universe
      • Delete a universe
      • Configure instance tags
      • Upgrade YugabyteDB software
      • Migrate to Helm 3
    • Back up universes
      • Configure backup storage
      • Back up universe data
      • Restore universe data
      • Schedule data backups
    • Security
      • Security checklist
      • Customize ports
      • LDAP authentication
      • Authorization platform
      • Create a KMS configuration
      • Enable encryption at rest
      • Enable encryption in transit (TLS)
      • Network security
    • Alerts and monitoring
      • Alerts
      • Live Queries dashboard
      • Slow Queries dashboard
    • Troubleshoot
      • Install and upgrade issues
      • Universe issues
    • Administer Yugabyte Platform
      • Back Up Yugabyte Platform
      • Authenticate with LDAP
    • Upgrade Yugabyte Platform
      • Upgrade using Replicated
  • YUGABYTE CLOUD
    • Overview
    • Quick start
      • Create a free cluster
      • Connect to the cluster
      • Create a database
      • Explore distributed SQL
      • Build an application
        • Before you begin
        • Java
        • Go
        • Python
        • Node.js
        • C
        • C++
        • C#
        • Ruby
        • Rust
        • PHP
    • Deploy clusters
      • Planning a cluster
      • Create a free cluster
      • Create a standard cluster
      • VPC network
        • Overview
        • Set up a VPC network
        • VPCs
        • Peering Connections
    • Secure clusters
      • IP allow lists
      • Database authorization
      • Add database users
      • Encryption in transit
      • Audit cloud activity
    • Connect to clusters
      • Cloud Shell
      • Client shell
      • Connect applications
    • Alerts and monitoring
      • Alerts
      • Performance metrics
      • Live queries
      • Slow YSQL queries
      • Cluster activity
    • Manage clusters
      • Backup and restore
      • Scale and configure clusters
      • Create extensions
    • Administer Yugabyte Cloud
      • Manage cloud users
      • Manage billing
      • Cluster costs
    • Example applications
      • Connect a Spring application
      • Connect a YCQL Java application
      • Hasura Cloud
      • Deploy a GraphQL application
    • Security architecture
      • Security architecture
      • Shared responsibility model
    • Troubleshoot
    • Yugabyte Cloud FAQ
    • What's new
  • INTEGRATIONS
    • Apache Kafka
    • Apache Spark
    • JanusGraph
    • KairosDB
    • Presto
    • Metabase
    • WSO2 Identity Server
    • YSQL Loader
    • Yugabyte JDBC Driver
    • Prisma
    • Hasura
      • Application Development
      • Benchmarking
    • Spring Framework
      • Spring Data YugabyteDB
      • Spring Data Cassandra
    • Flyway
    • GORM
    • Liquibase
    • Sequelize
    • SQLAlchemy
    • Entity Framework
    • Django REST framework
  • REFERENCE
    • Architecture
      • Design goals
      • Key concepts
        • Universe
        • YB-TServer Service
        • YB-Master Service
      • Core functions
        • Universe creation
        • Table creation
        • Write IO path
        • Read IO path
        • High availability
      • Layered architecture
      • Query layer
        • Overview
      • DocDB transactions layer
        • Transactions overview
        • Transaction isolation levels
        • Explicit locking
        • Read Committed
        • Single-row transactions
        • Distributed transactions
        • Transactional IO path
      • DocDB sharding layer
        • Hash & range sharding
        • Tablet splitting
        • Colocated tables
      • DocDB replication layer
        • Replication
        • xCluster replication
        • Read replicas
        • Change data capture (CDC)
      • DocDB storage layer
        • Persistence
        • Performance
    • APIs
      • YSQL
        • The SQL language
          • SQL statements
            • ABORT
            • ALTER DATABASE
            • ALTER DEFAULT PRIVILEGES
            • ALTER DOMAIN
            • ALTER GROUP
            • ALTER POLICY
            • ALTER ROLE
            • ALTER SEQUENCE
            • ALTER TABLE
            • ALTER USER
            • ANALYZE
            • BEGIN
            • CALL
            • COMMENT
            • COMMIT
            • COPY
            • CREATE AGGREGATE
            • CREATE CAST
            • CREATE DATABASE
            • CREATE DOMAIN
            • CREATE EXTENSION
            • CREATE FUNCTION
            • CREATE GROUP
            • CREATE INDEX
            • CREATE MATERIALIZED VIEW
            • CREATE OPERATOR
            • CREATE OPERATOR CLASS
            • CREATE POLICY
            • CREATE PROCEDURE
            • CREATE ROLE
            • CREATE RULE
            • CREATE SCHEMA
            • CREATE SEQUENCE
            • CREATE TABLE
            • CREATE TABLE AS
            • CREATE TRIGGER
            • CREATE TYPE
            • CREATE USER
            • CREATE VIEW
            • DEALLOCATE
            • DELETE
            • DO
            • DROP AGGREGATE
            • DROP CAST
            • DROP DATABASE
            • DROP DOMAIN
            • DROP EXTENSION
            • DROP FUNCTION
            • DROP GROUP
            • DROP MATERIALIZED VIEW
            • DROP OPERATOR
            • DROP OPERATOR CLASS
            • DROP OWNED
            • DROP POLICY
            • DROP PROCEDURE
            • DROP ROLE
            • DROP RULE
            • DROP SEQUENCE
            • DROP TABLE
            • DROP TRIGGER
            • DROP TYPE
            • DROP USER
            • END
            • EXECUTE
            • EXPLAIN
            • GRANT
            • INSERT
            • LOCK
            • PREPARE
            • REASSIGN OWNED
            • REFRESH MATERIALIZED VIEW
            • RELEASE SAVEPOINT
            • RESET
            • REVOKE
            • ROLLBACK
            • ROLLBACK TO SAVEPOINT
            • SAVEPOINT
            • SELECT
            • SET
            • SET CONSTRAINTS
            • SET ROLE
            • SET SESSION AUTHORIZATION
            • SET TRANSACTION
            • SHOW
            • SHOW TRANSACTION
            • TRUNCATE
            • UPDATE
            • VALUES
          • WITH clause
            • WITH clause—SQL syntax and semantics
            • recursive CTE
            • case study—traversing an employee hierarchy
            • traversing general graphs
              • graph representation
              • common code
              • undirected cyclic graph
              • directed cyclic graph
              • directed acyclic graph
              • rooted tree
              • Unique containing paths
              • Stress testing find_paths()
            • case study—Bacon Numbers from IMDb
              • Bacon numbers for synthetic data
              • Bacon numbers for IMDb data
        • Data types
          • Array
            • array[] constructor
            • Literals
              • Text typecasting and literals
              • Array of primitive values
              • Row
              • Array of rows
            • FOREACH loop (PL/pgSQL)
            • array of DOMAINs
            • Functions and operators
              • ANY and ALL
              • Array comparison
              • Array slice operator
              • Array concatenation
              • Array properties
              • array_agg(), unnest(), generate_subscripts()
              • array_fill()
              • array_position(), array_positions()
              • array_remove()
              • array_replace() / set value
              • array_to_string()
              • string_to_array()
          • Binary
          • Boolean
          • Character
          • Date and time
            • Conceptual background
            • Timezones and UTC offsets
              • Catalog views
              • Extended_timezone_names
                • Unrestricted full projection
                • Real timezones with DST
                • Real timezones no DST
                • Synthetic timezones no DST
              • Offset/timezone-sensitive operations
                • Timestamptz to/from timestamp conversion
                • Pure 'day' interval arithmetic
              • Four ways to specify offset
                • Name-resolution rules
                  • 1 case-insensitive resolution
                  • 2 ~names.abbrev never searched
                  • 3 'set timezone' string not resolved in ~abbrevs.abbrev
                  • 4 ~abbrevs.abbrev before ~names.name
                  • Helper functions
              • Syntax contexts for offset
              • Recommended practice
            • Typecasting between date-time and text-values
            • Semantics of the date-time data types
              • Date data type
              • Time data type
              • Plain timestamp and timestamptz
              • Interval data type
                • Interval representation
                  • Ad hoc examples
                  • Representation model
                • Interval value limits
                • Declaring intervals
                • Justify() and extract(epoch...)
                • Interval arithmetic
                  • Interval-interval comparison
                  • Interval-interval addition and subtraction
                  • Interval-number multiplication
                  • Moment-moment overloads of "-"
                  • Moment-interval overloads of "+" and "-"
                • Custom interval domains
                • Interval utility functions
            • Typecasting between date-time datatypes
            • Operators
              • Test comparison overloads
              • Test addition overloads
              • Test subtraction overloads
              • Test multiplication overloads
              • Test division overloads
            • General-purpose functions
              • Creating date-time values
              • Manipulating date-time values
              • Current date-time moment
              • Delaying execution
              • Miscellaneous
                • Function age()
                • Function extract() | date_part()
                • Implementations that model the overlaps operator
            • Formatting functions
            • Case study—SQL stopwatch
            • Download & install the date-time utilities
            • ToC
          • JSON
            • JSON literals
            • Primitive and compound data types
            • Code example conventions
            • Indexes and check constraints
            • Functions & operators
              • ::jsonb, ::json, ::text (typecast)
              • ->, ->>, #>, #>> (JSON subvalues)
              • - and #- (remove)
              • || (concatenation)
              • = (equality)
              • @> and <@ (containment)
              • ? and ?| and ?& (key or value existence)
              • array_to_json()
              • jsonb_agg()
              • jsonb_array_elements()
              • jsonb_array_elements_text()
              • jsonb_array_length()
              • jsonb_build_object()
              • jsonb_build_array()
              • jsonb_each()
              • jsonb_each_text()
              • jsonb_extract_path()
              • jsonb_extract_path_text() and json_extract_path_text()
              • jsonb_object()
              • jsonb_object_agg()
              • jsonb_object_keys()
              • jsonb_populate_record()
              • jsonb_populate_recordset()
              • jsonb_pretty()
              • jsonb_set() and jsonb_insert()
              • jsonb_strip_nulls()
              • jsonb_to_record()
              • jsonb_to_recordset()
              • jsonb_typeof()
              • row_to_json()
              • to_jsonb()
          • Money
          • Numeric
          • Range
          • Serial
          • UUID
        • Functions and operators
          • Aggregate functions
            • Informal functionality overview
            • Invocation syntax and semantics
            • grouping sets, rollup, cube
            • Per function signature and purpose
              • avg(), count(), max(), min(), sum()
              • array_agg(), string_agg(), jsonb_agg(), jsonb_object_agg()
              • bit_and(), bit_or(), bool_and(), bool_or()
              • variance(), var_pop(), var_samp(), stddev(), stddev_pop(), stddev_samp()
              • linear regression
                • covar_pop(), covar_samp(), corr()
                • regr_%()
              • mode(), percentile_disc(), percentile_cont()
              • rank(), dense_rank(), percent_rank(), cume_dist()
            • case study—percentile_cont() and the "68–95–99.7" rule
            • case study—linear regression on COVID data
              • Download the COVIDcast data
              • Ingest the COVIDcast data
                • Inspect the COVIDcast data
                • Copy the .csv files to staging tables
                • Check staged data conforms to the rules
                • Join the staged data into a single table
                • SQL scripts
                  • Create cr_staging_tables()
                  • Create cr_copy_from_scripts()
                  • Create assert_assumptions_ok()
                  • Create xform_to_covidcast_fb_survey_results()
                  • ingest-the-data.sql
              • Analyze the COVIDcast data
                • symptoms vs mask-wearing by day
                • Data for scatter-plot for 21-Oct-2020
                • Scatter-plot for 21-Oct-2020
                • SQL scripts
                  • analysis-queries.sql
                  • synthetic-data.sql
          • currval()
          • lastval()
          • nextval()
          • Window functions
            • Informal functionality overview
            • Invocation syntax and semantics
            • Per function signature and purpose
              • row_number(), rank() and dense_rank()
              • percent_rank(), cume_dist() and ntile()
              • first_value(), nth_value(), last_value()
              • lag(), lead()
              • Tables for the code examples
                • table t1
                • table t2
                • table t3
                • table t4
            • case study—analyzing a normal distribution
              • Bucket allocation scheme
              • do_clean_start.sql
              • cr_show_t4.sql
              • cr_dp_views.sql
              • cr_int_views.sql
              • cr_pr_cd_equality_report.sql
              • cr_bucket_using_width_bucket.sql
              • cr_bucket_dedicated_code.sql
              • do_assert_bucket_ok
              • cr_histogram.sql
              • cr_do_ntile.sql
              • cr_do_percent_rank.sql
              • cr_do_cume_dist.sql
              • do_populate_results.sql
              • do_report_results.sql
              • do_compare_dp_results.sql
              • do_demo.sql
              • Reports
                • Histogram report
                • dp-results
                • compare-dp-results
                • int-results
          • yb_hash_code()
        • Extensions
        • Keywords
        • Reserved names
      • YCQL
        • ALTER KEYSPACE
        • ALTER ROLE
        • ALTER TABLE
        • CREATE INDEX
        • CREATE KEYSPACE
        • CREATE ROLE
        • CREATE TABLE
        • CREATE TYPE
        • DROP INDEX
        • DROP KEYSPACE
        • DROP ROLE
        • DROP TABLE
        • DROP TYPE
        • GRANT PERMISSION
        • GRANT ROLE
        • REVOKE PERMISSION
        • REVOKE ROLE
        • USE
        • INSERT
        • SELECT
        • EXPLAIN
        • UPDATE
        • DELETE
        • TRANSACTION
        • TRUNCATE
        • Simple expressions
        • Subscripted expressions
        • Function call
        • Operators
        • BLOB
        • BOOLEAN
        • Collection
        • FROZEN
        • INET
        • Integer and counter
        • Non-integer
        • TEXT
        • DATE, TIME, and TIMESTAMP
        • UUID and TIMEUUID
        • JSONB
        • Date and time
        • BATCH
    • CLIs
      • yb-ctl
      • yb-docker-ctl
      • ysqlsh
      • ycqlsh
      • yb-admin
      • yb-ts-cli
      • ysql_dump
      • ysql_dumpall
    • Configuration
      • yb-tserver
      • yb-master
      • yugabyted
      • Default ports
    • Drivers
      • Client drivers for YSQL
      • Client drivers for YCQL
    • Connectors
      • Kafka Connect YugabyteDB
    • Third party tools
      • Arctype
      • DBeaver
      • DbSchema
      • pgAdmin
      • SQL Workbench/J
      • TablePlus
      • Visual Studio Code
    • Sample datasets
      • Chinook
      • Northwind
      • PgExercises
      • SportsDB
      • Retail Analytics
  • RELEASES
    • Releases overview
      • v2.13 series (latest)
      • v2.12 series (stable)
      • v2.11 series
      • v2.9 series
      • v2.8 series
      • v2.7 series
      • v2.6 series
      • v2.5 series
      • v2.4 series
      • v2.3 series
      • v2.2 series
      • v2.1 series
      • v2.0 series
      • v1.3 series
      • v1.2 series
    • Release versioning
  • FAQ
    • Comparisons
      • Amazon Aurora
      • Google Cloud Spanner
      • CockroachDB
      • TiDB
      • Vitess
      • MongoDB
      • FoundationDB
      • Amazon DynamoDB
      • Azure Cosmos DB
      • Apache Cassandra
      • PostgreSQL
      • Redis in-memory store
      • Apache HBase
    • General FAQ
    • Operations FAQ
    • API compatibility FAQ
    • Yugabyte Platform FAQ
  • MISC
    • YEDIS
      • Quick start
      • Develop
        • Build an application
        • C#
        • C++
        • Go
        • Java
        • NodeJS
        • Python
      • API reference
        • APPEND
        • AUTH
        • CONFIG
        • CREATEDB
        • DELETEDB
        • LISTDB
        • SELECT
        • DEL
        • ECHO
        • EXISTS
        • EXPIRE
        • EXPIREAT
        • FLUSHALL
        • FLUSHDB
        • GET
        • GETRANGE
        • GETSET
        • HDEL
        • HEXISTS
        • HGET
        • HGETALL
        • HINCRBY
        • HKEYS
        • HLEN
        • HMGET
        • HMSET
        • HSET
        • HSTRLEN
        • HVALS
        • INCR
        • INCRBY
        • KEYS
        • MONITOR
        • PEXPIRE
        • PEXPIREAT
        • PTTL
        • ROLE
        • SADD
        • SCARD
        • RENAME
        • SET
        • SETEX
        • PSETEX
        • SETRANGE
        • SISMEMBER
        • SMEMBERS
        • SREM
        • STRLEN
        • ZRANGE
        • TSADD
        • TSCARD
        • TSGET
        • TSLASTN
        • TSRANGEBYTIME
        • TSREM
        • TSREVRANGEBYTIME
        • TTL
        • ZADD
        • ZCARD
        • ZRANGEBYSCORE
        • ZREM
        • ZREVRANGE
        • ZSCORE
        • PUBSUB
        • PUBLISH
        • SUBSCRIBE
        • UNSUBSCRIBE
        • PSUBSCRIBE
        • PUNSUBSCRIBE
    • Legal
      • Third party software
> Explore > Multi-region deployments >

Row-Level Geo-Partitioning

Report a doc issue Suggest new content
  • Example scenario
  • Step 1. Create tablespaces
  • Step 2. Create table with partitions
  • Step 3. Pinning user partitions specific to geographic locations
  • Step 4. Users travelling across geographic locations
  • Step 5. Running distributed transactions
  • Step 6. Adding a new geographic location
  • YSQL

Row-level geo-partitioning allows fine-grained control over pinning data in a user table (at a per-row level) to geographic locations, thereby allowing the data residency to be managed at the table-row level. Use-cases requiring low latency multi-region deployments, transactional consistency semantics and transparent schema change propagation across the regions would benefit from this feature.

Moving data closer to users

Geo-partitioning makes it easy for developers to move data closer to users for:

  • Achieving lower latency and higher performance
  • Meeting data residency requirements to comply with regulations such as GDPR

Geo-partitioning of data enables fine-grained, row-level control over the placement of table data across different geographical locations. This is accomplished in three simple steps – first, creating local transaction status tables within each region; second, partitioning a table into user-defined table partitions; and finally, pinning these partitions to the desired geographic locations by configuring metadata for each partition.

  • The first step of creating local transaction tables within each region is done by creating a new transaction table and setting its placement.
  • The second step of creating user-defined table partitions is done by designating a column of the table as the partition column that will be used to geo-partition the data. The value of this column for a given row is used to determine the table partition that the row belongs to.
  • The third step involves creating partitions in the respective geographic locations using tablespaces. Note that the data in each partition can be configured to get replicated across multiple zones in a cloud provider region, or across multiple nearby regions / datacenters.

An entirely new geographic partition can be introduced dynamically by adding a new table partition and configuring it to keep the data resident in the desired geographic location. Data in one or more of the existing geographic locations can be purged efficiently simply by dropping the necessary partitions. Users of traditional RDBMS would recognize this scheme as being close to user-defined list-based table partitions, with the ability to control the geographic location of each partition.

In this deployment, users can access their data with low latencies because the data resides on servers that are geographically close by, and the queries do not need to access data in far away geographic locations.

This tutorial explains this feature in the context of an example scenario described in the next section.

Example scenario

Let us look at this feature in the context of a use case. Say that a large but imaginary bank, Yuga Bank, wants to offer an online banking service to users in many countries by processing their deposits, withdrawals, and transfers.

The following attributes would be required in order to build such a service.

  • Transactional semantics with high availability: Consistency of data is paramount in a banking application, hence the database should be ACID compliant. Additionally, users expect the service to always be available, making high availability and resilience to failures a critical requirement.
  • High performance: The online transactions need to be processed with a low latency in order to ensure a good end-user experience. This requires that the data for a particular user is located in a nearby geographic region. Putting all the data in a single location in an RDBMS would mean the requests for users residing far away from that location would have very high latencies, leading to a poor user experience.
  • Data residency requirements for compliance: Many countries have regulations around which geographic regions the personal data of their residents can be stored in, and bank transactions being personal data are subject to these requirements. For example, GDPR has a data residency stipulation which effectively requires that the personal data of individuals in the EU be stored in the EU. Similarly, India has a requirement issued by the Reserve Bank of India (or RBI for short) making it mandatory for all banks, intermediaries, and other third parties to store all information pertaining to payments data in India – though in case of international transactions, the data on the foreign leg of the transaction can be stored in foreign locations.

Note

While this scenario has regulatory compliance requirements where data needs to be resident in certain geographies, the exact same technique applies for the goal of moving data closer to users in order to achieve low latency and high performance. Hence, high performance is listed above as a requirement.

Step 1. Create tablespaces

First, we create tablespaces for each geographic region we wish to partition data into:

  1. Create tablespaces for each region.

    CREATE TABLESPACE eu_central_1_tablespace WITH (
      replica_placement='{"num_replicas": 3, "placement_blocks":
      [{"cloud":"aws","region":"eu-central-1","zone":"eu-central-1a","min_num_replicas":1},
      {"cloud":"aws","region":"eu-central-1","zone":"eu-central-1b","min_num_replicas":1},
      {"cloud":"aws","region":"eu-central-1","zone":"eu-central-1c","min_num_replicas":1}]}'
    );
    
    CREATE TABLESPACE us_west_2_tablespace WITH (
      replica_placement='{"num_replicas": 3, "placement_blocks":
      [{"cloud":"aws","region":"us-west-2","zone":"us-west-2a","min_num_replicas":1},
      {"cloud":"aws","region":"us-west-2","zone":"us-west-2b","min_num_replicas":1},
      {"cloud":"aws","region":"us-west-2","zone":"us-west-2c","min_num_replicas":1}]}'
    );
    
    CREATE TABLESPACE ap_south_1_tablespace WITH (
      replica_placement='{"num_replicas": 3, "placement_blocks":
      [{"cloud":"aws","region":"ap-south-1","zone":"ap-south-1a","min_num_replicas":1},
      {"cloud":"aws","region":"ap-south-1","zone":"ap-south-1b","min_num_replicas":1},
      {"cloud":"aws","region":"ap-south-1","zone":"ap-south-1c","min_num_replicas":1}]}'
    );
    

Step 2. Create table with partitions

Next, we create the parent table that contains a geo_partition column which is used to create list-based partitions for each geographic region we want to partition data into as shown in the following diagram:

Row-level geo-partitioning

  1. Create the parent table.

    CREATE TABLE bank_transactions (
        user_id   INTEGER NOT NULL,
        account_id INTEGER NOT NULL,
        geo_partition VARCHAR,
        account_type VARCHAR NOT NULL,
        amount NUMERIC NOT NULL,
        txn_type VARCHAR NOT NULL,
        created_at TIMESTAMP DEFAULT NOW()
    ) PARTITION BY LIST (geo_partition);
    
  2. Next, create one partition per desired geography under the parent table, and assign each to the applicable tablespace. Here, you create three table partitions: one for the EU region called bank_transactions_eu, another for the India region called bank_transactions_india, and a third partition for US region called bank_transactions_us.

    CREATE TABLE bank_transactions_eu
        PARTITION OF bank_transactions
          (user_id, account_id, geo_partition, account_type,
          amount, txn_type, created_at,
          PRIMARY KEY (user_id HASH, account_id, geo_partition))
        FOR VALUES IN ('EU') TABLESPACE eu_central_1_tablespace;
    
    CREATE TABLE bank_transactions_india
        PARTITION OF bank_transactions
          (user_id, account_id, geo_partition, account_type,
          amount, txn_type, created_at,
          PRIMARY KEY (user_id HASH, account_id, geo_partition))
        FOR VALUES IN ('India') TABLESPACE ap_south_1_tablespace;
    
    CREATE TABLE bank_transactions_us
        PARTITION OF bank_transactions
          (user_id, account_id, geo_partition, account_type,
          amount, txn_type, created_at,
          PRIMARY KEY (user_id HASH, account_id, geo_partition))
        FOR VALUES IN ('US') TABLESPACE us_west_2_tablespace;
    
  3. Use the \d command to view the table and partitions you've created so far.

    yugabyte=# \d
    
                    List of relations
     Schema |         Name                      | Type  |  Owner
    --------+-----------------------------------+-------+----------
     public | bank_transactions         | table | yugabyte
     public | bank_transactions_eu      | table | yugabyte
     public | bank_transactions_india   | table | yugabyte
     public | bank_transactions_us      | table | yugabyte
    (4 rows)
    

The data is now arranged as follows:

Row-level geo-partitioning

Region-local transaction table

When you create tables using a tablespace with a placement set, YugabyteDB automatically creates a transaction table under the tablespace if one doesn't yet exist, with a name like system.transactions_90141438-f42c-4a39-8a12-4072c1216d46.

Step 3. Pinning user partitions specific to geographic locations

Now, the setup should automatically be able to pin rows to the appropriate regions based on the value set in the geo_partition column. This is shown in the following diagram:

Row-level geo-partitioning

You can test this region pinning by inserting a few rows of data and verifying they are written to the correct partitions.

Expanded output display

The sample output includes expanded auto mode output formatting for better readability. You can enable this mode with the following statement:

yugabyte=# \x auto
Expanded display is used automatically.
  1. Insert a row into the table with the geo_partition column value set to EU below.

    INSERT INTO bank_transactions
        VALUES (100, 10001, 'EU', 'checking', 120.50, 'debit');
    
  2. Verify that the row is present in the bank_transactions table.

    yugabyte=# select * from bank_transactions;
    
    -[ RECORD 1 ]-+---------------------------
    user_id       | 100
    account_id    | 10001
    geo_partition | EU
    account_type  | checking
    amount        | 120.5
    txn_type      | debit
    created_at    | 2020-11-07 21:28:11.056236
    

Additionally, the row must be present only in the bank_transactions_eu partition, which can be easily verified by running the select statement directly against that partition. The other partitions should contain no rows.

yugabyte=# select * from bank_transactions_eu;
-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | EU
account_type  | checking
amount        | 120.5
txn_type      | debit
created_at    | 2020-11-07 21:28:11.056236
yugabyte=# select count(*) from bank_transactions_india;
 count
-------
     0
yugabyte=# select count(*) from bank_transactions_us;
 count
-------
     0

Now, let's insert data into the other partitions.

INSERT INTO bank_transactions
    VALUES (200, 20001, 'India', 'savings', 1000, 'credit');
INSERT INTO bank_transactions
    VALUES (300, 30001, 'US', 'checking', 105.25, 'debit');

These can be verified as follows:

yugabyte=# select * from bank_transactions_india;
-[ RECORD 1 ]-+---------------------------
user_id       | 200
account_id    | 20001
geo_partition | India
account_type  | savings
amount        | 1000
txn_type      | credit
created_at    | 2020-11-07 21:45:26.011636
yugabyte=# select * from bank_transactions_us;
-[ RECORD 1 ]-+---------------------------
user_id       | 300
account_id    | 30001
geo_partition | US
account_type  | checking
amount        | 105.25
txn_type      | debit
created_at    | 2020-11-07 21:45:26.067444

Step 4. Users travelling across geographic locations

In order to make things interesting, let us say user 100, whose first bank transaction was performed in the EU region travels to India and the US, and performs two other bank transactions. This can be simulated by using the following statements.

INSERT INTO bank_transactions
    VALUES (100, 10001, 'India', 'savings', 2000, 'credit');
INSERT INTO bank_transactions
    VALUES (100, 10001, 'US', 'checking', 105, 'debit');

Now, each of the bank transactions would be pinned to the appropriate geographic locations. This can be verified as follows.

yugabyte=# select * from bank_transactions_india where user_id=100;
-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | India
account_type  | savings
amount        | 2000
txn_type      | credit
created_at    | 2020-11-07 21:56:26.760253
yugabyte=# select * from bank_transactions_us where user_id=100;
-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | US
account_type  | checking
amount        | 105
txn_type      | debit
created_at    | 2020-11-07 21:56:26.794173

All the bank transactions made by the user can efficiently be retrieved using the following SQL statement.

yugabyte=# select * from bank_transactions where user_id=100 order by created_at desc;
-[ RECORD 1 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | US
account_type  | checking
amount        | 105
txn_type      | debit
created_at    | 2020-11-07 21:56:26.794173
-[ RECORD 2 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | India
account_type  | savings
amount        | 2000
txn_type      | credit
created_at    | 2020-11-07 21:56:26.760253
-[ RECORD 3 ]-+---------------------------
user_id       | 100
account_id    | 10001
geo_partition | EU
account_type  | checking
amount        | 120.5
txn_type      | debit
created_at    | 2020-11-07 21:28:11.056236

Step 5. Running distributed transactions

So far, we have only been running SELECT and single-row transactions. Geo-partitioning introduces a new complication for general distributed transactions.

Let's say we want to run the following transaction:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.00, 'debit');
INSERT INTO bank_transactions VALUES (100, 10003, 'EU', 'checking', 400.00, 'credit');
COMMIT;

If we attempt to run this while connected to a node in us-west-2, we get an error:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.00, 'debit');
ERROR:  Illegal state: Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64: . Errors from tablet servers: [Illegal state (yb/client/transaction.cc:284): Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64]

Because we have a tablespace with placement set to the us-west-2 region, YugabyteDB assumes by default that we want to run a transaction local to that region (using the automatically-generated transaction status table for the us_west_2_tablespace created in Step 1), but such a transaction cannot modify data outside of us-west-2.

However, if we instead connect to a node in eu-central-1 and run the exact same transaction, we are now able to finish and commit the transaction without error:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10002, 'EU', 'checking', 400.0, 'debit');
INSERT 1 0
INSERT INTO bank_transactions VALUES (100, 10003, 'EU', 'checking', 400.0, 'credit');
INSERT 1 0
COMMIT;
COMMIT

Sometimes though, we might want to run a transaction that writes data to multiple regions, for example:

BEGIN;
INSERT INTO bank_transactions VALUES (100, 10004, 'US', 'checking', 400.00, 'debit');
INSERT INTO bank_transactions VALUES (200, 10005, 'EU', 'checking', 400.00, 'credit');
COMMIT;

Running this transaction will fail whether we run it from us-west-2 or eu-central-1. The solution is to mark the transaction as a global transaction:

SET force_global_transaction = TRUE;
BEGIN;
INSERT INTO bank_transactions VALUES (100, 10004, 'US', 'checking', 400.00, 'debit');
INSERT 1 0
INSERT INTO bank_transactions VALUES (200, 10005, 'EU', 'checking', 400.00, 'credit');
INSERT 1 0
COMMIT;
COMMIT

Setting force_global_transaction = TRUE tells YugabyteDB to use the system.transactions transaction table instead, which is presumed to be globally replicated, and lets us run distributed transactions that span multiple regions.

Global transaction latency

Only force global transactions when necessary. All distributed transactions can run without problems under force_global_transaction = TRUE, but you may have significantly higher latency when committing the transaction, because YugabyteDB must achieve consensus across multiple regions to write to system.transactions. Whenever possible, use the default setting of force_global_transaction = FALSE.

Finally, let's say we want to delete the row we just inserted. If we run the following query connected to eu-central-1 as a local transaction, the query once again errors out:

SET force_global_transaction = FALSE;
DELETE FROM bank_transactions WHERE user_id = 200 AND account_id = 10005;
ERROR:  Illegal state: Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64: . Errors from tablet servers: [Illegal state (yb/client/transaction.cc:284): Nonlocal tablet accessed in local transaction: tablet c5a611afd571455e80450bd553a24a64]

We are attempting to delete from the main table (bank_transactions rather than bank_transactions_eu_west_1) and not specifying the partition column (there's no geo_partition = 'EU' clause). This means that YugabyteDB is unable to tell that the row being deleted is in fact located in eu-central-1. To fix this, we could instead run:

DELETE FROM bank_transactions_eu_west_1 WHERE user_id = 200 AND account_id = 10005;
DELETE 1

Step 6. Adding a new geographic location

Assume that after a while, our fictitious Yuga Bank gets a lot of customers across the globe, and wants to offer the service to residents of Brazil, which also has data residency laws. Thanks to row-level geo-partitioning, this can be accomplished easily. We can simply add a new partition and pin it to the AWS South America (São Paulo) region sa-east-1 as shown below.

First, create the tablespace:

CREATE TABLESPACE sa_east_1_tablespace WITH (
    replica_placement='{"num_replicas": 3, "placement_blocks":
      [{"cloud":"aws","region":"sa-east-1","zone":"sa-east-1a","min_num_replicas":1},
      {"cloud":"aws","region":"sa-east-1","zone":"sa-east-1b","min_num_replicas":1},
      {"cloud":"aws","region":"sa-east-1","zone":"sa-east-1c","min_num_replicas":1}]}'
    );

Then, create the partition for Brazil:

CREATE TABLE bank_transactions_brazil
    PARTITION OF bank_transactions
      (user_id, account_id, geo_partition, account_type,
       amount, txn_type, created_at,
       PRIMARY KEY (user_id HASH, account_id, geo_partition))
    FOR VALUES IN ('Brazil') TABLESPACE sa_east_1_tablespace;

And with that, the new region is ready to store bank transactions of the residents of Brazil.

INSERT INTO bank_transactions
    VALUES (400, 40001, 'Brazil', 'savings', 1000, 'credit');

select * from bank_transactions_brazil;
-[ RECORD 1 ]-+-------------------------
user_id       | 400
account_id    | 40001
geo_partition | Brazil
account_type  | savings
amount        | 1000
txn_type      | credit
created_at    | 2020-11-07 22:09:04.8537
  • Example scenario
  • Step 1. Create tablespaces
  • Step 2. Create table with partitions
  • Step 3. Pinning user partitions specific to geographic locations
  • Step 4. Users travelling across geographic locations
  • Step 5. Running distributed transactions
  • Step 6. Adding a new geographic location
Ask our community
  • Slack
  • Github
  • Forum
  • StackOverflow
Yugabyte
Contact Us
Copyright © 2017-2022 Yugabyte, Inc. All rights reserved.