Presto
Presto is a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. It has a connector architecture to query data from many data sources.
This document describes how to set up Presto to query YugabyteDB's YCQL tables.
1. Start local cluster
Follow the Quick start instructions to run a local YugabyteDB cluster. Test YugabyteDB's Cassandra-compatible API as documented there, so that you can confirm that a Cassandra-compatible service is running on localhost:9042. Ensure that you have created the keyspace and table, and inserted the sample data, as described there.
2. Download and configure Presto
Detailed steps are in the Presto deployment documentation. The following are the minimal setup steps for getting started:
$ wget https://repo1.maven.org/maven2/io/prestosql/presto-server/309/presto-server-309.tar.gz
$ tar xvf presto-server-309.tar.gz
$ cd presto-server-309
Create the etc, etc/catalog, and data directories inside the installation directory:
$ mkdir etc
$ mkdir etc/catalog
$ mkdir data
Create the node.properties file, replacing <username> below with your own user name:
$ cat > etc/node.properties
node.environment=test
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/Users/<username>/presto-server-309/data
Press Ctrl-D after you have pasted the file contents.
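As an alternative to pasting and pressing Ctrl-D, the same file can be written with a here-document. This is a minimal sketch, not part of the official steps: PRESTO_HOME is a hypothetical variable introduced here for illustration, and it defaults to the current directory (assumed to be presto-server-309), so the data path no longer needs a hand-edited <username>.

```shell
# Sketch: write etc/node.properties non-interactively.
# PRESTO_HOME is a hypothetical variable; it defaults to the current directory.
PRESTO_HOME="${PRESTO_HOME:-$(pwd)}"

# Create the directories if they do not exist yet.
mkdir -p "$PRESTO_HOME/etc" "$PRESTO_HOME/data"

# Unquoted EOF so that $PRESTO_HOME is expanded into the file.
cat > "$PRESTO_HOME/etc/node.properties" <<EOF
node.environment=test
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=$PRESTO_HOME/data
EOF
```

The other files in etc can be written the same way; quote the delimiter (<<'EOF') when the contents should be taken literally.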
Create the jvm.config file:
$ cat > etc/jvm.config
-server
-Xmx6G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
Press Ctrl-D after you have pasted the file contents.
Create the config.properties file:
$ cat > etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=4GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://localhost:8080
Press Ctrl-D after you have pasted the file contents.
Create the log.properties file:
$ cat > etc/log.properties
io.prestosql=INFO
Press Ctrl-D after you have pasted the file contents.
Configure the Cassandra connector for YugabyteDB
Create the Cassandra catalog properties file in the etc/catalog directory. Detailed instructions are in the Presto Cassandra connector documentation.
$ cat > etc/catalog/cassandra.properties
connector.name=cassandra
cassandra.contact-points=127.0.0.1
Press Ctrl-D after you have pasted the file contents.
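The catalog file above relies on the connector's default port. A hedged variant that states the YCQL port explicitly is sketched below; cassandra.native-protocol-port is a Presto Cassandra connector property whose default is 9042, so the extra line only documents the port confirmed in step 1. CATALOG_DIR is a hypothetical variable used for illustration.

```shell
# Sketch: write the Cassandra catalog file with the YCQL port spelled out.
# CATALOG_DIR is a hypothetical variable; it defaults to etc/catalog
# relative to the Presto installation directory.
CATALOG_DIR="${CATALOG_DIR:-etc/catalog}"
mkdir -p "$CATALOG_DIR"

# Quoted 'EOF' so the contents are written literally.
cat > "$CATALOG_DIR/cassandra.properties" <<'EOF'
connector.name=cassandra
cassandra.contact-points=127.0.0.1
cassandra.native-protocol-port=9042
EOF
```

If the YugabyteDB cluster listens on a different YCQL port, change the last line to match.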
3. Download Presto CLI
$ cd ~/presto-server-309/bin
$ wget https://repo1.maven.org/maven2/io/prestosql/presto-cli/309/presto-cli-309-executable.jar
Rename the JAR file to presto and make it executable. It is meant to be a self-running binary.
$ mv presto-cli-309-executable.jar presto && chmod +x presto
4. Launch Presto server
$ cd ~/presto-server-309
To run in foreground mode:
$ ./bin/launcher run
To run in background mode:
$ ./bin/launcher start
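In background mode the launcher returns before the server is ready to accept queries. The helper below is one possible way to wait for it, assuming curl is installed and the HTTP port 8080 from config.properties; wait_for is a hypothetical function name, not part of Presto. An HTTP response from /v1/info only shows that the server is listening, so treat it as a rough readiness signal.

```shell
# wait_for TRIES DELAY CMD...
# Rerun CMD until it succeeds or TRIES attempts have been used.
# Returns 0 on success and 1 if every attempt failed.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example use (left commented out because it needs a running server):
# wait_for 30 2 curl -sf http://localhost:8080/v1/info && echo "Presto is up"
```

The launcher also provides ./bin/launcher status to check whether the process is running.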
5. Test Presto queries
Use the presto CLI to run ad-hoc queries:
$ ./bin/presto --server localhost:8080 --catalog cassandra --schema default
Switch to the myapp keyspace, which Presto exposes as a schema:
presto:default> use myapp;
USE
Show the tables available:
presto:myapp> show tables;
Table
-------
stock_market
(1 row)
Describe a particular table:
presto:myapp> describe stock_market;
    Column     |  Type   | Extra | Comment
---------------+---------+-------+---------
 stock_symbol  | varchar |       |
 ts            | varchar |       |
 current_price | real    |       |
(3 rows)
Query with a filter:
presto:myapp> select * from stock_market where stock_symbol = 'AAPL';
 stock_symbol |         ts          | current_price
--------------+---------------------+---------------
 AAPL         | 2017-10-26 09:00:00 |        157.41
 AAPL         | 2017-10-26 10:00:00 |         157.0
(2 rows)
Query with aggregates:
presto:myapp> select stock_symbol, avg(current_price) from stock_market group by stock_symbol;
 stock_symbol |  _col1
--------------+---------
 GOOG         | 972.235
 AAPL         | 157.205
 FB           | 170.365
(3 rows)
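The same queries can also be run without the interactive prompt, which is convenient in scripts. The sketch below wraps the CLI's --execute and --output-format options in a small function; run_query and PRESTO_BIN are hypothetical names introduced for illustration, and the server address, catalog, and schema are the ones used in this document.

```shell
# Sketch: run a single query non-interactively and print the result as CSV.
# PRESTO_BIN is a hypothetical variable pointing at the CLI binary from step 3.
PRESTO_BIN="${PRESTO_BIN:-./bin/presto}"

# run_query SQL: execute one statement against the myapp schema.
run_query() {
  "$PRESTO_BIN" --server localhost:8080 \
    --catalog cassandra --schema myapp \
    --output-format CSV \
    --execute "$1"
}

# Example use (left commented out because it needs a running server):
# run_query "select stock_symbol, avg(current_price) from stock_market group by stock_symbol"
```

Because the schema is passed on the command line, the statement does not need a preceding use myapp.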