SELECT (Pull Query)
Synopsis¶
1 2 3 4 5 |
|
Description¶
Pulls the current value from the from_item
and terminates. The from_item
can be a
materialized view, a table, or a stream. The result
of this statement is not persisted in a Kafka topic and is printed out
only in the console. Pull queries run with predictably low latency because
materialized views are incrementally updated as new events arrive.
They are a great match for request/response flows. For asynchronous application flows, see
Push Queries.
See pull queries in action
You can execute a pull query by sending an HTTP request to the ksqlDB REST API, and the API responds with a single response.
- Pull queries are expressed using a strict subset of ANSI SQL.
- You can issue a pull query against any table that was created by a CREATE TABLE AS SELECT statement.
- You can issue a pull query against any stream.
- Currently, we do not support pull queries against tables created by using a CREATE TABLE statement.
- Pull queries do not support
JOIN
,PARTITION BY
,GROUP BY
andWINDOW
clauses (but can query materialized tables that contain those clauses). LIMIT
clause supports non-negative integers.
Important
ksqlDB may experience a deadlock if you run multiple pull queries concurrently.
If you experience hanging pull queries, the following mitigations may help.
- Rewrite your persistent queries or add a new persistent query to enable efficient pull queries, ideally key lookups, but at least short-range scans, or decrease state store size.
- Rewrite your table scans as individual key lookups.
- Continue issuing table scans, but issue fewer of them at once.
- Adjust your pull query load to avoid sharp spikes in the pull query request rate all at once.
- Implement timeouts from your client so that if pull queries take too long to run, your client terminates the connection to free up resources and unblock other queries.
WHERE
Clause Guidelines¶
By default, only key lookups are enabled. They have the following requirements:
- Key column(s) must use an equality comparison to a literal (e.g.
KEY = 'abc'
). - On windowed tables,
WINDOWSTART
andWINDOWEND
can be optionally compared to literals. For more information on windowed tables, see Time and Windows in ksqlDB.
You can loosen the restrictions on the WHERE
clause, or eliminate the WHERE
clause altogether,
by enabling table scans in your current CLI session with the command SET 'ksql.query.pull.table.scan.enabled'='true';
.
Table scans can also be enabled by default by setting a server configuration property with
ksql.query.pull.table.scan.enabled=true
. Once table scans are enabled, the following additional expressions are allowed:
- Key column(s) using range comparisons to literals.
- Non-key columns to be used alone, without key references.
- Columns to be compared to other columns.
- References to subsets of columns from a multi-column key.
- Complex expressions without direct column references using UDFs and function calls (e.g.
instr(NAME_COL, 'hello') > 0
).
Note
Table scan based queries are just the next incremental step for ksqlDB pull queries. In future releases, we will continue pushing the envelope of new query capabilities and greater performance and efficiency.
Examples¶
Pull queries¶
The following examples show pull queries against a table named TOP_TEN_RANKS
created by using a CREATE TABLE AS SELECT
statement.
First, create a table named GRADES
by using a CREATE TABLE
statement:
1 2 |
|
Then, create a derived table named TOP_TEN_RANKS
by using a
CREATE TABLE AS SELECT statement:
1 2 3 4 |
|
If you want to look up only the student with ID = 5
in the TOP_TEN_RANKS
table using a pull query:
1 2 |
|
After enabling table scans, you can fetch the current state of your TOP_TEN_RANKS
table using a pull query:
1 |
|
If you want to look up the students whose ranks lie in the range (4, 8)
:
1 2 |
|
STREAM¶
Pull queries against a stream.
First, create a stream named STUDENTS
by using a CREATE STREAM statement:
1 2 |
|
If you want to look up students with ranks greater than 5
you can issue the query:
1 2 |
|
INNER JOIN¶
Pull queries against a table INNER_JOIN
that is created by joining multiple tables:
1 2 |
|
1 2 |
|
1 |
|
You can fetch the current state of your table INNER_JOIN
by using a pull query:
1 |
|
See INNER_JOIN in action
WINDOW¶
Pull queries against a windowed table NUMBER_OF_TESTS
created by aggregating the stream STUDENTS
created above:
1 2 3 4 5 |
|
Look up the number of tests taken by a student with ID='10'
:
1 2 3 |
|
Look up the number of tests taken by a student with ID='10'
in the window range 100 <= WindowStart AND WindowEnd <= 16000
:
1 2 3 |
|
STRUCT output¶
You can output a struct from a query
by using a SELECT statement. The following example creates a struct from a
stream named s1
.
1 |
|
Pull queries with pseudo columns¶
You can use the ROWTIME
pseudo column within pull queries. Below
is an example of issuing a pull query with ROWTIME
in both the
SELECT
and WHERE
clauses.
1 |
|
However, this is disallowed for ROWPARTITION
and ROWOFFSET
.