Data sources like streams and tables have an associated schema. This schema defines the columns available in the data, just like a the columns in a traditional SQL database table.
Key vs Value columns¶
KsqlDB supports both key and value columns. These map to the data held in the keys and values of the underlying Kafka topic.
Key columns have a
KEY namespace suffix. Key columns have the following restrictions:
* The can only be a single key column, currently.
* The key column must be named
ROWKEY in the KSQL schema.
Value columns have no namespace suffix. There can be one or more value columns amd the value columns can have any name.
For example, the following declares a schema with a single
INT key column and several value
ROWKEY INT KEY, ID BIGINT, STRING NAME, ADDRESS ADDRESS_TYPE
Column and field names must be valid identifiers.
Unquoted identifiers will be treated as upper-case, for example
col0 is equivalent to
must contain only alpha-numeric and underscore characters.
Identifiers containing invalid character, or where case needs to be preserved, can be quoted using
back-tick quotes, for example
SQL data types¶
The following SQL types are supported by ksqlDB:
Supported primitive types are:
BOOLEAN: a binary value
INT: 32-bit signed integer
BIGINT: 64-bit signed integer
DOUBLE: double precision (64-bit) IEEE 754 floating-point number
STRING: a unicode character sequence (UTF8)
DECIMAL type can store numbers with a very large number of digits and perform calculations exactly.
It is recommended for storing monetary amounts and other quantities where exactness is required.
However, arithmetic on decimals is slow compared to integer and floating point types.
DECIMAL types have a precision and scale.
The scale is the number of digits in the fractional part, to the right of the decimal point.
The precision is the total number of significant digits in the whole number, that is,
the number of digits on both sides of the decimal point.
For example, the number
765.937500 has a precision of 9 and a scale of 6.
To declare a column of type
DECIMAL use the syntax:
The precision must be positive, the scale zero or positive.
ARRAY type defines a variable-length array of elements. All elements in the array must be of
the same type.
To declare an
ARRAY use the syntax:
The element-type of an another SQL data type.
For example, the following creates an array of
Instances of an array can be created using the syntax:
ARRAY[value [, value]*]
For example, the following creates an array with three
ARRAY[2, 4, 6]
MAP type defines a variable-length collection of key-value pairs. All keys in the map must be
of the same type. All values in the map must be of the same type.
To declare a
MAP use the syntax:
The key-type must currently be
STRING while the value-type can an any other SQL data type.
For example, the following creates a map with
STRING keys and values:
Instances of a map can be created using the syntax:
MAP(key := value [, key := value]*)
For example, the following creates a map with three key-value pairs:
MAP('a' := 1, 'b' := 2, 'c' := 3)
STRUCT type defines a list of named fields, where each field can have any SQL data type.
To declare a
STRUCT use the syntax:
STRUCT<field-name field-type [, field-name field-type]*>
For example, the following creates a struct with an
INT field called
FOO and a
STRUCT<FOO INT, BAR BOOLEAN>
Instances of a struct can be created using the syntax:
STRUCT(field-name := field-value [, field-name := field-value]*)
For example, the following creates a struct with fields called
BAR and sets their values
STRUCT('FOO' := 10, 'BAR' := true)
KsqlDB supports custom types using the
CREATE TYPE statements.
CREATE TYPE docs for more information.
- schema evolution with ksqlDB and Avro