High availability¶
When you run pull queries, it’s often the case that you need your data to remain available for querying even if one server fails. Because ksqlDB supports clustering, it can remain highly available to support pull queries on replicas of your data, even in the face of partial cluster failures.
High availability is turned off by default, but you can enable it with the following server configuration parameters. These parameters must be turned on for all nodes in your ksqlDB cluster.
- Set
ksql.streams.num.standby.replicas
to a value greater than0
. - Set
ksql.query.pull.enable.standby.reads
totrue
. - Set
ksql.heartbeat.enable
totrue
. - Set
ksql.lag.reporting.enable
totrue
.
In addition, make sure that all of your nodes identify as part of the same
cluster by setting ksql.service.id
to the same value.
Note
In Confluent Cloud, ksqlDB clusters with 8 or 12 CSUs are automatically configured for high availability. High availability can't be enabled for Confluent Cloud clusters with fewer than 8 CSUs.
Controlling consistency¶
Because ksqlDB replicates data between its servers asynchronously, you may want
to bound the potential staleness that your query will tolerate. You can control
this per pull query by using the
ksql.query.pull.max.allowed.offset.lag
parameter. For instance, a value of 10,000
means that results of pull queries
forwarded to servers whose current offset is more than 10,000
positions
behind the end offset of the changelog topic are rejected.
Compatability with Authentication¶
Set the authentication.skip.paths
config with both /lag
and /heartbeat
.
This enables ksqlDB cluster instances to communicate without authenticating between each other.
This configuration prevents the following error.
1 2 |
|
For security reasons, this configuration is recommended to block the traffic from outside your cluster to those endpoints.