# Clickhouse

> Clickhouse sink connector
## Description

Used to write data to ClickHouse.
## Key features

The ClickHouse sink connector can achieve exactly-once semantics through idempotent writes, which requires pairing it with AggregatingMergeTree or another table engine that supports deduplication.
## Options

| name | type | required | default value |
|---|---|---|---|
| host | string | yes | - |
| database | string | yes | - |
| table | string | yes | - |
| username | string | yes | - |
| password | string | yes | - |
| clickhouse.config | map | no | - |
| bulk_size | number | no | 20000 |
| split_mode | boolean | no | false |
| sharding_key | string | no | - |
| primary_key | string | no | - |
| support_upsert | boolean | no | false |
| allow_experimental_lightweight_delete | boolean | no | false |
| common-options | | no | - |
### host [string]

The ClickHouse cluster address in `host:port` format. Multiple hosts can be specified, such as `"host1:8123,host2:8123"`.
### database [string]

The ClickHouse database name.
### table [string]

The table name.
### username [string]

The ClickHouse username.
### password [string]

The ClickHouse user password.
### clickhouse.config [map]

In addition to the mandatory parameters above, users can pass additional optional parameters, covering all of the parameters supported by `clickhouse-jdbc`.
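For example, ClickHouse server settings and driver properties can be forwarded through `clickhouse.config`. A minimal sketch, assuming `socket_timeout` and `max_execution_time` are property names accepted by your `clickhouse-jdbc` version:

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    # Extra settings forwarded to clickhouse-jdbc; the key names here are illustrative,
    # check the clickhouse-jdbc documentation for your driver version.
    clickhouse.config = {
      socket_timeout = "300000"
      max_execution_time = "60"
    }
  }
}
```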
### bulk_size [number]

The number of rows written per batch through `clickhouse-jdbc`; the default is `20000`.
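As a sketch, a larger batch size can be set directly in the sink block; the value below is only an illustration, not a recommendation:

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    # Flush every 100000 rows instead of the default 20000 (illustrative value)
    bulk_size = 100000
  }
}
```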
### split_mode [boolean]

This mode is only supported for ClickHouse tables that use the `Distributed` engine, and the `internal_replication` option must be `true`. SeaTunnel splits the distributed table data and writes directly to each shard; shard weights defined in ClickHouse are taken into account. See the Split mode example below.
### sharding_key [string]

When `split_mode` is enabled, the connector must decide which node each row is sent to. By default the node is chosen at random, but `sharding_key` can specify the field used by the sharding algorithm. This option only takes effect when `split_mode` is `true`.
### primary_key [string]

Marks the primary key column of the ClickHouse table; INSERT/UPDATE/DELETE statements are executed against the table based on this key (see the CDC examples below).
### support_upsert [boolean]

Enables upserts: rows are updated or inserted by querying the primary key.
### allow_experimental_lightweight_delete [boolean]

Allows experimental lightweight deletes on `*MergeTree` table engines.
### common options

Sink plugin common parameters; please refer to Sink Common Options for details.
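As an illustration, a common option such as `source_table_name` (the option name is assumed from the Sink Common Options page; verify it for your SeaTunnel version) sits alongside the connector-specific options:

```hocon
sink {
  Clickhouse {
    # Common sink option (assumed name, see Sink Common Options)
    source_table_name = "fake"
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
  }
}
```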
## Examples

### Simple

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    clickhouse.config = {
      max_rows_to_read = "100"
      read_overflow_mode = "throw"
    }
  }
}
```
### Split mode

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    # split mode options
    split_mode = true
    sharding_key = "age"
  }
}
```
### CDC (Change data capture)

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    # cdc options
    primary_key = "id"
    support_upsert = true
  }
}
```
### CDC (Change data capture) for *MergeTree engine

```hocon
sink {
  Clickhouse {
    host = "localhost:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    # cdc options
    primary_key = "id"
    support_upsert = true
    allow_experimental_lightweight_delete = true
  }
}
```
## Changelog

### 2.2.0-beta 2022-09-26

- Add ClickHouse Sink Connector

### 2.3.0-beta 2022-10-20

- [Improve] Clickhouse Support Int128,Int256 Type (3067)