Clickhouse
Output plugin : Clickhouse
- Author: InterestingLab
- Homepage: https://interestinglab.github.io/seatunnel-docs
- Version: 1.1.0
Description
Writes rows to ClickHouse via clickhouse-jdbc. The target table must be created in advance.
Options
name | type | required | default value |
---|---|---|---|
bulk_size | number | no | 20000 |
clickhouse.* | string | no | - |
cluster | string | no | - |
database | string | yes | - |
fields | list | yes | - |
host | string | yes | - |
password | string | no | - |
table | string | yes | - |
username | string | no | - |
bulk_size [number]
Number of rows written to ClickHouse in each batch through clickhouse-jdbc. Defaults to 20000.
database [string]
ClickHouse database.
fields [list]
List of fields to write to ClickHouse.
host [string]
ClickHouse host, in the format hostname:port.
cluster [string]
Name of the ClickHouse cluster that the table belongs to; see Distributed.
password [string]
ClickHouse password. Only needed when authentication is enabled in ClickHouse.
table [string]
ClickHouse table name.
username [string]
ClickHouse username. Only needed when authentication is enabled in ClickHouse.
clickhouse.* [string]
In addition to the parameters above, you can pass any of the parameters described in the clickhouse-jdbc settings.
To do so, add the prefix "clickhouse." to the parameter name. For example, socket_timeout is specified as: clickhouse.socket_timeout = 50000.
Parameters that are not specified take the clickhouse-jdbc default values.
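The prefix convention can be pictured with a short Python sketch. This is only an illustration of the idea, not the plugin's actual code: the function name `split_jdbc_options` and the dictionary representation of the config are assumptions made for the example.

```python
from typing import Dict, Tuple

PREFIX = "clickhouse."  # prefix described above


def split_jdbc_options(config: Dict[str, object]) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Separate plain plugin options from prefixed clickhouse-jdbc settings.

    Hypothetical helper: it only illustrates how a key such as
    "clickhouse.socket_timeout" could map to the JDBC property "socket_timeout".
    """
    plugin_opts: Dict[str, object] = {}
    jdbc_props: Dict[str, str] = {}
    for key, value in config.items():
        if key.startswith(PREFIX):
            # Strip the prefix; JDBC properties are conventionally strings.
            jdbc_props[key[len(PREFIX):]] = str(value)
        else:
            plugin_opts[key] = value
    return plugin_opts, jdbc_props


config = {
    "host": "localhost:8123",
    "database": "nginx",
    "clickhouse.socket_timeout": 50000,
}
plugin_opts, jdbc_props = split_jdbc_options(config)
print(jdbc_props)  # {'socket_timeout': '50000'}
```

Any setting left out of `jdbc_props` would simply fall back to the driver's default.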
ClickHouse Data Type Check List
ClickHouse Data Type | Convert Plugin Target Type | SQL Expression | Description |
---|---|---|---|
Date | string | string() | Format of yyyy-MM-dd |
DateTime | string | string() | Format of yyyy-MM-dd HH:mm:ss |
String | string | string() | |
Int8 | integer | int() | |
UInt8 | integer | int() | |
Int16 | integer | int() | |
UInt16 | integer | int() | |
Int32 | integer | int() | |
UInt32 | long | bigint() | |
Int64 | long | bigint() | |
UInt64 | long | bigint() | |
Float32 | float | float() | |
Float64 | double | double() | |
Array(T) | - | - | |
Nullable(T) | depends on T | depends on T | |
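As a rough aid for checking a pipeline against the table above, the mapping can be written down as a lookup, together with the two string formats expected for Date and DateTime. This is a minimal Python sketch; the names `TARGET_TYPES`, `target_type`, `to_date_string` and `to_datetime_string` are hypothetical and not part of the plugin.

```python
from datetime import datetime

# ClickHouse type -> Convert plugin target type, copied from the table above.
TARGET_TYPES = {
    "Date": "string", "DateTime": "string", "String": "string",
    "Int8": "integer", "UInt8": "integer",
    "Int16": "integer", "UInt16": "integer", "Int32": "integer",
    "UInt32": "long", "Int64": "long", "UInt64": "long",
    "Float32": "float", "Float64": "double",
}


def target_type(ch_type):
    """Return the target type for a ClickHouse type, or None if the table lists none."""
    if ch_type.startswith("Nullable(") and ch_type.endswith(")"):
        # Nullable(T) depends on T, so unwrap and look up the inner type.
        return target_type(ch_type[len("Nullable("):-1])
    return TARGET_TYPES.get(ch_type)  # None for Array(T) and unlisted types


def to_date_string(ts):
    return ts.strftime("%Y-%m-%d")  # yyyy-MM-dd


def to_datetime_string(ts):
    return ts.strftime("%Y-%m-%d %H:%M:%S")  # yyyy-MM-dd HH:mm:ss


ts = datetime(2018, 3, 9, 14, 5, 7)
print(target_type("Nullable(Int32)"))  # integer
print(to_date_string(ts))              # 2018-03-09
print(to_datetime_string(ts))          # 2018-03-09 14:05:07
```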
Examples
clickhouse {
    host = "localhost:8123"
    clickhouse.socket_timeout = 50000
    database = "nginx"
    table = "access_msg"
    fields = ["date", "datetime", "hostname", "http_code", "data_size", "ua", "request_time"]
    username = "username"
    password = "password"
    bulk_size = 20000
}
Distributed table configuration
clickhouse {
    host = "localhost:8123"
    database = "nginx"
    table = "access_msg"
    cluster = "no_replica_cluster"
    fields = ["date", "datetime", "hostname", "http_code", "data_size", "ua", "request_time"]
}
The plugin queries the system.clusters table to find out which physical shard nodes store the table. Each Spark partition then writes to a single ClickHouse node, chosen at random.
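The random write policy can be sketched in Python as follows. This is an assumption-laden illustration rather than the plugin's implementation: the shard addresses are made up (in practice they would come from system.clusters), and `pick_shard` is a hypothetical helper.

```python
import random


def pick_shard(shards, rng):
    """Pick one shard address at random; a partition writes only to that node."""
    return rng.choice(shards)


# Hypothetical shard nodes; real values would be read from system.clusters.
shards = ["ch-node-1:8123", "ch-node-2:8123", "ch-node-3:8123"]

rng = random.Random()
# Each partition gets exactly one target node for all of its rows.
assignment = {partition_id: pick_shard(shards, rng) for partition_id in range(6)}

for partition_id, shard in assignment.items():
    print(f"partition {partition_id} -> {shard}")
```

Because the choice is random per partition, the spread of data across shards is only even on average; a job with few partitions may leave some shards without writes.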