Version: 2.3.10

ClickhouseFile

ClickhouseFile sink connector

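Description

This sink uses the clickhouse-local program to generate ClickHouse data files and then sends them to the ClickHouse server, a process also known as bulk load. It only supports tables whose engine is 'Distributed', and the internal_replication option must be set to true. Both batch and streaming modes are supported.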

Key features

Tip

You can also write data to ClickHouse via JDBC.

Sink options

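| Name | Type | Required | Default |
|------|------|----------|---------|
| host | string | yes | - |
| database | string | yes | - |
| table | string | yes | - |
| username | string | yes | - |
| password | string | yes | - |
| clickhouse_local_path | string | yes | - |
| sharding_key | string | no | - |
| copy_method | string | no | scp |
| node_free_password | boolean | no | false |
| node_pass | list | no | - |
| node_pass.node_address | string | no | - |
| node_pass.username | string | no | "root" |
| node_pass.password | string | no | - |
| compatible_mode | boolean | no | false |
| file_fields_delimiter | string | no | "\t" |
| file_temp_path | string | no | "/tmp/seatunnel/clickhouse-local/file" |
| key_path | string | no | "/tmp/id_rsa" |
| common-options | | no | - |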

host [string]

The ClickHouse cluster address, in host:port format. Multiple hosts can be specified at once, for example "host1:8123,host2:8123".
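As a minimal sketch of the multi-host form, the fragment below sets only the host option. The host names are placeholders, and the other required options are left out for brevity.

```hocon
ClickhouseFile {
  # Two cluster nodes, comma-separated, each written as host:port.
  # host1 and host2 are placeholder names, not real servers.
  host = "host1:8123,host2:8123"
  # database, table, username, password and clickhouse_local_path
  # are also required and are omitted here.
}
```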

database [string]

The ClickHouse database name.

table [string]

The table name.

username [string]

The username used to connect to ClickHouse.

password [string]

The password used to connect to ClickHouse.

sharding_key [string]

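When ClickhouseFile splits data, it has to decide which node each row is sent to. By default the node is chosen at random; the 'sharding_key' option can instead name a field to be used by the sharding algorithm.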

clickhouse_local_path [string]

The path of the clickhouse-local program on the Spark node. Because every task invokes it, the path must be the same on every Spark node.

copy_method [string]

The file transfer method. The default is scp; valid values are scp and rsync.
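For illustration, switching the transfer method from the default to rsync might look like the fragment below. Only copy_method is shown; the required connection options are omitted.

```hocon
ClickhouseFile {
  # Use rsync instead of the default scp to ship the generated files.
  copy_method = "rsync"
  # host, database, table, username, password and clickhouse_local_path
  # are still required and are omitted here.
}
```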

node_free_password [boolean]

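SeaTunnel transfers files with scp or rsync, so it needs access to the ClickHouse server hosts. If passwordless login is configured between every Spark node and the ClickHouse servers, set this option to true; otherwise, configure the password for each node in the node_pass option.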

node_pass [list]

Holds the addresses of all ClickHouse servers together with their access passwords.

node_pass.node_address [string]

The address of the ClickHouse server node.

node_pass.username [string]

The username on the ClickHouse server node. Defaults to root.

node_pass.password [string]

The access password for the ClickHouse server node.
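The three node_pass sub-options fit together as one list entry per server. The sketch below assumes a two-node cluster; the addresses, the "seatunnel" account and the passwords are all made-up placeholders.

```hocon
ClickhouseFile {
  # Password-based transfer: passwordless login is not configured.
  node_free_password = false
  node_pass = [
    {
      node_address = "192.168.0.1"   # placeholder address
      username = "seatunnel"          # hypothetical account; defaults to "root" if omitted
      password = "changeme-1"         # placeholder password
    },
    {
      node_address = "192.168.0.2"   # placeholder address
      password = "changeme-2"         # username omitted, so "root" is used
    }
  ]
  # Required connection options are omitted here.
}
```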

compatible_mode [boolean]

In older versions of ClickHouse, the clickhouse-local program does not support the --path parameter. Enable this option to achieve the effect of --path by other means.

file_fields_delimiter [string]

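ClickhouseFile uses the CSV format to store data temporarily. If the data itself contains the CSV delimiter, the program may fail; setting this option avoids that. The value must be exactly one character long.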
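As a sketch, if the data is known to contain tab characters (the default delimiter), a different single character can be chosen. The "|" below is only an assumption about what does not occur in the data.

```hocon
ClickhouseFile {
  # Must be exactly one character; "|" is an illustrative choice,
  # assuming it never appears in any field value.
  file_fields_delimiter = "|"
  # Required connection options are omitted here.
}
```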

file_temp_path [string]

The local directory where ClickhouseFile stores temporary files.

key_path [string]

The path of the private key used by scp or rsync to transfer files.
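A key-based setup might be sketched as follows. It assumes that node_free_password = true is meant to be combined with a private key at key_path, which this page does not state explicitly; the key location is a placeholder.

```hocon
ClickhouseFile {
  # Assumption: passwordless (key-based) login is already configured
  # between every Spark node and every ClickHouse server.
  node_free_password = true
  # Placeholder path to the private key; the default is "/tmp/id_rsa".
  key_path = "/home/seatunnel/.ssh/id_rsa"
  # Required connection options are omitted here.
}
```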

common options

Common sink plugin parameters. See Sink Common Options for details.

Example

ClickhouseFile {
  host = "192.168.0.1:8123"
  database = "default"
  table = "fake_all"
  username = "default"
  password = ""
  clickhouse_local_path = "/Users/seatunnel/Tool/clickhouse local"
  sharding_key = "age"
  node_free_password = false
  node_pass = [{
    node_address = "192.168.0.1"
    password = "seatunnel"
  }]
}
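
To show where the sink block sits in a job file, here is a hedged sketch of a complete batch job. The env settings, the FakeSource source and its two-column schema (name, age) are assumptions added for illustration; they are not part of this page, and the target table is assumed to have matching columns.

```hocon
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  # FakeSource generates test rows; the schema is an assumed match for fake_all.
  FakeSource {
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}

sink {
  ClickhouseFile {
    host = "192.168.0.1:8123"
    database = "default"
    table = "fake_all"
    username = "default"
    password = ""
    clickhouse_local_path = "/Users/seatunnel/Tool/clickhouse local"
    sharding_key = "age"
    node_free_password = false
    node_pass = [{
      node_address = "192.168.0.1"
      password = "seatunnel"
    }]
  }
}
```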

Changelog

| Change | Commit | Version |
|--------|--------|---------|
| [Fix][Clickhouse] Parallelism makes data duplicate (#8916) | https://github.com/apache/seatunnel/commit/45345f273 | 2.3.10 |
| [Fix][Connector-V2]Fix Descriptions for CUSTOM_SQL in Connector (#8778) | https://github.com/apache/seatunnel/commit/96b610eb7 | 2.3.10 |
| [improve] update clickhouse connector config option (#8755) | https://github.com/apache/seatunnel/commit/b964189b7 | 2.3.10 |
| [Fix][Connector-V2] fix starRocks automatically creates tables with comment (#8568) | https://github.com/apache/seatunnel/commit/c4cb1fc4a | 2.3.10 |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d | 2.3.10 |
| [hotfix] fix exceptions caused by operator priority in connector-clickhouse when using sharding_key (#8162) | https://github.com/apache/seatunnel/commit/5560e3dab | 2.3.9 |
| [Imporve][ClickhouseFile] Directly connect to each shard node to obtain the corresponding path (#8449) | https://github.com/apache/seatunnel/commit/757641bad | 2.3.9 |
| [Feature][ClickhouseFile] Support add publicKey to identity (#8351) | https://github.com/apache/seatunnel/commit/287b8c821 | 2.3.9 |
| [Improve][ClickhouseFile] Improve rsync log output (#8332) | https://github.com/apache/seatunnel/commit/179223e3c | 2.3.9 |
| [Improve][ClickhouseFile] Added attach sql log for better debugging (#8315) | https://github.com/apache/seatunnel/commit/ade428c5f | 2.3.9 |
| [Chore] delete chinese desc in code (#8306) | https://github.com/apache/seatunnel/commit/a50a8b925 | 2.3.9 |
| [Improve][ClickhouseFile Connector] Unified specifying clickhouse file generation path (#8302) | https://github.com/apache/seatunnel/commit/455f1ed76 | 2.3.9 |
| [Improve][ClickhouseFile] Clickhouse supports option configuration when connecting to shard nodes (#8297) | https://github.com/apache/seatunnel/commit/1ded1b620 | 2.3.9 |
| [Imporve][ClickhouseFile] Improve clickhousefile generation parameter configuration (#8293) | https://github.com/apache/seatunnel/commit/753e058fe | 2.3.9 |
| [Improve][ClickhouseFile] ClickhouseFile Connector's rsync transmission supports specifying users (#8236) | https://github.com/apache/seatunnel/commit/e012bd0a4 | 2.3.9 |
| [Feature][Clickhouse] Support sink savemode (#8086) | https://github.com/apache/seatunnel/commit/e6f92fd79 | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
| [Fix][Connecotr-V2] Fix clickhouse sink does not support composite primary key (#8021) | https://github.com/apache/seatunnel/commit/24d054259 | 2.3.9 |
| [Improve] update clickhouse connector, use factory to create source/sink (#7946) | https://github.com/apache/seatunnel/commit/b69fcecee | 2.3.9 |
| [Fix][Connector-V2] Fixed clickhouse connectors cannot stop under multiple parallelism (#7921) | https://github.com/apache/seatunnel/commit/8d9c6a371 | 2.3.9 |
| Bump commons-io:commons-io from 2.11.0 to 2.14.0 in /seatunnel-connectors-v2/connector-clickhouse (#7784) | https://github.com/apache/seatunnel/commit/f4393a02b | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
| [Improve] Improve some connectors prepare check error message (#7465) | https://github.com/apache/seatunnel/commit/6930a25ed | 2.3.8 |
| [Improve][Connector-V2] Close all ResultSet after used (#7389) | https://github.com/apache/seatunnel/commit/853e97321 | 2.3.8 |
| [Feature][Connector-V2][Clickhouse] Add clickhouse.config to the source connector (#7143) | https://github.com/apache/seatunnel/commit/f7994d9ae | 2.3.6 |
| [Improve] Make ClickhouseFileSinker support tables containing materialized columns (#6956) | https://github.com/apache/seatunnel/commit/87c6adcc2 | 2.3.6 |
| [Improve][Clickhouse] Remove check when set allow_experimental_lightweight_delete false(#6727) (#6728) | https://github.com/apache/seatunnel/commit/b25e1b1ae | 2.3.6 |
| [Improve][Common] Adapt FILE_OPERATION_FAILED to CommonError (#5928) | https://github.com/apache/seatunnel/commit/b3dc0bbc2 | 2.3.4 |
| [Improve][Connector-V2] Replace CommonErrorCodeDeprecated.JSON_OPERATION_FAILED (#5978) | https://github.com/apache/seatunnel/commit/456cd1771 | 2.3.4 |
| [Feature][Core] Upgrade flink source translation (#5100) | https://github.com/apache/seatunnel/commit/5aabb14a9 | 2.3.4 |
| [Improve] Speed up ClickhouseFile Local generate a mmap object (#5822) | https://github.com/apache/seatunnel/commit/cf39e29da | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
| [Hotfix][connector-v2][clickhouse] Fixed an out-of-order BUG with output data fields of clickhouse-sink (#5346) | https://github.com/apache/seatunnel/commit/fce9ddaa2 | 2.3.4 |
| [Bugfix][Clickhouse] Fix clickhouse sink flush bug (#5448) | https://github.com/apache/seatunnel/commit/cef03f667 | 2.3.4 |
| [Hotfix][Clickhouse] Fix clickhouse old version compatibility (#5326) | https://github.com/apache/seatunnel/commit/1da49f5a2 | 2.3.4 |
| [Improve][CheckStyle] Remove useless 'SuppressWarnings' annotation of checkstyle. (#5260) | https://github.com/apache/seatunnel/commit/51c0d709b | 2.3.4 |
| [Hotfix] Fix com.google.common.base.Preconditions to seatunnel shade one (#5284) | https://github.com/apache/seatunnel/commit/ed5eadcf7 | 2.3.3 |
| [Feature][Connector-V2][Clickhouse] Add clickhouse connector time zone key,default system time zone (#5078) | https://github.com/apache/seatunnel/commit/309b58d12 | 2.3.3 |
| [Bugfix]fix clickhouse source connector read Nullable() type is not null,example:Nullable(Float64) while value is null the result is 0.0 (#5080) | https://github.com/apache/seatunnel/commit/cf3d0bba2 | 2.3.3 |
| [Feature][Connector-V2][Clickhouse] clickhouse writes with checkpoints (#4999) | https://github.com/apache/seatunnel/commit/f8fefa1e5 | 2.3.3 |
| [Hotfix][Connector-V2][ClickhouseFile] Fix ClickhouseFile write file failed when field value is null (#4937) | https://github.com/apache/seatunnel/commit/06671474c | 2.3.3 |
| [Hotfix][connector-clickhouse] fix get clickhouse local table name with closing bracket from distributed table engineFull (#4710) | https://github.com/apache/seatunnel/commit/e5e0cba26 | 2.3.2 |
| [Bug][Connector-V2] Clickhouse File Connector failed to sink to table with settings like storage_policy (#4172) | https://github.com/apache/seatunnel/commit/e120dc44b | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
| [Bug][Connector-V2] Clickhouse File Connector not support split mode for write data to all shards of distributed table (#4035) | https://github.com/apache/seatunnel/commit/3f1dcfc91 | 2.3.1 |
| [Hotfix][Connector-V2] Fix connector source snapshot state NPE (#4027) | https://github.com/apache/seatunnel/commit/e39c4988c | 2.3.1 |
| [Hotfix][Connector-v2][Clickhouse] Fix clickhouse write cdc changelog update event (#3951) | https://github.com/apache/seatunnel/commit/67e602797 | 2.3.1 |
| [Feature][shade][Jackson] Add seatunnel-jackson module (#3947) | https://github.com/apache/seatunnel/commit/5d8862ec9 | 2.3.1 |
| [Improve][Connector-V2][Clickhouse] Improve performance (#3910) | https://github.com/apache/seatunnel/commit/aeceb855f | 2.3.1 |
| [Improve][Connector-V2] Remove Clickhouse Fields Config (#3826) | https://github.com/apache/seatunnel/commit/74704c362 | 2.3.1 |
| [Improve][Connector-V2][clickhouse] Special characters in column names are supported (#3881) | https://github.com/apache/seatunnel/commit/9069609c1 | 2.3.1 |
| [Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb8 | 2.3.1 |
| [Improve][Connector-V2] Change Connector Custom Config Prefix To Map (#3719) | https://github.com/apache/seatunnel/commit/ef1b8b1bb | 2.3.1 |
| [Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba | 2.3.1 |
| [Bug][Connector-V2] Fix ClickhouseFile Committer Serializable Problems (#3803) | https://github.com/apache/seatunnel/commit/1b26192cb | 2.3.1 |
| [feature][connector-v2][clickhouse] Support write cdc changelog event in clickhouse sink (#3653) | https://github.com/apache/seatunnel/commit/6093c213b | 2.3.0 |
| [Connector-V2][Clickhouse] Improve Clickhouse File Connector (#3416) | https://github.com/apache/seatunnel/commit/e07e9a7cc | 2.3.0 |
| [Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a11 | 2.3.0 |
| [Improve][Connector-V2][Clickhouse] Unified exception for Clickhouse source & sink connector (#3563) | https://github.com/apache/seatunnel/commit/04e1743d9 | 2.3.0 |
| options in conditional need add to required or optional options (#3501) | https://github.com/apache/seatunnel/commit/51d5bcba1 | 2.3.0 |
| [Feature][Connector-V2][Clickhouse]Optimize clickhouse connector data type inject (#3471) | https://github.com/apache/seatunnel/commit/9bd0fc8ee | 2.3.0 |
| [improve][connector-v2][clickhouse] Fix DoubleInjectFunction (#3441) | https://github.com/apache/seatunnel/commit/9781a6a38 | 2.3.0 |
| [feature][api] add option validation for the ReadonlyConfig (#3417) | https://github.com/apache/seatunnel/commit/4f824fea3 | 2.3.0 |
| [improve][connector] The Factory#factoryIdentifier must be consistent with PluginIdentifierInterface#getPluginName (#3328) | https://github.com/apache/seatunnel/commit/d9519d696 | 2.3.0 |
| [Improve][Connector-V2] Add Clickhouse and Assert Source/Sink Factory (#3306) | https://github.com/apache/seatunnel/commit/9e4a12838 | 2.3.0 |
| [Improve][Clickhouse-V2] Clickhouse Support Geo type (#3141) | https://github.com/apache/seatunnel/commit/01cdc4e33 | 2.3.0 |
| [Improve][Connector-V2][Clickhouse] Support nest type and array (#3047) | https://github.com/apache/seatunnel/commit/97b5727ec | 2.3.0 |
| [Feature][Connector-V2-Clickhouse] Clickhouse Source random use host when config multi-host (#3108) | https://github.com/apache/seatunnel/commit/c9583b7f6 | 2.3.0-beta |
| [Improve][Clickhouse-V2] Clickhouse Support Int128,Int256 Type (#3067) | https://github.com/apache/seatunnel/commit/e118ccea0 | 2.3.0-beta |
| [Improve][all] change Log to @Slf4j (#3001) | https://github.com/apache/seatunnel/commit/6016100f1 | 2.3.0-beta |
| [Connector-V2][Clickhouse] Fix Clickhouse Type Mapping and Spark Map reconvert Bug (#2767) | https://github.com/apache/seatunnel/commit/f0a1f5013 | 2.2.0-beta |
| [DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706) | https://github.com/apache/seatunnel/commit/cbf82f755 | 2.2.0-beta |
| [#2606]Dependency management split (#2630) | https://github.com/apache/seatunnel/commit/fc047be69 | 2.2.0-beta |
| [Feature][Connector-V1 & V2] Support unauthorized ClickHouse (#2393) | https://github.com/apache/seatunnel/commit/0e4e2b123 | 2.2.0-beta |
| [Feature][connector] clickhousefile sink connector support non-root username for fileTransfer (#2263) | https://github.com/apache/seatunnel/commit/704661f1f | 2.2.0-beta |
| StateT of SeaTunnelSource should extend Serializable (#2214) | https://github.com/apache/seatunnel/commit/8c426ef85 | 2.2.0-beta |
| [Bug][connector-v2] When outputting data to clickhouse, a ClassCastException was encountered (#2160) | https://github.com/apache/seatunnel/commit/a3a2b5d18 | 2.2.0-beta |
| [API-DRAFT][MERGE] fix merge error | https://github.com/apache/seatunnel/commit/736ac01c8 | 2.2.0-beta |
| merge dev to api-draft | https://github.com/apache/seatunnel/commit/d265597c6 | 2.2.0-beta |
| [api-draft][connector] support Rsync to transfer clickhouse data file (#2080) | https://github.com/apache/seatunnel/commit/02a41902a | 2.2.0-beta |
| [api-draft][Optimize] Optimize module name (#2062) | https://github.com/apache/seatunnel/commit/f79e3112b | 2.2.0-beta |