Version: 2.3.10

Hbase

Hbase sink connector

Description

Writes data to HBase.

Key features

Options

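| Name | Type | Required | Default |
|---|---|---|---|
| zookeeper_quorum | string | yes | - |
| table | string | yes | - |
| rowkey_column | list | yes | - |
| family_name | config | yes | - |
| rowkey_delimiter | string | no | "" |
| version_column | string | no | - |
| null_mode | string | no | skip |
| wal_write | boolean | no | false |
| write_buffer_size | int | no | 8 * 1024 * 1024 |
| encoding | string | no | utf8 |
| hbase_extra_config | config | no | - |
| common-options | | no | - |
| ttl | long | no | - |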

zookeeper_quorum [string]

The ZooKeeper quorum hosts of the HBase cluster, for example: "hadoop001:2181,hadoop002:2181,hadoop003:2181"

table [string]

The name of the table to write to, for example: "seatunnel"

rowkey_column [list]

The list of column names that make up the row key, for example: ["id", "uuid"]

family_name [config]

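The mapping from fields to column family names. For example, suppose an upstream row looks like this:

| id | name | age |
|---|---|---|
| 1 | tyrantlucifer | 27 |

To use id as the row key and write the other fields to different column families, configure:

family_name {
  name = "info1"
  age = "info2"
}

This means name is written to column family info1 and age is written to column family info2.

To write all other fields to the same column family, configure:

family_name {
  all_columns = "info"
}

This means all fields are written to column family info.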

rowkey_delimiter [string]

The delimiter used to join multiple row key columns, default ""
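As an illustration of how rowkey_column and rowkey_delimiter combine, here is a minimal sketch of a composite row key. The field names id and uuid come from the rowkey_column example above; the delimiter "_" is an arbitrary choice. Assuming the columns are joined in list order, a row with id = 1 and uuid = abc would presumably get the row key 1_abc.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel"
  # Two fields form the row key (assumed to be joined in list order)
  rowkey_column = ["id", "uuid"]
  # Illustrative delimiter; the default is "" (no delimiter)
  rowkey_delimiter = "_"
  family_name {
    all_columns = "info"
  }
}
```

A non-empty delimiter can help avoid ambiguous keys, since with the default "" the pairs ("1", "23") and ("12", "3") would both concatenate to 123.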

version_column [string]

The name of the version column. Its value is used as the timestamp of the HBase record.
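A minimal sketch of version_column, assuming the upstream rows carry a BIGINT field holding an epoch timestamp in milliseconds. The field name time and the table name are only examples, not requirements of the connector.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  # Assumption: "time" is an upstream field containing epoch milliseconds;
  # its value would be used as the cell timestamp of each record
  version_column = "time"
  family_name {
    all_columns = "info"
  }
}
```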

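null_mode [string]

The mode for writing null values. Supported values are [skip, empty], default skip.

  • skip: when a field is null, the connector does not write that field to HBase
  • empty: when a field is null, the connector writes an empty value for that field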
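The following sketch shows how the non-default mode might be selected. With this setting, a null field would be written as an empty value instead of being omitted from the row. The table and column family names are illustrative.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  # Write an empty value for null fields instead of skipping them
  null_mode = "empty"
  family_name {
    all_columns = "info"
  }
}
```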

wal_write [boolean]

Whether to write to the WAL (write-ahead log), default false

write_buffer_size [int]

The write buffer size of the HBase client, default 8 * 1024 * 1024
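A sketch of how the two write-path options above might be set together. The values are illustrative rather than recommendations, and the buffer size is assumed to be given in bytes as a plain integer (16 * 1024 * 1024 = 16777216).

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  # Enable the write-ahead log (off by default)
  wal_write = true
  # Assumed unit: bytes. 16777216 = 16 * 1024 * 1024, twice the default
  write_buffer_size = 16777216
  family_name {
    all_columns = "info"
  }
}
```

In general, writing to the WAL trades some write throughput for durability, and a larger client buffer batches more mutations per flush at the cost of client memory.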

encoding [string]

The encoding of string fields. Supported values are [utf8, gbk], default utf8.

hbase_extra_config [config]

Extra configuration for HBase
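A hedged sketch of what an hbase_extra_config block could look like. The two keys are standard HBase client properties chosen purely for illustration; this page does not confirm which properties the connector forwards, nor whether the keys need to be quoted, so treat both as assumptions.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  family_name {
    all_columns = "info"
  }
  # Assumption: entries are handed to the HBase client configuration.
  # Keys are quoted so HOCON keeps each dotted name as a single key.
  hbase_extra_config = {
    "hbase.client.retries.number" = "3"
    "hbase.rpc.timeout" = "30000"
  }
}
```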

ttl [long]

The TTL of the data written to HBase, in milliseconds. By default, the TTL configured on the table applies.
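For example, a sketch that sets a one-day TTL on written data. The value is in milliseconds (24 * 60 * 60 * 1000 = 86400000); the table and column family names are illustrative.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  # One day in milliseconds: 24 * 60 * 60 * 1000
  ttl = 86400000
  family_name {
    all_columns = "info"
  }
}
```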

common options

Common parameters of sink plugins. See Sink Common Options for details.

Example


Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  family_name {
    all_columns = seatunnel
  }
}

Writing to multiple tables

env {
  # You can set engine configuration here
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    tables_configs = [
      {
        schema = {
          table = "hbase_sink_1"
          fields {
            name = STRING
            c_string = STRING
            c_double = DOUBLE
            c_bigint = BIGINT
            c_float = FLOAT
            c_int = INT
            c_smallint = SMALLINT
            c_boolean = BOOLEAN
            time = BIGINT
          }
        }
        rows = [
          {
            kind = INSERT
            fields = ["label_1", "sink_1", 4.3, 200, 2.5, 2, 5, true, 1627529632356]
          }
        ]
      },
      {
        schema = {
          table = "hbase_sink_2"
          fields {
            name = STRING
            c_string = STRING
            c_double = DOUBLE
            c_bigint = BIGINT
            c_float = FLOAT
            c_int = INT
            c_smallint = SMALLINT
            c_boolean = BOOLEAN
            time = BIGINT
          }
        }
        rows = [
          {
            kind = INSERT
            fields = ["label_2", "sink_2", 4.3, 200, 2.5, 2, 5, true, 1627529632357]
          }
        ]
      }
    ]
  }
}

sink {
  Hbase {
    zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
    table = "${table_name}"
    rowkey_column = ["name"]
    family_name {
      all_columns = info
    }
  }
}

Writing to specified column families

Hbase {
  zookeeper_quorum = "hbase_e2e:2181"
  table = "assign_cf_table"
  rowkey_column = ["id"]
  family_name {
    c_double = "cf1"
    c_bigint = "cf2"
  }
}

Changelog
| Change | Commit | Version |
|---|---|---|
| [Improve] hbase options (#8923) | https://github.com/apache/seatunnel/commit/b6a702b58 | 2.3.10 |
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
| [Fix][Connector-V2] Fix known directory create and delete ignore issues (#7700) | https://github.com/apache/seatunnel/commit/e2fb67957 | 2.3.8 |
| [Feature][Connector-V2][Hbase] implement hbase catalog (#7516) | https://github.com/apache/seatunnel/commit/b978792cb | 2.3.8 |
| [Feature][Connector-V2] Support multi-table sink feature for HBase (#7169) | https://github.com/apache/seatunnel/commit/025fa3bb8 | 2.3.8 |
| [hotfix][connector-v2-hbase]fix and optimize hbase source problem (#7148) | https://github.com/apache/seatunnel/commit/34a6b8e9f | 2.3.7 |
| [Improve][hbase] The specified column is written to the specified column family (#5234) | https://github.com/apache/seatunnel/commit/49d397c61 | 2.3.6 |
| [feature][connector-v2-hbase-sink] Support Connector v2 HBase sink TTL data writing (#7116) | https://github.com/apache/seatunnel/commit/adafd8025 | 2.3.6 |
| [E2E][HBase]Refactor hbase e2e (#6859) | https://github.com/apache/seatunnel/commit/1da9bd6ce | 2.3.6 |
| [Connector]Add hbase source connector (#6348) | https://github.com/apache/seatunnel/commit/f108a5e65 | 2.3.6 |
| [Feature][HbaseSink]support array data. (#6100) | https://github.com/apache/seatunnel/commit/b59201476 | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
| [Hotfix][Connector-v2][HbaseSink]Fix default timestamp (#4958) | https://github.com/apache/seatunnel/commit/3d8f3bf90 | 2.3.3 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
| [Feature][Connector-V2][Hbase] Introduce hbase sink connector (#4049) | https://github.com/apache/seatunnel/commit/68bda94a4 | 2.3.1 |