Version: 2.3.10

Kudu

Kudu sink connector

Supported Kudu Versions

  • 1.11.1/1.12.0/1.13.0/1.14.0/1.15.0

Supported Engines

Spark
Flink
SeaTunnel Zeta

Key Features

Data Type Mapping

| SeaTunnel Data Type | Kudu Data Type |
|---|---|
| BOOLEAN | BOOL |
| INT | INT8, INT16, INT32 |
| BIGINT | INT64 |
| DECIMAL | DECIMAL |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| STRING | STRING |
| TIMESTAMP | UNIXTIME_MICROS |
| BYTES | BINARY |

Sink Options

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| kudu_masters | String | Yes | - | Kudu master addresses, separated by ','. For example, '192.168.88.110:7051'. |
| table_name | String | Yes | - | The name of the Kudu table. |
| client_worker_count | Int | No | 2 * Runtime.getRuntime().availableProcessors() | Kudu worker count. The default is twice the number of available CPU cores. |
| client_default_operation_timeout_ms | Long | No | 30000 | Timeout for normal Kudu operations, in milliseconds. |
| client_default_admin_operation_timeout_ms | Long | No | 30000 | Timeout for Kudu admin operations, in milliseconds. |
| enable_kerberos | Bool | No | false | Whether to enable Kerberos authentication. |
| kerberos_principal | String | No | - | Kerberos principal. |
| kerberos_keytab | String | No | - | Kerberos keytab. Note that all Zeta nodes must have this file. |
| kerberos_krb5conf | String | No | - | Kerberos krb5 conf. Note that all Zeta nodes must have this file. |
| save_mode | String | No | - | Storage mode. Supports overwrite and append. |
| session_flush_mode | String | No | AUTO_FLUSH_SYNC | Kudu flush mode. |
| batch_size | Int | No | 1024 | Maximum number of buffered records (append, upsert, and delete combined). When the buffer exceeds this number of records, the data is flushed. |
| buffer_flush_interval | Int | No | 10000 | Flush interval in milliseconds. After this time, asynchronous threads flush the data. |
| ignore_not_found | Bool | No | false | If true, ignore all not-found rows. |
| ignore_not_duplicate | Bool | No | false | If true, ignore all duplicate rows. |
| common-options | | No | - | Sink plugin common parameters. Refer to Sink Common Options for details. |
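
As a minimal sketch of how the options above combine, the following sink block sets several of them explicitly. The master address and table name are the sample values used on this page. The raised timeout values are arbitrary illustrations, not recommendations, and the remaining values only restate the defaults listed in the table.

```hocon
sink {
  kudu {
    # Required options
    kudu_masters = "192.168.88.110:7051"
    table_name = "kudu_sink_table"

    # Client timeouts in milliseconds (illustrative values; both default to 30000)
    client_default_operation_timeout_ms = 60000
    client_default_admin_operation_timeout_ms = 60000

    # Flush settings, written out with their documented defaults
    session_flush_mode = "AUTO_FLUSH_SYNC"
    batch_size = 1024
    buffer_flush_interval = 10000

    # Skip rows that are not found or that are duplicates (both default to false)
    ignore_not_found = true
    ignore_not_duplicate = true
  }
}
```

Any option left out of the block keeps the default shown in the table.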

Task Example

Simple:

The following example writes CDC data from a FakeSource, whose output is named "kudu", into the Kudu table "kudu_sink_table".


env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    plugin_output = "kudu"
    schema = {
      fields {
        id = int
        val_bool = boolean
        val_int8 = tinyint
        val_int16 = smallint
        val_int32 = int
        val_int64 = bigint
        val_float = float
        val_double = double
        val_decimal = "decimal(16, 1)"
        val_string = string
        val_unixtime_micros = timestamp
      }
    }
    # Each row lists its values in the same order as the schema fields above.
    rows = [
      {
        kind = INSERT
        fields = [1, true, 1, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      },
      {
        kind = INSERT
        fields = [2, true, 1, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      },
      {
        kind = INSERT
        fields = [3, true, 1, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      },
      # The UPDATE_BEFORE/UPDATE_AFTER pair changes val_int8 of the row with id 1 from 1 to 2.
      {
        kind = UPDATE_BEFORE
        fields = [1, true, 1, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      },
      {
        kind = UPDATE_AFTER
        fields = [1, true, 2, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      },
      # The DELETE event targets the row with id 2.
      {
        kind = DELETE
        fields = [2, true, 1, 2, 3, 4, 4.3, 5.3, 6.3, "NEW", "2020-02-02T02:02:02"]
      }
    ]
  }
}

sink {
  kudu {
    plugin_input = "kudu"
    kudu_masters = "kudu-master-cdc:7051"
    table_name = "kudu_sink_table"
    enable_kerberos = true
    kerberos_principal = "xx@xx.COM"
    kerberos_keytab = "xx.keytab"
  }
}
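
The Kerberos options in the Sink Options table also include kerberos_krb5conf, which the example above leaves unset. The sketch below shows one way a sink block might set all three Kerberos values together. The file paths are hypothetical and must be replaced with locations that exist on every node, and the principal is the same placeholder as above.

```hocon
sink {
  kudu {
    plugin_input = "kudu"
    kudu_masters = "kudu-master-cdc:7051"
    table_name = "kudu_sink_table"

    enable_kerberos = true
    kerberos_principal = "xx@xx.COM"
    # Hypothetical paths, shown only for illustration
    kerberos_keytab = "/path/to/xx.keytab"
    kerberos_krb5conf = "/path/to/krb5.conf"
  }
}
```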

Multiple table

Example 1

env {
  parallelism = 1
  job.mode = "STREAMING"
  checkpoint.interval = 5000
}

source {
  Mysql-CDC {
    base-url = "jdbc:mysql://127.0.0.1:3306/seatunnel"
    username = "root"
    password = "******"

    table-names = ["seatunnel.role","seatunnel.user","galileo.Bucket"]
  }
}

transform {
}

sink {
  kudu {
    kudu_masters = "kudu-master-cdc:7051"
    table_name = "${database_name}_${table_name}_test"
  }
}
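
To make the placeholder in table_name concrete, the annotated sink block below lists the target table names that plain substitution would give for the three upstream tables in this example. The expansion shown in the comments is an assumption based on simple string replacement of ${database_name} and ${table_name}. It has not been checked against the connector, so case handling and other naming rules may differ.

```hocon
sink {
  kudu {
    kudu_masters = "kudu-master-cdc:7051"
    # Assumed expansion, one Kudu table per upstream table:
    #   seatunnel.role  ->  seatunnel_role_test
    #   seatunnel.user  ->  seatunnel_user_test
    #   galileo.Bucket  ->  galileo_Bucket_test
    table_name = "${database_name}_${table_name}_test"
  }
}
```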

Example 2

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Jdbc {
    driver = oracle.jdbc.driver.OracleDriver
    url = "jdbc:oracle:thin:@localhost:1521/XE"
    user = testUser
    password = testPassword

    table_list = [
      {
        table_path = "TESTSCHEMA.TABLE_1"
      },
      {
        table_path = "TESTSCHEMA.TABLE_2"
      }
    ]
  }
}

transform {
}

sink {
  kudu {
    kudu_masters = "kudu-master-cdc:7051"
    table_name = "${schema_name}_${table_name}_test"
  }
}

Changelog

| Change | Commit | Version |
|---|---|---|
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
| [Improve][Transform] Rename sql transform table name from 'fake' to 'dual' (#8298) | https://github.com/apache/seatunnel/commit/e6169684f | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
| [Improve][API] Unified tables_configs and table_list (#8100) | https://github.com/apache/seatunnel/commit/84c0b8d66 | 2.3.9 |
| [Feature][Core] Rename result_table_name/source_table_name to plugin_input/plugin_output (#8072) | https://github.com/apache/seatunnel/commit/c7bbd322d | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
| [Improve][Connector] Add multi-table sink option check (#7360) | https://github.com/apache/seatunnel/commit/2489f6446 | 2.3.7 |
| [Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131) | https://github.com/apache/seatunnel/commit/c4ca74122 | 2.3.6 |
| correct the typo of kudu kerberos config (#6905) | https://github.com/apache/seatunnel/commit/fcb855497 | 2.3.6 |
| [Fix][KuduCatalogFactory]: Fix KuduCatalogFactory.optionRule() will throw an Exception (#6787) | https://github.com/apache/seatunnel/commit/45a4e1532 | 2.3.6 |
| [Feature][Engine] Unify job env parameters (#6003) | https://github.com/apache/seatunnel/commit/2410ab38f | 2.3.4 |
| [Feature][Connector-V2] Support multi-table sink feature for kudu (#5951) | https://github.com/apache/seatunnel/commit/82460c0bf | 2.3.4 |
| [Feature] Add unsupported datatype check for all catalog (#5890) | https://github.com/apache/seatunnel/commit/b9791285a | 2.3.4 |
| [Feature][Kudu] Support multi-table source read (#5878) | https://github.com/apache/seatunnel/commit/8d9a0b7d1 | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
| [Feature][Connector-V2] Support TableSourceFactory/TableSinkFactory on kudu (#5789) | https://github.com/apache/seatunnel/commit/10e791d60 | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
| [Feature][Kudu] Refactor Kudu functionality and Sink support CDC data. (#5437) | https://github.com/apache/seatunnel/commit/22110eb7b | 2.3.4 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
| [Hotfix][Connector-V2] Fix connector source snapshot state NPE (#4027) | https://github.com/apache/seatunnel/commit/e39c4988c | 2.3.1 |
| [Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb8 | 2.3.1 |
| [Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba | 2.3.1 |
| [Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a11 | 2.3.0 |
| [Improve][Connector-V2] Bad smell ToArrayCallWithZeroLengthArrayArgument: (#3577) | https://github.com/apache/seatunnel/commit/cc448d98c | 2.3.0 |
| [Improve][Connector-V2][Kudu] Unified exception for kudu source & sink connector (#3564) | https://github.com/apache/seatunnel/commit/273418ddc | 2.3.0 |
| [Connector][Dependency] Add Miss Dependency Cassandra And Change Kudu Plugin Name (#3432) | https://github.com/apache/seatunnel/commit/6ac6a0a0c | 2.3.0 |
| [Feature][Connector V2] expose configurable options in Kudu (#3365) | https://github.com/apache/seatunnel/commit/c422210e2 | 2.3.0 |
| [Feature][Core][Connector-V2] Unified The way of setting JobName (#2908) | https://github.com/apache/seatunnel/commit/bf2c97484 | 2.3.0-beta |
| remove duplicate ExceptionUtil class (#3037) | https://github.com/apache/seatunnel/commit/c9dc7c50c | 2.3.0-beta |
| [Improve][all] change Log to @Slf4j (#3001) | https://github.com/apache/seatunnel/commit/6016100f1 | 2.3.0-beta |
| [Improve][Connector-V2]Kudu Sink Connector Support to upsert row | https://github.com/apache/seatunnel/commit/1ece805ab | 2.3.0-beta |
| [DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706) | https://github.com/apache/seatunnel/commit/cbf82f755 | 2.2.0-beta |
| [#2606]Dependency management split (#2630) | https://github.com/apache/seatunnel/commit/fc047be69 | 2.2.0-beta |
| [Connector-V2] Add Kudu source and sink connector (#2254) | https://github.com/apache/seatunnel/commit/0483cbc2d | 2.2.0-beta |