Kudu
Kudu source connector
Support Kudu Version
- 1.11.1/1.12.0/1.13.0/1.14.0/1.15.0
Support Those Engines
Spark
Flink
SeaTunnel Zeta
Key features
Description
Used to read data from Kudu.
The tested kudu version is 1.11.1.
Data Type Mapping
kudu Data Type | SeaTunnel Data Type |
---|---|
BOOL | BOOLEAN |
INT8 INT16 INT32 | INT |
INT64 | BIGINT |
DECIMAL | DECIMAL |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
STRING | STRING |
UNIXTIME_MICROS | TIMESTAMP |
BINARY | BYTES |
Source Options
Name | Type | Required | Default | Description |
---|---|---|---|---|
kudu_masters | String | Yes | - | Kudu master address. Separated by ',',such as '192.168.88.110:7051'. |
table_name | String | Yes | - | The name of kudu table. |
client_worker_count | Int | No | 2 * Runtime.getRuntime().availableProcessors() | Kudu worker count. Default value is twice the current number of cpu cores. |
client_default_operation_timeout_ms | Long | No | 30000 | Kudu normal operation time out. |
client_default_admin_operation_timeout_ms | Long | No | 30000 | Kudu admin operation time out. |
enable_kerberos | Bool | No | false | Kerberos principal enable. |
kerberos_principal | String | No | - | Kerberos principal. Note that all zeta nodes require have this file. |
kerberos_keytab | String | No | - | Kerberos keytab. Note that all zeta nodes require have this file. |
kerberos_krb5conf | String | No | - | Kerberos krb5 conf. Note that all zeta nodes require have this file. |
scan_token_query_timeout | Long | No | 30000 | The timeout for connecting scan token. If not set, it will be the same as operationTimeout. |
scan_token_batch_size_bytes | Int | No | 1024 * 1024 | Kudu scan bytes. The maximum number of bytes read at a time, the default is 1MB. |
filter | Int | No | 1024 * 1024 | Kudu scan filter expressions,Not supported yet. |
schema | Map | No | 1024 * 1024 | SeaTunnel Schema. |
table_list | Array | No | - | The list of tables to be read. you can use this configuration instead of table_path example: table_list = [{ table_name = "kudu_source_table_1"},{ table_name = "kudu_source_table_2"}] |
common-options | No | - | Source plugin common parameters, please refer to Source Common Options for details. |
Task Example
Simple:
The following example is for a Kudu table named "kudu_source_table", The goal is to print the data from this table on the console and write kudu table "kudu_sink_table"
# Defining the runtime environment
env {
parallelism = 2
job.mode = "BATCH"
}
source {
# This is a example source plugin **only for test and demonstrate the feature source plugin**
kudu {
kudu_masters = "kudu-master:7051"
table_name = "kudu_source_table"
plugin_output = "kudu"
enable_kerberos = true
kerberos_principal = "xx@xx.COM"
kerberos_keytab = "xx.keytab"
}
}
transform {
}
sink {
console {
plugin_input = "kudu"
}
kudu {
plugin_input = "kudu"
kudu_masters = "kudu-master:7051"
table_name = "kudu_sink_table"
enable_kerberos = true
kerberos_principal = "xx@xx.COM"
kerberos_keytab = "xx.keytab"
}
}
Multiple Table
env {
# You can set engine configuration here
parallelism = 1
job.mode = "STREAMING"
checkpoint.interval = 5000
}
source {
# This is a example source plugin **only for test and demonstrate the feature source plugin**
kudu{
kudu_masters = "kudu-master:7051"
table_list = [
{
table_name = "kudu_source_table_1"
},{
table_name = "kudu_source_table_2"
}
]
plugin_output = "kudu"
}
}
transform {
}
sink {
Assert {
rules {
table-names = ["kudu_source_table_1", "kudu_source_table_2"]
}
}
}
Changelog
Change Log
Change | Commit | Version |
---|---|---|
[Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
[Improve][Transform] Rename sql transform table name from 'fake' to 'dual' (#8298) | https://github.com/apache/seatunnel/commit/e6169684f | 2.3.9 |
[Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
[Improve][API] Unified tables_configs and table_list (#8100) | https://github.com/apache/seatunnel/commit/84c0b8d66 | 2.3.9 |
[Feature][Core] Rename result_table_name /source_table_name to plugin_input/plugin_output (#8072) | https://github.com/apache/seatunnel/commit/c7bbd322d | 2.3.9 |
[Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
[Improve][Connector] Add multi-table sink option check (#7360) | https://github.com/apache/seatunnel/commit/2489f6446 | 2.3.7 |
[Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131) | https://github.com/apache/seatunnel/commit/c4ca74122 | 2.3.6 |
correct the typo of kudu kerberos config (#6905) | https://github.com/apache/seatunnel/commit/fcb855497 | 2.3.6 |
[Fix][KuduCatalogFactory]: Fix KuduCatalogFactory.optionRule() will throw an Exception (#6787) | https://github.com/apache/seatunnel/commit/45a4e1532 | 2.3.6 |
[Feature][Engine] Unify job env parameters (#6003) | https://github.com/apache/seatunnel/commit/2410ab38f | 2.3.4 |
[Feature][Connector-V2] Support multi-table sink feature for kudu (#5951) | https://github.com/apache/seatunnel/commit/82460c0bf | 2.3.4 |
[Feature] Add unsupported datatype check for all catalog (#5890) | https://github.com/apache/seatunnel/commit/b9791285a | 2.3.4 |
[Feature][Kudu] Support multi-table source read (#5878) | https://github.com/apache/seatunnel/commit/8d9a0b7d1 | 2.3.4 |
[Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
[Feature][Connector-V2] Support TableSourceFactory/TableSinkFactory on kudu (#5789) | https://github.com/apache/seatunnel/commit/10e791d60 | 2.3.4 |
[Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
[Feature][Kudu] Refactor Kudu functionality and Sink support CDC data. (#5437) | https://github.com/apache/seatunnel/commit/22110eb7b | 2.3.4 |
[Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
[Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
[Hotfix][Connector-V2] Fix connector source snapshot state NPE (#4027) | https://github.com/apache/seatunnel/commit/e39c4988c | 2.3.1 |
[Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb8 | 2.3.1 |
[Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba | 2.3.1 |
[Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a11 | 2.3.0 |
[Improve][Connector-V2] Bad smell ToArrayCallWithZeroLengthArrayArgument: (#3577) | https://github.com/apache/seatunnel/commit/cc448d98c | 2.3.0 |
[Improve][Connector-V2][Kudu] Unified exception for kudu source & sink connector (#3564) | https://github.com/apache/seatunnel/commit/273418ddc | 2.3.0 |
[Connector][Dependency] Add Miss Dependency Cassandra And Change Kudu Plugin Name (#3432) | https://github.com/apache/seatunnel/commit/6ac6a0a0c | 2.3.0 |
[Feature][Connector V2] expose configurable options in Kudu (#3365) | https://github.com/apache/seatunnel/commit/c422210e2 | 2.3.0 |
[Feature][Core][Connector-V2] Unified The way of setting JobName (#2908) | https://github.com/apache/seatunnel/commit/bf2c97484 | 2.3.0-beta |
remove duplicate ExceptionUtil class (#3037) | https://github.com/apache/seatunnel/commit/c9dc7c50c | 2.3.0-beta |
[Improve][all] change Log to @Slf4j (#3001) | https://github.com/apache/seatunnel/commit/6016100f1 | 2.3.0-beta |
[Improve][Connector-V2]Kudu Sink Connector Support to upsert row | https://github.com/apache/seatunnel/commit/1ece805ab | 2.3.0-beta |
[DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706) | https://github.com/apache/seatunnel/commit/cbf82f755 | 2.2.0-beta |
[#2606]Dependency management split (#2630) | https://github.com/apache/seatunnel/commit/fc047be69 | 2.2.0-beta |
[Connector-V2] Add Kudu source and sink connector (#2254) | https://github.com/apache/seatunnel/commit/0483cbc2d | 2.2.0-beta |