Version: Next

Doris

Doris source connector

Supported engines

Spark
Flink
SeaTunnel Zeta

Key features

Description

A source connector for Apache Doris.

Dependencies

  1. Download the JDBC driver jar package and add it to the ${SEATUNNEL_HOME}/plugins/ directory.

For SeaTunnel Zeta

  1. Download the JDBC driver jar package and add it to the ${SEATUNNEL_HOME}/lib/ directory.

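Supported data source info

| Data source | Supported versions | Driver | Url | Maven |
| --- | --- | --- | --- | --- |
| Doris | Only Doris 2.0 and later are supported. | - | - | - |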

Data type mapping

| Doris data type | SeaTunnel data type |
| --- | --- |
| INT | INT |
| TINYINT | TINYINT |
| SMALLINT | SMALLINT |
| BIGINT | BIGINT |
| LARGEINT | STRING |
| BOOLEAN | BOOLEAN |
| DECIMAL | DECIMAL(column size + 1, scale), where column size is the designated column's specified size and scale is its number of digits to the right of the decimal point |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| CHAR, VARCHAR, STRING, TEXT | STRING |
| DATE | DATE |
| DATETIME, DATETIME(p) | TIMESTAMP |
| ARRAY | ARRAY |

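Source options

Base configuration:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| fenodes | string | yes | - | FE address, in the format "fe_host:fe_http_port" |
| username | string | yes | - | User name |
| password | string | yes | - | Password |
| doris.request.retries | int | no | 3 | Number of retries for requests sent to the Doris FE |
| doris.request.read.timeout.ms | int | no | 30000 | |
| doris.request.connect.timeout.ms | int | no | 30000 | |
| query-port | string | no | 9030 | Doris query port |
| doris.request.query.timeout.s | int | no | 3600 | Timeout for Doris data scans, in seconds |
| table_list | string | | - | Table list |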
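As an illustration of how the base options above fit together, the following is a minimal sketch of a source block that sets the required connection options and overrides some optional ones. The host name `fe_host` and every non-default value are placeholders chosen for this example, not recommendations; the defaults listed in the table are normally sufficient.

```hocon
source {
  Doris {
    # Required connection options ("fe_host" is a placeholder FE host)
    fenodes = "fe_host:8030"
    username = root
    password = ""

    # Optional request tuning; the values below are illustrative only
    query-port = "9030"                       # same as the default
    doris.request.retries = 5                 # default is 3
    doris.request.connect.timeout.ms = 10000  # default is 30000
    doris.request.read.timeout.ms = 60000     # default is 30000
    doris.request.query.timeout.s = 7200      # default is 3600 (seconds)

    # Single-table options placed directly at this level
    database = "e2e_source"
    table = "doris_e2e_table"
  }
}
```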

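Table list configuration:

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| database | string | yes | - | Database |
| table | string | yes | - | Table name |
| doris.read.field | string | no | - | Doris table fields to read |
| doris.filter.query | string | no | - | Data filter. Format: "field = value", for example doris.filter.query = "F_ID > 2" |
| doris.batch.size | int | no | 1024 | Maximum number of rows read from a BE in one batch |
| doris.exec.mem.limit | long | no | 2147483648 | Maximum memory a single BE scan request may use. The default is 2G (2147483648) |

Note: when this configuration applies to a single table, you can flatten the table_list options into the outer level.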
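To make the flattening rule concrete, here is a sketch of a single table written in the table_list form. Based on the note above, it is assumed to behave the same as placing `database` and `table` directly inside the `Doris` block, which is the form the single-table examples use. The host, database, and table names are reused from those examples.

```hocon
source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    # One-entry table_list; assumed equivalent to writing
    # database/table at the outer level of the Doris block
    table_list = [
      {
        database = "e2e_source"
        table = "doris_e2e_table"
      }
    ]
  }
}
```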

Tips

Changing the advanced parameters without a specific need is not recommended.

Examples

Single table

This example reads data from Doris and writes it to the console:

env {
  parallelism = 2
  job.mode = "BATCH"
}
source{
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}

Use the doris.read.field parameter to select which Doris table fields to read:

env {
  parallelism = 2
  job.mode = "BATCH"
}
source{
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    doris.read.field = "F_ID,F_INT,F_BIGINT,F_TINYINT,F_SMALLINT"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}

Use doris.filter.query to filter data; the parameter value is passed directly to Doris as the filter condition:

env {
  parallelism = 2
  job.mode = "BATCH"
}
source{
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    doris.filter.query = "F_ID > 2"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}
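The two options can also be set together for a single table, as the first entry of the multi-table example does. The sketch below shows only the source block and assumes the same table and column names as the earlier examples; the field list and filter value are illustrative.

```hocon
source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    # Read only these columns ...
    doris.read.field = "F_ID,F_INT,F_BIGINT"
    # ... and only the rows matching this condition, which is passed to Doris
    doris.filter.query = "F_ID > 2"
  }
}
```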

Multiple tables

env{
  parallelism = 1
  job.mode = "BATCH"
}

source{
  Doris {
    fenodes = "xxxx:8030"
    username = root
    password = ""
    table_list = [
      {
        database = "st_source_0"
        table = "doris_table_0"
        doris.read.field = "F_ID,F_INT,F_BIGINT,F_TINYINT"
        doris.filter.query = "F_ID >= 50"
      },
      {
        database = "st_source_1"
        table = "doris_table_1"
      }
    ]
  }
}

transform {}

sink{
  Doris {
    fenodes = "xxxx:8030"
    schema_save_mode = "RECREATE_SCHEMA"
    username = root
    password = ""
    database = "st_sink"
    table = "${table_name}"
    sink.enable-2pc = "true"
    sink.label-prefix = "test_json"
    doris.config = {
      format="json"
      read_json_by_line="true"
    }
  }
}

Changelog

| Change | Commit | Version |
| --- | --- | --- |
| [Improve] doris options (#8745) | https://github.com/apache/seatunnel/commit/268d76cbf3 | dev |
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6eeb | dev |
| [Fix][Connector-V2] fix starRocks automatically creates tables with comment (#8568) | https://github.com/apache/seatunnel/commit/c4cb1fc4a3 | dev |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d6 | dev |
| [Fix][Doris] Fix catalog not closed (#8415) | https://github.com/apache/seatunnel/commit/2d1db66b9f | 2.3.9 |
| [Feature]Connector-V2[Doris]Support sink ddl (#8250) | https://github.com/apache/seatunnel/commit/ecd8269f2e | 2.3.9 |
| [Feature][Connector-V2]Support Doris Fe Node HA (#8311) | https://github.com/apache/seatunnel/commit/3e86102f47 | 2.3.9 |
| [Feature][Core] Support read arrow data (#8137) | https://github.com/apache/seatunnel/commit/4710ea0f8d | 2.3.9 |
| [Feature][Clickhouse] Support sink savemode (#8086) | https://github.com/apache/seatunnel/commit/e6f92fd79b | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef800016 | 2.3.9 |
| [Feature][Doris] Support multi-table source read (#7895) | https://github.com/apache/seatunnel/commit/10c37acb34 | 2.3.9 |
| [Improve][Connector-V2] Add doris/starrocks create table with comment (#7847) | https://github.com/apache/seatunnel/commit/207b8c16fd | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03c | 2.3.9 |
| [Fixbug] doris custom sql work (#7464) | https://github.com/apache/seatunnel/commit/5c6a7c6984 | 2.3.8 |
| [Improve][API] Move catalog open to SaveModeHandler (#7439) | https://github.com/apache/seatunnel/commit/8c2c5c79a1 | 2.3.8 |
| [Improve][Connector-V2] Close all ResultSet after used (#7389) | https://github.com/apache/seatunnel/commit/853e973212 | 2.3.8 |
| Revert "[Fix][Connector-V2] Fix doris primary key order and fields order are inconsistent (#7377)" (#7402) | https://github.com/apache/seatunnel/commit/bb72d91770 | 2.3.8 |
| [Fix][Connector-V2] Fix doris primary key order and fields order are inconsistent (#7377) | https://github.com/apache/seatunnel/commit/464da8fb9b | 2.3.7 |
| [Bugfix][Doris-connector] Fix Json serialization, null value causes data error problem | https://github.com/apache/seatunnel/commit/7b19df585f | 2.3.7 |
| [Improve][Connector-V2] Improve doris error msg (#7343) | https://github.com/apache/seatunnel/commit/16950a67cd | 2.3.7 |
| [Fix][Doris] Fix the abnormality of deleting data in CDC scenario. (#7315) | https://github.com/apache/seatunnel/commit/bb2c912404 | 2.3.7 |
| fix [Bug] Unable to create a source for identifier 'Iceberg'. #7182 (#7279) | https://github.com/apache/seatunnel/commit/4897491708 | 2.3.7 |
| [Fix][Connector-V2] Fix doris TRANSFER_ENCODING header error (#7267) | https://github.com/apache/seatunnel/commit/d886495584 | 2.3.6 |
| [Improve][Doris Connector] Unified serialization method,Use RowToJsonConverter and TextSerializationSchema (#7229) | https://github.com/apache/seatunnel/commit/4b3af9bef4 | 2.3.6 |
| [Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131) | https://github.com/apache/seatunnel/commit/c4ca74122c | 2.3.6 |
| [Improve][Zeta] Move SaveMode behavior to master (#6843) | https://github.com/apache/seatunnel/commit/80cf91318d | 2.3.6 |
| [bugFix][Connector-V2][Doris] The multi-FE configuration is supported (#6341) | https://github.com/apache/seatunnel/commit/b6d075194b | 2.3.6 |
| [Feature][Doris] Add Doris type converter (#6354) | https://github.com/apache/seatunnel/commit/5189991843 | 2.3.6 |
| [Improve] Improve doris create table template default value (#6720) | https://github.com/apache/seatunnel/commit/bd64740314 | 2.3.6 |
| [Bug Fix] Sink Doris error status(#6753) (#6755) | https://github.com/apache/seatunnel/commit/0ce2c0f220 | 2.3.6 |
| [Improve] Improve doris stream load client side error message (#6688) | https://github.com/apache/seatunnel/commit/007a9940e3 | 2.3.6 |
| [Fix][Connector-v2] Fix the sql statement error of create table for doris and starrocks (#6679) | https://github.com/apache/seatunnel/commit/88263cd69f | 2.3.6 |
| [Fix][Connector-V2] Fixed doris/starrocks create table sql parse error (#6580) | https://github.com/apache/seatunnel/commit/f2ed1fbde0 | 2.3.5 |
| [Fix][Connector-V2] Fix doris sink can not be closed when stream load not read any data (#6570) | https://github.com/apache/seatunnel/commit/341615f488 | 2.3.5 |
| [Fix][Connector-V2] Fix connector support SPI but without no args constructor (#6551) | https://github.com/apache/seatunnel/commit/5f3c9c36a5 | 2.3.5 |
| [Improve] Add SaveMode log of process detail (#6375) | https://github.com/apache/seatunnel/commit/b0d70ce224 | 2.3.5 |
| [Feature] Support nanosecond in Doris DateTimeV2 type (#6358) | https://github.com/apache/seatunnel/commit/76967066bf | 2.3.5 |
| [Fix][Connector-V2] Fix doris source select fields loss primary key information (#6339) | https://github.com/apache/seatunnel/commit/78abe2f202 | 2.3.5 |
| [Improve][API] Unify type system api(data & type) (#5872) | https://github.com/apache/seatunnel/commit/b38c7edcc9 | 2.3.5 |
| [Fix] Fix doris stream load failed not reported error (#6315) | https://github.com/apache/seatunnel/commit/a09a5a2bb8 | 2.3.5 |
| [Improve][Connector-V2] Doris stream load use FE instead of BE (#6235) | https://github.com/apache/seatunnel/commit/0a7acdce95 | 2.3.4 |
| [Feature][Connector-V2][Doris] Add Doris ConnectorV2 Source (#6161) | https://github.com/apache/seatunnel/commit/fc2d80382a | 2.3.4 |
| [Improve] Improve doris sink to random use be (#6132) | https://github.com/apache/seatunnel/commit/869417660e | 2.3.4 |
| [Feature] Support SaveMode on Doris (#6085) | https://github.com/apache/seatunnel/commit/b2375fffe8 | 2.3.4 |
| [Improve] Add batch flush in doris sink (#6024) | https://github.com/apache/seatunnel/commit/2c5b48e907 | 2.3.4 |
| [Fix] Fix DorisCatalog not implement name method (#5988) | https://github.com/apache/seatunnel/commit/d4a323efef | 2.3.4 |
| [Feature][Catalog] Doris Catalog (#5175) | https://github.com/apache/seatunnel/commit/1d3e335d8e | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b2 | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de7408100 | 2.3.4 |
| [Improve][Connector] Add field name to DataTypeConvertor to improve error message (#5782) | https://github.com/apache/seatunnel/commit/ab60790f0d | 2.3.4 |
| [Chore] Using try-with-resources to simplify the code. (#4995) | https://github.com/apache/seatunnel/commit/d0aff52425 | 2.3.4 |
| [Fix] Fix RestService report NullPointerException (#5319) | https://github.com/apache/seatunnel/commit/5d4b319477 | 2.3.4 |
| [feature][doris] Doris factory type (#5061) | https://github.com/apache/seatunnel/commit/d952cea43c | 2.3.3 |
| [Bug][connector-v2][doris] add streamload Content-type for doris URLdecode error (#4880) | https://github.com/apache/seatunnel/commit/1b91816021 | 2.3.3 |
| [Bug][Connector-V2][Doris] update last checkpoint id when doing snapshot (#4881) | https://github.com/apache/seatunnel/commit/0360e7e518 | 2.3.2 |
| [Improve] Add a jobId to the doris label to distinguish between tasks (#4839) | https://github.com/apache/seatunnel/commit/6672e94077 | 2.3.2 |
| [BUG][Doris] Add a jobId to the doris label to distinguish between tasks (#4853) | https://github.com/apache/seatunnel/commit/20ee2faecf | 2.3.2 |
| [Improve][Connector-V2][Doris]Remove serialization code that is no longer used (#4313) | https://github.com/apache/seatunnel/commit/0c0e5f978e | 2.3.1 |
| [Improve][Connector-V2][Doris] Refactor some Doris Sink code as well as support 2pc and cdc (#4235) | https://github.com/apache/seatunnel/commit/7c4005af85 | 2.3.1 |
| [Hotfix][Connector][Doris] Fix Content Length header already present (#4277) | https://github.com/apache/seatunnel/commit/df82b77153 | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd601051 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab166561 | 2.3.1 |
| [Improve][Connector-V2][Doris] Change Doris Config Prefix (#3856) | https://github.com/apache/seatunnel/commit/16e39a506b | 2.3.1 |
| [Feature][Connector-V2][Doris] Add Doris StreamLoad sink connector (#3631) | https://github.com/apache/seatunnel/commit/72158be395 | 2.3.0 |