Version: Next

Doris

Doris source connector

Supported Engines

Spark
Flink
SeaTunnel Zeta

Key features

Description

Used to read data from Apache Doris.

Using Dependency

  1. Ensure that the JDBC driver JAR package has been placed in the ${SEATUNNEL_HOME}/plugins/ directory.

For SeaTunnel Zeta Engine

  1. Ensure that the JDBC driver JAR package has been placed in the ${SEATUNNEL_HOME}/lib/ directory.

Supported DataSource Info

| Datasource | Supported versions | Driver | Url | Maven |
|------------|--------------------|--------|-----|-------|
| Doris | Only Doris 2.0 or later is supported. | - | - | - |

Data Type Mapping

| Doris Data type | SeaTunnel Data type |
|-----------------|---------------------|
| INT | INT |
| TINYINT | TINYINT |
| SMALLINT | SMALLINT |
| BIGINT | BIGINT |
| LARGEINT | STRING |
| BOOLEAN | BOOLEAN |
| DECIMAL | DECIMAL(column size + 1, scale), where column size is the designated column's specified size and scale is its number of digits to the right of the decimal point |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| CHAR, VARCHAR, STRING, TEXT | STRING |
| DATE | DATE |
| DATETIME, DATETIME(p) | TIMESTAMP |
| ARRAY | ARRAY |

Source Options

Base configuration:

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| fenodes | string | yes | - | FE address, in the format "fe_host:fe_http_port" |
| username | string | yes | - | Username |
| password | string | yes | - | User password |
| doris.request.retries | int | no | 3 | Number of retries when sending requests to Doris FE. |
| doris.request.read.timeout.ms | int | no | 30000 | |
| doris.request.connect.timeout.ms | int | no | 30000 | |
| query-port | string | no | 9030 | Doris QueryPort |
| doris.request.query.timeout.s | int | no | 3600 | Timeout for a Doris data scan, in seconds. |
| table_list | string | | - | Table list |
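
As an illustration of how the base options above fit into a job file, the following sketch sets the retry and timeout options explicitly. The host, database, and table names are taken from the examples on this page; the non-default values are arbitrary assumptions chosen only to show the syntax, not tuning recommendations.

```hocon
source {
  Doris {
    # Required connection settings
    fenodes = "doris_e2e:8030"   # "fe_host:fe_http_port"
    username = root
    password = ""

    # Optional settings; the values below are illustrative assumptions
    query-port = 9030                          # same as the default
    doris.request.retries = 5                  # default is 3
    doris.request.connect.timeout.ms = 10000   # default is 30000
    doris.request.read.timeout.ms = 60000      # default is 30000
    doris.request.query.timeout.s = 1800       # default is 3600

    database = "e2e_source"
    table = "doris_e2e_table"
  }
}
```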

Table list configuration:

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| database | string | yes | - | The name of the Doris database |
| table | string | yes | - | The name of the Doris table |
| doris.read.field | string | no | - | Selects the Doris table columns to read |
| doris.filter.query | string | no | - | Filters data in Doris. The format is "field = value", for example: doris.filter.query = "F_ID > 2" |
| doris.batch.size | int | no | 1024 | The maximum number of rows read from Doris BE in a single request. |
| doris.exec.mem.limit | long | no | 2147483648 | Maximum memory that a single BE scan request can use. The default is 2G (2147483648 bytes). |
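
A minimal sketch of a table_list entry that uses every per-table option from the table above. The column names F_ID and F_INT come from the examples on this page; the batch size and memory limit values are assumptions for illustration only.

```hocon
source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    table_list = [
      {
        database = "e2e_source"
        table = "doris_e2e_table"
        # Read only these columns
        doris.read.field = "F_ID,F_INT"
        # Filter expression evaluated by Doris
        doris.filter.query = "F_ID > 2"
        # Illustrative values, not tuning advice
        doris.batch.size = 2048
        doris.exec.mem.limit = 1073741824   # 1 GiB, half of the 2G default
      }
    ]
  }
}
```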

Note: When the configuration covers only a single table, you can move the configuration items from table_list to the outer level of the Doris block.
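
To make the note concrete, the sketch below shows the same single-table source written both ways. It assumes, as the note states, that the two forms are interchangeable for one table; use one form or the other in a real job, not both.

```hocon
# Form 1: a table_list with a single entry
source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    table_list = [
      {
        database = "e2e_source"
        table = "doris_e2e_table"
        doris.filter.query = "F_ID > 2"
      }
    ]
  }
}

# Form 2: the same items flattened to the outer level
source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    doris.filter.query = "F_ID > 2"
  }
}
```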

Tips

Modifying the advanced parameters arbitrarily is not recommended.

Example

Single table

This is an example of reading a Doris table and writing to Console.

env {
  parallelism = 2
  job.mode = "BATCH"
}

source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}

Use the 'doris.read.field' parameter to select the Doris table columns to read.

env {
  parallelism = 2
  job.mode = "BATCH"
}

source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    doris.read.field = "F_ID,F_INT,F_BIGINT,F_TINYINT,F_SMALLINT"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}

Use the 'doris.filter.query' parameter to filter the data. The parameter value is passed directly to Doris.

env {
  parallelism = 2
  job.mode = "BATCH"
}

source {
  Doris {
    fenodes = "doris_e2e:8030"
    username = root
    password = ""
    database = "e2e_source"
    table = "doris_e2e_table"
    doris.filter.query = "F_ID > 2"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  Console {}
}

Multiple tables

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Doris {
    fenodes = "xxxx:8030"
    username = root
    password = ""
    table_list = [
      {
        database = "st_source_0"
        table = "doris_table_0"
        doris.read.field = "F_ID,F_INT,F_BIGINT,F_TINYINT"
        doris.filter.query = "F_ID >= 50"
      },
      {
        database = "st_source_1"
        table = "doris_table_1"
      }
    ]
  }
}

transform {}

sink {
  Doris {
    fenodes = "xxxx:8030"
    schema_save_mode = "RECREATE_SCHEMA"
    username = root
    password = ""
    database = "st_sink"
    table = "${table_name}"
    sink.enable-2pc = "true"
    sink.label-prefix = "test_json"
    doris.config = {
      format = "json"
      read_json_by_line = "true"
    }
  }
}

Changelog

| Change | Commit | Version |
|--------|--------|---------|
| [Improve] doris options (#8745) | https://github.com/apache/seatunnel/commit/268d76cbf3 | dev |
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6eeb | dev |
| [Fix][Connector-V2] fix starRocks automatically creates tables with comment (#8568) | https://github.com/apache/seatunnel/commit/c4cb1fc4a3 | dev |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d6 | dev |
| [Fix][Doris] Fix catalog not closed (#8415) | https://github.com/apache/seatunnel/commit/2d1db66b9f | 2.3.9 |
| [Feature]Connector-V2[Doris]Support sink ddl (#8250) | https://github.com/apache/seatunnel/commit/ecd8269f2e | 2.3.9 |
| [Feature][Connector-V2]Support Doris Fe Node HA (#8311) | https://github.com/apache/seatunnel/commit/3e86102f47 | 2.3.9 |
| [Feature][Core] Support read arrow data (#8137) | https://github.com/apache/seatunnel/commit/4710ea0f8d | 2.3.9 |
| [Feature][Clickhouse] Support sink savemode (#8086) | https://github.com/apache/seatunnel/commit/e6f92fd79b | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef800016 | 2.3.9 |
| [Feature][Doris] Support multi-table source read (#7895) | https://github.com/apache/seatunnel/commit/10c37acb34 | 2.3.9 |
| [Improve][Connector-V2] Add doris/starrocks create table with comment (#7847) | https://github.com/apache/seatunnel/commit/207b8c16fd | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03c | 2.3.9 |
| [Fixbug] doris custom sql work (#7464) | https://github.com/apache/seatunnel/commit/5c6a7c6984 | 2.3.8 |
| [Improve][API] Move catalog open to SaveModeHandler (#7439) | https://github.com/apache/seatunnel/commit/8c2c5c79a1 | 2.3.8 |
| [Improve][Connector-V2] Close all ResultSet after used (#7389) | https://github.com/apache/seatunnel/commit/853e973212 | 2.3.8 |
| Revert "[Fix][Connector-V2] Fix doris primary key order and fields order are inconsistent (#7377)" (#7402) | https://github.com/apache/seatunnel/commit/bb72d91770 | 2.3.8 |
| [Fix][Connector-V2] Fix doris primary key order and fields order are inconsistent (#7377) | https://github.com/apache/seatunnel/commit/464da8fb9b | 2.3.7 |
| [Bugfix][Doris-connector] Fix Json serialization, null value causes data error problem | https://github.com/apache/seatunnel/commit/7b19df585f | 2.3.7 |
| [Improve][Connector-V2] Improve doris error msg (#7343) | https://github.com/apache/seatunnel/commit/16950a67cd | 2.3.7 |
| [Fix][Doris] Fix the abnormality of deleting data in CDC scenario. (#7315) | https://github.com/apache/seatunnel/commit/bb2c912404 | 2.3.7 |
| fix [Bug] Unable to create a source for identifier 'Iceberg'. #7182 (#7279) | https://github.com/apache/seatunnel/commit/4897491708 | 2.3.7 |
| [Fix][Connector-V2] Fix doris TRANSFER_ENCODING header error (#7267) | https://github.com/apache/seatunnel/commit/d886495584 | 2.3.6 |
| [Improve][Doris Connector] Unified serialization method,Use RowToJsonConverter and TextSerializationSchema (#7229) | https://github.com/apache/seatunnel/commit/4b3af9bef4 | 2.3.6 |
| [Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131) | https://github.com/apache/seatunnel/commit/c4ca74122c | 2.3.6 |
| [Improve][Zeta] Move SaveMode behavior to master (#6843) | https://github.com/apache/seatunnel/commit/80cf91318d | 2.3.6 |
| [bugFix][Connector-V2][Doris] The multi-FE configuration is supported (#6341) | https://github.com/apache/seatunnel/commit/b6d075194b | 2.3.6 |
| [Feature][Doris] Add Doris type converter (#6354) | https://github.com/apache/seatunnel/commit/5189991843 | 2.3.6 |
| [Improve] Improve doris create table template default value (#6720) | https://github.com/apache/seatunnel/commit/bd64740314 | 2.3.6 |
| [Bug Fix] Sink Doris error status(#6753) (#6755) | https://github.com/apache/seatunnel/commit/0ce2c0f220 | 2.3.6 |
| [Improve] Improve doris stream load client side error message (#6688) | https://github.com/apache/seatunnel/commit/007a9940e3 | 2.3.6 |
| [Fix][Connector-v2] Fix the sql statement error of create table for doris and starrocks (#6679) | https://github.com/apache/seatunnel/commit/88263cd69f | 2.3.6 |
| [Fix][Connector-V2] Fixed doris/starrocks create table sql parse error (#6580) | https://github.com/apache/seatunnel/commit/f2ed1fbde0 | 2.3.5 |
| [Fix][Connector-V2] Fix doris sink can not be closed when stream load not read any data (#6570) | https://github.com/apache/seatunnel/commit/341615f488 | 2.3.5 |
| [Fix][Connector-V2] Fix connector support SPI but without no args constructor (#6551) | https://github.com/apache/seatunnel/commit/5f3c9c36a5 | 2.3.5 |
| [Improve] Add SaveMode log of process detail (#6375) | https://github.com/apache/seatunnel/commit/b0d70ce224 | 2.3.5 |
| [Feature] Support nanosecond in Doris DateTimeV2 type (#6358) | https://github.com/apache/seatunnel/commit/76967066bf | 2.3.5 |
| [Fix][Connector-V2] Fix doris source select fields loss primary key information (#6339) | https://github.com/apache/seatunnel/commit/78abe2f202 | 2.3.5 |
| [Improve][API] Unify type system api(data & type) (#5872) | https://github.com/apache/seatunnel/commit/b38c7edcc9 | 2.3.5 |
| [Fix] Fix doris stream load failed not reported error (#6315) | https://github.com/apache/seatunnel/commit/a09a5a2bb8 | 2.3.5 |
| [Improve][Connector-V2] Doris stream load use FE instead of BE (#6235) | https://github.com/apache/seatunnel/commit/0a7acdce95 | 2.3.4 |
| [Feature][Connector-V2][Doris] Add Doris ConnectorV2 Source (#6161) | https://github.com/apache/seatunnel/commit/fc2d80382a | 2.3.4 |
| [Improve] Improve doris sink to random use be (#6132) | https://github.com/apache/seatunnel/commit/869417660e | 2.3.4 |
| [Feature] Support SaveMode on Doris (#6085) | https://github.com/apache/seatunnel/commit/b2375fffe8 | 2.3.4 |
| [Improve] Add batch flush in doris sink (#6024) | https://github.com/apache/seatunnel/commit/2c5b48e907 | 2.3.4 |
| [Fix] Fix DorisCatalog not implement name method (#5988) | https://github.com/apache/seatunnel/commit/d4a323efef | 2.3.4 |
| [Feature][Catalog] Doris Catalog (#5175) | https://github.com/apache/seatunnel/commit/1d3e335d8e | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b2 | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de7408100 | 2.3.4 |
| [Improve][Connector] Add field name to DataTypeConvertor to improve error message (#5782) | https://github.com/apache/seatunnel/commit/ab60790f0d | 2.3.4 |
| [Chore] Using try-with-resources to simplify the code. (#4995) | https://github.com/apache/seatunnel/commit/d0aff52425 | 2.3.4 |
| [Fix] Fix RestService report NullPointerException (#5319) | https://github.com/apache/seatunnel/commit/5d4b319477 | 2.3.4 |
| [feature][doris] Doris factory type (#5061) | https://github.com/apache/seatunnel/commit/d952cea43c | 2.3.3 |
| [Bug][connector-v2][doris] add streamload Content-type for doris URLdecode error (#4880) | https://github.com/apache/seatunnel/commit/1b91816021 | 2.3.3 |
| [Bug][Connector-V2][Doris] update last checkpoint id when doing snapshot (#4881) | https://github.com/apache/seatunnel/commit/0360e7e518 | 2.3.2 |
| [Improve] Add a jobId to the doris label to distinguish between tasks (#4839) | https://github.com/apache/seatunnel/commit/6672e94077 | 2.3.2 |
| [BUG][Doris] Add a jobId to the doris label to distinguish between tasks (#4853) | https://github.com/apache/seatunnel/commit/20ee2faecf | 2.3.2 |
| [Improve][Connector-V2][Doris]Remove serialization code that is no longer used (#4313) | https://github.com/apache/seatunnel/commit/0c0e5f978e | 2.3.1 |
| [Improve][Connector-V2][Doris] Refactor some Doris Sink code as well as support 2pc and cdc (#4235) | https://github.com/apache/seatunnel/commit/7c4005af85 | 2.3.1 |
| [Hotfix][Connector][Doris] Fix Content Length header already present (#4277) | https://github.com/apache/seatunnel/commit/df82b77153 | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd601051 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab166561 | 2.3.1 |
| [Improve][Connector-V2][Doris] Change Doris Config Prefix (#3856) | https://github.com/apache/seatunnel/commit/16e39a506b | 2.3.1 |
| [Feature][Connector-V2][Doris] Add Doris StreamLoad sink connector (#3631) | https://github.com/apache/seatunnel/commit/72158be395 | 2.3.0 |