IoTDB
IoTDB sink connector
Support Those Engines
Spark
Flink
SeaTunnel Zeta
Description
Used to write data to IoTDB.
Using Dependency
For Spark/Flink Engine
- You need to ensure that the jdbc driver jar package has been placed in directory
${SEATUNNEL_HOME}/plugins/
.
For SeaTunnel Zeta Engine
- You need to ensure that the jdbc driver jar package has been placed in directory
${SEATUNNEL_HOME}/lib/
.
Key Features
IoTDB supports the exactly-once
feature through idempotent writing. If two pieces of data have
the same key
and timestamp
, the new data will overwrite the old one.
There is a conflict of thrift version between IoTDB and Spark.Therefore, you need to execute rm -f $SPARK_HOME/jars/libthrift*
and cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/
to resolve it.
Supported DataSource Info
Datasource | Supported Versions | Url |
---|---|---|
IoTDB | >= 0.13.0 | localhost:6667 |
Data Type Mapping
IotDB Data Type | SeaTunnel Data Type |
---|---|
BOOLEAN | BOOLEAN |
INT32 | TINYINT |
INT32 | SMALLINT |
INT32 | INT |
INT64 | BIGINT |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
TEXT | STRING |
Sink Options
Name | Type | Required | Default | Description |
---|---|---|---|---|
node_urls | String | Yes | - | IoTDB cluster address, the format is "host1:port" or "host1:port,host2:port" |
username | String | Yes | - | IoTDB user username |
password | String | Yes | - | IoTDB user password |
key_device | String | Yes | - | Specify field name of the IoTDB deviceId in SeaTunnelRow |
key_timestamp | String | No | processing time | Specify field-name of the IoTDB timestamp in SeaTunnelRow. If not specified, use processing-time as timestamp |
key_measurement_fields | Array | No | exclude device & timestamp | Specify field-name of the IoTDB measurement list in SeaTunnelRow. If not specified, include all fields but exclude device & timestamp |
storage_group | Array | No | - | Specify device storage group(path prefix) example: deviceId = ${storage_group} + "." + ${key_device} |
batch_size | Integer | No | 1024 | For batch writing, when the number of buffers reaches the number of batch_size or the time reaches batch_interval_ms , the data will be flushed into the IoTDB |
max_retries | Integer | No | - | The number of retries to flush failed |
retry_backoff_multiplier_ms | Integer | No | - | Using as a multiplier for generating the next delay for backoff |
max_retry_backoff_ms | Integer | No | - | The amount of time to wait before attempting to retry a request to IoTDB |
default_thrift_buffer_size | Integer | No | - | Thrift init buffer size in IoTDB client |
max_thrift_frame_size | Integer | No | - | Thrift max frame size in IoTDB client |
zone_id | string | No | - | java.time.ZoneId in IoTDB client |
enable_rpc_compression | Boolean | No | - | Enable rpc compression in IoTDB client |
connection_timeout_in_ms | Integer | No | - | The maximum time (in ms) to wait when connecting to IoTDB |
common-options | no | - | Sink plugin common parameters, please refer to Sink Common Options for details |
Examples
env {
parallelism = 2
job.mode = "BATCH"
}
source {
FakeSource {
row.num = 16
bigint.template = [1664035200001]
schema = {
fields {
device_name = "string"
temperature = "float"
moisture = "int"
event_ts = "bigint"
c_string = "string"
c_boolean = "boolean"
c_tinyint = "tinyint"
c_smallint = "smallint"
c_int = "int"
c_bigint = "bigint"
c_float = "float"
c_double = "double"
}
}
}
}
Upstream SeaTunnelRow data format is the following:
device_name | temperature | moisture | event_ts | c_string | c_boolean | c_tinyint | c_smallint | c_int | c_bigint | c_float | c_double |
---|---|---|---|---|---|---|---|---|---|---|---|
root.test_group.device_a | 36.1 | 100 | 1664035200001 | abc1 | true | 1 | 1 | 1 | 2147483648 | 1.0 | 1.0 |
root.test_group.device_b | 36.2 | 101 | 1664035200001 | abc2 | false | 2 | 2 | 2 | 2147483649 | 2.0 | 2.0 |
root.test_group.device_c | 36.3 | 102 | 1664035200001 | abc3 | false | 3 | 3 | 3 | 2147483649 | 3.0 | 3.0 |
Case1
only fill required config.
use current processing time as timestamp. and include all fields but exclude device
& timestamp
as measurement fields
sink {
IoTDB {
node_urls = "localhost:6667"
username = "root"
password = "root"
key_device = "device_name" # specify the `deviceId` use device_name field
}
}
Output to IoTDB
data format is the following:
IoTDB> SELECT * FROM root.test_group.* align by device;
+------------------------+------------------------+--------------+-----------+--------------+---------+----------+----------+-----------+------+-----------+--------+---------+
| Time| Device| temperature| moisture| event_ts| c_string| c_boolean| c_tinyint| c_smallint| c_int| c_bigint| c_float| c_double|
+------------------------+------------------------+--------------+-----------+--------------+---------+----------+----------+-----------+------+-----------+--------+---------+
|2023-09-01T00:00:00.001Z|root.test_group.device_a| 36.1| 100| 1664035200001| abc1| true| 1| 1| 1| 2147483648| 1.0| 1.0|
|2023-09-01T00:00:00.001Z|root.test_group.device_b| 36.2| 101| 1664035200001| abc2| false| 2| 2| 2| 2147483649| 2.0| 2.0|
|2023-09-01T00:00:00.001Z|root.test_group.device_c| 36.3| 102| 1664035200001| abc2| false| 3| 3| 3| 2147483649| 3.0| 3.0|
+------------------------+------------------------+--------------+-----------+--------------+---------+---------+-----------+-----------+------+-----------+--------+---------+
Case2
use source event's time
sink {
IoTDB {
node_urls = "localhost:6667"
username = "root"
password = "root"
key_device = "device_name" # specify the `deviceId` use device_name field
key_timestamp = "event_ts" # specify the `timestamp` use event_ts field
}
}
Output to IoTDB
data format is the following:
IoTDB> SELECT * FROM root.test_group.* align by device;
+------------------------+------------------------+--------------+-----------+--------------+---------+----------+----------+-----------+------+-----------+--------+---------+
| Time| Device| temperature| moisture| event_ts| c_string| c_boolean| c_tinyint| c_smallint| c_int| c_bigint| c_float| c_double|
+------------------------+------------------------+--------------+-----------+--------------+---------+----------+----------+-----------+------+-----------+--------+---------+
|2022-09-25T00:00:00.001Z|root.test_group.device_a| 36.1| 100| 1664035200001| abc1| true| 1| 1| 1| 2147483648| 1.0| 1.0|
|2022-09-25T00:00:00.001Z|root.test_group.device_b| 36.2| 101| 1664035200001| abc2| false| 2| 2| 2| 2147483649| 2.0| 2.0|
|2022-09-25T00:00:00.001Z|root.test_group.device_c| 36.3| 102| 1664035200001| abc2| false| 3| 3| 3| 2147483649| 3.0| 3.0|
+------------------------+------------------------+--------------+-----------+--------------+---------+---------+-----------+-----------+------+-----------+--------+---------+
Case3
use source event's time and limit measurement fields
sink {
IoTDB {
node_urls = "localhost:6667"
username = "root"
password = "root"
key_device = "device_name"
key_timestamp = "event_ts"
key_measurement_fields = ["temperature", "moisture"]
}
}
Output to IoTDB
data format is the following:
IoTDB> SELECT * FROM root.test_group.* align by device;
+------------------------+------------------------+--------------+-----------+
| Time| Device| temperature| moisture|
+------------------------+------------------------+--------------+-----------+
|2022-09-25T00:00:00.001Z|root.test_group.device_a| 36.1| 100|
|2022-09-25T00:00:00.001Z|root.test_group.device_b| 36.2| 101|
|2022-09-25T00:00:00.001Z|root.test_group.device_c| 36.3| 102|
+------------------------+------------------------+--------------+-----------+
Changelog
Change Log
Change | Commit | Version |
---|---|---|
[improve] iotdb options (#8965) | https://github.com/apache/seatunnel/commit/6e073935f | 2.3.10 |
[Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
[Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
[Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
[Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
[Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
Support config column/primaryKey/constraintKey in schema (#5564) | https://github.com/apache/seatunnel/commit/eac76b4e5 | 2.3.4 |
[Doc] update iotdb document (#5404) | https://github.com/apache/seatunnel/commit/856aedb3c | 2.3.4 |
[Improve][Connector-V2] Remove scheduler in IoTDB sink (#5270) | https://github.com/apache/seatunnel/commit/299637868 | 2.3.4 |
[Hotfix] Fix com.google.common.base.Preconditions to seatunnel shade one (#5284) | https://github.com/apache/seatunnel/commit/ed5eadcf7 | 2.3.3 |
Merge branch 'dev' into merge/cdc | https://github.com/apache/seatunnel/commit/4324ee191 | 2.3.1 |
[Improve][Project] Code format with spotless plugin. | https://github.com/apache/seatunnel/commit/423b58303 | 2.3.1 |
[improve][api] Refactoring schema parse (#4157) | https://github.com/apache/seatunnel/commit/b2f573a13 | 2.3.1 |
[Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
[Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
[Improve][SourceConnector] Unified schema parameter, update IoTDB sou… (#3896) | https://github.com/apache/seatunnel/commit/a0959c5fd | 2.3.1 |
[Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb8 | 2.3.1 |
[Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba | 2.3.1 |
[Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a11 | 2.3.0 |
[Improve][Connector-V2][Iotdb] Unified exception for iotdb source & sink connector (#3557) | https://github.com/apache/seatunnel/commit/7353fed6d | 2.3.0 |
[Feature][Connector V2] expose configurable options in IoTDB (#3387) | https://github.com/apache/seatunnel/commit/06359ea76 | 2.3.0 |
[Improve][Connector-V2][IotDB]Add IotDB sink parameter check (#3412) | https://github.com/apache/seatunnel/commit/91240a3dc | 2.3.0 |
[Bug][Connector-v2] Fix IoTDB connector sink NPE (#3080) | https://github.com/apache/seatunnel/commit/e5edf0243 | 2.3.0-beta |
[Imporve][Connector-V2] Imporve iotdb connector (#2917) | https://github.com/apache/seatunnel/commit/3da11ce19 | 2.3.0-beta |
[DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706) | https://github.com/apache/seatunnel/commit/cbf82f755 | 2.2.0-beta |
[#2606]Dependency management split (#2630) | https://github.com/apache/seatunnel/commit/fc047be69 | 2.2.0-beta |
[chore][connector-common] Rename SeatunnelSchema to SeaTunnelSchema (#2538) | https://github.com/apache/seatunnel/commit/7dc2a2738 | 2.2.0-beta |
[Connectors-V2]Support IoTDB Source (#2431) | https://github.com/apache/seatunnel/commit/7b78d6c92 | 2.2.0-beta |
[Feature][Connector-V2] Support IoTDB sink (#2407) | https://github.com/apache/seatunnel/commit/c1bbbd59d | 2.2.0-beta |