Alluxio
Output plugin : Alluxio
- Author: InterestingLab
- Homepage: https://interestinglab.github.io/seatunnel-docs
- Version: 1.5.0
Description
Writes rows to Alluxio.
Options
name | type | required | default value |
---|---|---|---|
options | object | no | - |
partition_by | array | no | - |
path | string | yes | - |
path_time_format | string | no | yyyyMMddHHmmss |
save_mode | string | no | error |
format | string | no | json |
options [object]
Custom parameters.
partition_by [array]
Partitions the output data by the given fields.
path [string]
The output file path on Alluxio. It must start with the alluxio:// scheme.
path_time_format [string]
If path contains a time variable, such as xxxx-${now}, path_time_format specifies the time format used to resolve that variable in the Alluxio path. The default is yyyyMMddHHmmss. The commonly used time format symbols are listed below:
Symbol | Description |
---|---|
y | Year |
M | Month |
d | Day of month |
H | Hour in day (0-23) |
m | Minute in hour |
s | Second in minute |
For the full time format syntax, see Java SimpleDateFormat.
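As an illustration of how the symbols combine, here is a minimal sketch. It assumes ${now} is replaced with the current time rendered in the configured pattern; the timestamp in the comments is hypothetical.

```
alluxio {
    # Hypothetical current time: 2019-01-15 08:30:00
    path = "alluxio:///var/logs-${now}"
    # "yyyyMMdd-HH" would render that time as 20190115-08, so the
    # resolved path would be alluxio:///var/logs-20190115-08
    path_time_format = "yyyyMMdd-HH"
}
```

Characters in the pattern that are not letters, such as the hyphen here, are copied into the path unchanged.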
save_mode [string]
Save mode. Supports overwrite, append, ignore, and error. For details on each mode, see save-modes.
format [string]
Output file format. Supports csv, json, parquet, and text.
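The options described above can be combined in one output block. The following is only a sketch: the field names date and hour are hypothetical, and the comments assume that save_mode behaves like the Spark save mode of the same name.

```
alluxio {
    path = "alluxio:///var/logs"
    # Hypothetical field names; use fields that exist in your rows.
    partition_by = ["date", "hour"]
    # Assumed Spark semantics: add to existing data at the path
    # instead of failing, which is what the default "error" would do.
    save_mode = "append"
    format = "parquet"
}
```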
Note
If you use Alluxio with ZooKeeper, add the following to start-seatunnel.sh:
driverJavaOpts="-Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.zookeeper.address=your.zookeeper.address:zookeeper.port -Dalluxio.zookeeper.enabled=true"
executorJavaOpts="-Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.zookeeper.address=your.zookeeper.address:zookeeper.port -Dalluxio.zookeeper.enabled=true"
Alternatively, in version 1.5.0 and later, you can add the following to the spark {} section of the seatunnel configuration:
spark.driverJavaOpts="-Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.zookeeper.address=your.zookeeper.address:zookeeper.port -Dalluxio.zookeeper.enabled=true"
spark.executorJavaOpts="-Dalluxio.user.file.writetype.default=CACHE_THROUGH -Dalluxio.zookeeper.address=your.zookeeper.address:zookeeper.port -Dalluxio.zookeeper.enabled=true"
Example
alluxio {
    path = "alluxio:///var/logs-${now}"
    format = "json"
    path_time_format = "yyyy.MM.dd"
}