File to StarRocks
Use this recipe when you want to import local CSV or text files into StarRocks for fast analytical queries.
Prerequisites
- Complete the "Run your first job" guide.
- Install the plugins required by this recipe. Follow Deployment > Download The Connector Plugins, then keep only the plugins below in config/plugin_config:
--seatunnel-connectors--
connector-file-local
connector-starrocks
--end--
cd "${SEATUNNEL_HOME}"
sh bin/install-plugin.sh
ls connectors | rg 'connector-(file-local|starrocks)'
- Put the MySQL JDBC driver required by the StarRocks sink into
${SEATUNNEL_HOME}/lib, then confirm the jar is visible:
ls "${SEATUNNEL_HOME}/lib" | rg 'mysql-connector'
- Prepare the local input file and make sure the SeaTunnel process can read it:
mkdir -p /tmp/seatunnel/input
cat <<'EOF' > /tmp/seatunnel/input/customers.csv
id,name,city,updated_at
1001,Alice,Shanghai,2026-06-12 10:00:00
1002,Bob,Beijing,2026-06-12 10:05:00
1003,Carol,Hangzhou,2026-06-12 10:10:00
EOF
- Create the target database and table in StarRocks before running the job.
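The input-file prerequisite above can be checked before the job runs. The following is a minimal sketch, not part of SeaTunnel: `check_input` is a hypothetical helper that only verifies the file is readable by the current user and that its first line matches the expected header. It is only meaningful if you run it as the same OS user that will run the SeaTunnel process.

```shell
# Hypothetical helper (not a SeaTunnel command): succeed only if the file
# is readable and its first line equals the expected header.
check_input() {
  file=$1
  expected_header=$2
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 1
  fi
  if [ "$(head -n 1 "$file")" != "$expected_header" ]; then
    echo "unexpected header in $file" >&2
    return 1
  fi
  echo "input ok: $file"
}

# Check the sample file from this recipe; "|| true" keeps a failed check
# from aborting an interactive shell session.
check_input /tmp/seatunnel/input/customers.csv 'id,name,city,updated_at' || true
```

A header mismatch here usually points at the same problem that would later surface as a schema or delimiter error in the job.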
Minimal configuration
This example reads a local CSV file with a header line and writes the rows to an existing StarRocks primary-key table.
Prepare the target table first:
CREATE DATABASE IF NOT EXISTS sync_demo;
CREATE TABLE IF NOT EXISTS sync_demo.customers (
id BIGINT NOT NULL,
name STRING,
city STRING,
updated_at DATETIME
)
ENGINE=OLAP
PRIMARY KEY(id)
DISTRIBUTED BY HASH(id)
PROPERTIES (
"replication_num" = "1"
);
Then define the job configuration:
env {
parallelism = 1
job.mode = "BATCH"
}
source {
LocalFile {
plugin_output = "customers_file"
path = "/tmp/seatunnel/input/customers.csv"
file_format_type = "csv"
csv_use_header_line = true
schema = {
fields {
id = bigint
name = string
city = string
updated_at = timestamp
}
}
}
}
sink {
StarRocks {
plugin_input = "customers_file"
nodeUrls = ["starrocks-fe:8030"]
base-url = "jdbc:mysql://starrocks-fe:9030/sync_demo"
username = "root"
password = ""
database = "sync_demo"
table = "customers"
batch_max_rows = 1000
schema_save_mode = "IGNORE"
starrocks.config = {
format = "JSON"
strip_outer_array = true
}
}
}
Run the job
Save the config as config/file-to-starrocks.conf, then run SeaTunnel in local mode:
cd "${SEATUNNEL_HOME}"
./bin/seatunnel.sh --config ./config/file-to-starrocks.conf -m local
Validation result
- Run the job and confirm it finishes without StarRocks stream load errors.
- Check the target table in StarRocks.
SELECT COUNT(*) FROM sync_demo.customers;
SELECT id, name, city, updated_at FROM sync_demo.customers ORDER BY id;
If the imported rows in StarRocks match the file content, the pipeline is working.
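The row-count comparison can also be scripted. This is a sketch under stated assumptions: the `mysql` command-line client is installed, the StarRocks FE accepts MySQL-protocol connections at `starrocks-fe:9030` (taken from the `base-url` in the config), and `root` has an empty password. The function names `file_rows`, `table_rows`, and `compare_counts` are illustrative, not part of SeaTunnel or StarRocks.

```shell
input=/tmp/seatunnel/input/customers.csv

# Number of non-empty data lines in the file, excluding the header line.
file_rows() {
  tail -n +2 "$1" | grep -c .
}

# Row count of the target table, printed as a bare number.
# Assumes host, port, and user match the sink config in this recipe.
table_rows() {
  mysql -h starrocks-fe -P 9030 -u root -N -B \
    -e 'SELECT COUNT(*) FROM sync_demo.customers'
}

# Compare two counts; return non-zero on mismatch.
compare_counts() {
  if [ "$1" -eq "$2" ]; then
    echo "OK: $1 rows"
  else
    echo "MISMATCH: file=$1 table=$2"
    return 1
  fi
}

# Only run the comparison when the client and the input file are available.
if command -v mysql >/dev/null 2>&1 && [ -r "$input" ]; then
  compare_counts "$(file_rows "$input")" "$(table_rows)"
fi
```

Matching counts are a quick sanity check only. Because the target is a primary-key table, re-running the job with the same ids updates rows instead of adding them, so the `ORDER BY id` query above is still the better check of the actual values.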
Common pitfalls
- base-url is missing even though nodeUrls is configured.
- The file has a header row, but csv_use_header_line = true is not set.
- The source schema does not match the file delimiter or timestamp format.
- The target table was not created before the job. This recipe uses schema_save_mode = "IGNORE" because the local file source does not provide primary-key metadata for StarRocks auto DDL.