Version: Next

File to StarRocks

Use this recipe when you want to import local CSV or text files into StarRocks for fast analytical queries.

Prerequisites

Complete the "Run your first job" guide.
Install the plugins required by this recipe. Follow Deployment > Download The Connector Plugins, then keep only the plugins below in config/plugin_config:

--seatunnel-connectors--
connector-file-local
connector-starrocks
--end--

cd "${SEATUNNEL_HOME}"
sh bin/install-plugin.sh
ls connectors | rg 'connector-(file-local|starrocks)'

Put the MySQL JDBC driver required by the StarRocks sink into ${SEATUNNEL_HOME}/lib, then confirm the jar is visible:

ls "${SEATUNNEL_HOME}/lib" | rg 'mysql-connector'

Prepare the local input file and make sure the SeaTunnel process can read it:

mkdir -p /tmp/seatunnel/input
cat <<'EOF' > /tmp/seatunnel/input/customers.csv
id,name,city,updated_at
1001,Alice,Shanghai,2026-06-12 10:00:00
1002,Bob,Beijing,2026-06-12 10:05:00
1003,Carol,Hangzhou,2026-06-12 10:10:00
EOF
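
Before running the job, it can help to confirm that the file is readable and looks the way the source schema expects. The following is a minimal sketch, not part of SeaTunnel: `check_csv` is a hypothetical helper name, and it only checks the header line and counts non-empty data rows.

```shell
# check_csv PATH EXPECTED_HEADER  (hypothetical helper, not a SeaTunnel command)
# Succeeds only if PATH is readable, its first line equals EXPECTED_HEADER,
# and at least one non-empty data row follows. Prints the data-row count.
check_csv() {
  csv_path="$1"
  csv_expected="$2"
  [ -r "$csv_path" ] || { echo "not readable: $csv_path" >&2; return 1; }
  # Strip a trailing carriage return in case the file was saved on Windows.
  csv_header="$(head -n 1 "$csv_path" | tr -d '\r')"
  [ "$csv_header" = "$csv_expected" ] \
    || { echo "unexpected header: $csv_header" >&2; return 1; }
  # grep -c exits non-zero when it counts 0 lines, so neutralize that status.
  csv_rows="$(tail -n +2 "$csv_path" | grep -c . || true)"
  [ "$csv_rows" -gt 0 ] || { echo "no data rows in $csv_path" >&2; return 1; }
  echo "$csv_rows"
}
```

Run it as the same OS user that starts SeaTunnel, for example `check_csv /tmp/seatunnel/input/customers.csv 'id,name,city,updated_at'`, so that the readability check reflects what the job will see.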

Create the target database and table in StarRocks before running the job.

Minimal configuration

This example reads a local CSV file with a header line and writes the rows to an existing StarRocks primary-key table.

Prepare the target table first:

CREATE DATABASE IF NOT EXISTS sync_demo;

CREATE TABLE IF NOT EXISTS sync_demo.customers (
  id BIGINT NOT NULL,
  name STRING,
  city STRING,
  updated_at DATETIME
)
ENGINE=OLAP
PRIMARY KEY(id)
DISTRIBUTED BY HASH(id)
PROPERTIES (
  "replication_num" = "1"
);
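
The recipe does not prescribe how to run these statements. One option, sketched below, is the `mysql` command-line client, since StarRocks accepts MySQL-protocol connections on the FE query port (9030 in the `base-url` used by this recipe). The helper name `apply_ddl` and the `SR_HOST`, `SR_QUERY_PORT`, and `SR_USER` variables are assumptions introduced for illustration; the defaults mirror the host, port, and passwordless `root` user from the sink configuration.

```shell
# apply_ddl SQL_FILE  (hypothetical helper)
# Assumes the `mysql` client is on PATH and the FE is reachable.
# SR_HOST / SR_QUERY_PORT / SR_USER are illustrative overrides, not SeaTunnel
# or StarRocks settings. No -p flag is passed because this recipe uses an
# empty password.
apply_ddl() {
  ddl_file="$1"
  [ -r "$ddl_file" ] || { echo "not readable: $ddl_file" >&2; return 1; }
  mysql -h "${SR_HOST:-starrocks-fe}" \
        -P "${SR_QUERY_PORT:-9030}" \
        -u "${SR_USER:-root}" < "$ddl_file"
}
```

With the two statements above saved to a file of your choice, `apply_ddl path/to/that-file.sql` creates the database and table. Because both statements use `IF NOT EXISTS`, running it again is harmless.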

Then define the job configuration:

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  LocalFile {
    plugin_output = "customers_file"
    path = "/tmp/seatunnel/input/customers.csv"
    file_format_type = "csv"
    csv_use_header_line = true
    schema = {
      fields {
        id = bigint
        name = string
        city = string
        updated_at = timestamp
      }
    }
  }
}

sink {
  StarRocks {
    plugin_input = "customers_file"
    nodeUrls = ["starrocks-fe:8030"]
    base-url = "jdbc:mysql://starrocks-fe:9030/sync_demo"
    username = "root"
    password = ""
    database = "sync_demo"
    table = "customers"
    batch_max_rows = 1000
    schema_save_mode = "IGNORE"
    starrocks.config = {
      format = "JSON"
      strip_outer_array = true
    }
  }
}

Run the job

Save the config as config/file-to-starrocks.conf, then run SeaTunnel in local mode:

cd "${SEATUNNEL_HOME}"
./bin/seatunnel.sh --config ./config/file-to-starrocks.conf -m local

Validation result

Confirm that the job finishes without StarRocks Stream Load errors, then query the target table in StarRocks:

SELECT COUNT(*) FROM sync_demo.customers;
SELECT id, name, city, updated_at FROM sync_demo.customers ORDER BY id;

If the count is 3 and the three rows match the contents of customers.csv, the pipeline is working.
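
The comparison can also be scripted. The sketch below counts the data rows in the file and compares that number with `COUNT(*)` from the table. It assumes the `mysql` client is installed and that the table holds only rows loaded from this file; the function names and the `SR_HOST`, `SR_QUERY_PORT`, and `SR_USER` variables are hypothetical.

```shell
# Number of non-empty data rows in a CSV that has one header line.
csv_data_rows() {
  tail -n +2 "$1" | grep -c . || true
}

# Row count of the target table via the mysql client (assumed to be on PATH).
# -N drops the column-name line, -B selects plain batch output.
table_rows() {
  mysql -h "${SR_HOST:-starrocks-fe}" \
        -P "${SR_QUERY_PORT:-9030}" \
        -u "${SR_USER:-root}" \
        -N -B -e 'SELECT COUNT(*) FROM sync_demo.customers'
}

# verify_load CSV_FILE: succeed only if file and table row counts agree.
verify_load() {
  expected_rows="$(csv_data_rows "$1")"
  actual_rows="$(table_rows)"
  if [ "$expected_rows" = "$actual_rows" ]; then
    echo "OK: $actual_rows rows"
  else
    echo "MISMATCH: file has $expected_rows rows, table has $actual_rows" >&2
    return 1
  fi
}
```

For example, `verify_load /tmp/seatunnel/input/customers.csv`. A count check does not compare column values, so keep the `SELECT ... ORDER BY id` query above for a row-by-row look.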

Common pitfalls

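base-url is missing even though nodeUrls is configured. This recipe sets both: nodeUrls points at the FE HTTP port (8030) used for loading, and base-url is the JDBC URL on the FE query port (9030).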
The file has a header row, but csv_use_header_line = true is not set.
The source schema does not match the file delimiter or timestamp format.
The target table was not created before the job. This recipe uses schema_save_mode = "IGNORE" because the local file source does not provide primary-key metadata for StarRocks auto DDL.
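
The first two pitfalls can be caught before the job starts with a plain text check of the config file. This is a rough sketch under stated assumptions: `lint_conf` is a hypothetical helper, it matches lines with `grep` rather than parsing HOCON, and it expects each option on its own line as in the example above.

```shell
# lint_conf CONF_FILE  (hypothetical helper; text matching, not a HOCON parser)
# Reports the two config omissions listed above and returns non-zero if
# either one is found.
lint_conf() {
  lint_file="$1"
  lint_status=0
  grep -Eq '^[[:space:]]*base-url[[:space:]]*=' "$lint_file" \
    || { echo "base-url is not set in the StarRocks sink" >&2; lint_status=1; }
  grep -Eq '^[[:space:]]*csv_use_header_line[[:space:]]*=[[:space:]]*true' "$lint_file" \
    || { echo "csv_use_header_line = true is not set" >&2; lint_status=1; }
  return "$lint_status"
}
```

For example, `lint_conf config/file-to-starrocks.conf && echo "config looks complete"`. The check cannot detect delimiter or timestamp mismatches; those only show up when the job reads the file.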
