File
Source plugin: File [Flink]
Description
Read data from the file system.
Options
| name | type | required | default value | 
|---|---|---|---|
| format.type | string | yes | - | 
| path | string | yes | - | 
| schema | string | yes | - | 
| common-options | string | no | - | 
| parallelism | int | no | - | 
format.type [string]
The format of the files to read from the file system. Currently supported formats are csv, json, parquet, orc and text.
path [string]
The file path (required). HDFS paths start with hdfs://, and local file paths start with file://.
schema [string]
- csv - The schema of csv is a JSON array string, such as "[{\"type\":\"long\"},{\"type\":\"string\"}]". It can specify only the types of the fields, not their names, so the common configuration parameter field_name is generally required as well.
- json - The schema of json is a JSON string containing a sample of the original data, from which the schema is generated automatically. Provide the sample with the most complete content, otherwise fields will be lost.
- parquet - The schema of parquet is an Avro schema string, such as {\"type\":\"record\",\"name\":\"test\",\"fields\":[{\"name\":\"a\",\"type\":\"int\"},{\"name\":\"b\",\"type\":\"string\"}]}.
- orc - The schema of orc is an ORC schema string, such as "struct<name:string,addresses:array<struct<street:string,zip:smallint>>>".
- text - The schema of text can be set to string.
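As an illustration of the csv case, the sketch below combines a type-only schema with field_name. The local path, the column names and the table name are hypothetical, and the comma-separated form of field_name is an assumption; check the Source Plugin common parameters for the exact syntax.

```
FileSource {
  # Hypothetical local directory; local paths start with file://
  path = "file:///tmp/input/"
  format.type = "csv"
  # Field types only, listed in column order
  schema = "[{\"type\":\"long\"},{\"type\":\"string\"}]"
  # Assumed column names, supplied through the common parameter field_name
  field_name = "id,name"
  result_table_name = "test_csv"
}
```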
common options [string]
Common parameters of source plugins. Please refer to Source Plugin for details.
parallelism [int]
The parallelism of an individual operator, in this case FileSource.
Examples
  FileSource{
    path = "hdfs://localhost:9000/input/"
    format.type = "json"
    schema = "{\"data\":[{\"a\":1,\"b\":2},{\"a\":3,\"b\":4}],\"db\":\"string\",\"q\":{\"s\":\"string\"}}"
    result_table_name = "test"
  }
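
A second sketch, for an orc source with an explicit parallelism. The schema string is the one quoted in the schema section; the parallelism value and the table name are illustrative assumptions, not recommended settings.

```
FileSource {
  path = "hdfs://localhost:9000/input/"
  format.type = "orc"
  # ORC schema string describing the columns to read
  schema = "struct<name:string,addresses:array<struct<street:string,zip:smallint>>>"
  # Illustrative value; sets the parallelism of this FileSource operator only
  parallelism = 2
  result_table_name = "test_orc"
}
```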