JsonPath
JsonPath transform plugin
Description
Support use jsonpath select data
Options
| name | type | required | default value | 
|---|---|---|---|
| columns | Array | Yes | |
| row_error_handle_way | Enum | No | FAIL | 
common options [string]
Transform plugin common parameters, please refer to Transform Plugin for details
row_error_handle_way [Enum]
This option is used to specify the processing method when an error occurs in the row, the default value is FAIL.
- FAIL: When 
FAILis selected, data format error will block and an exception will be thrown. - SKIP: When 
SKIPis selected, data format error will skip this row data. 
columns[array]
option
| name | type | required | default value | 
|---|---|---|---|
| src_field | String | Yes | |
| dest_field | String | Yes | |
| path | String | Yes | |
| dest_type | String | No | String | 
| column_error_handle_way | Enum | No | 
src_field
the json source field you want to parse
Support SeatunnelDateType
- STRING
 - BYTES
 - ARRAY
 - MAP
 - ROW
 
dest_field
after use jsonpath output field
dest_type
the type of dest field
path
Jsonpath
column_error_handle_way [Enum]
This option is used to specify the processing method when an error occurs in the column.
- FAIL: When 
FAILis selected, data format error will block and an exception will be thrown. - SKIP: When 
SKIPis selected, data format error will skip this column data. - SKIP_ROW: When 
SKIP_ROWis selected, data format error will skip this row data. 
Read Json Example
The data read from source is a table like this json:
{
  "data": {
    "c_string": "this is a string",
    "c_boolean": true,
    "c_integer": 42,
    "c_float": 3.14,
    "c_double": 3.14,
    "c_decimal": 10.55,
    "c_date": "2023-10-29",
    "c_datetime": "16:12:43.459",
    "c_array":["item1", "item2", "item3"]
  }
}
Assuming we want to use JsonPath to extract properties.
transform {
  JsonPath {
    source_table_name = "fake"
    result_table_name = "fake1"
    columns = [
     {
        "src_field" = "data"
        "path" = "$.data.c_string"
        "dest_field" = "c1_string"
     },
     {
        "src_field" = "data"
        "path" = "$.data.c_boolean"
        "dest_field" = "c1_boolean"
        "dest_type" = "boolean"
     },
     {
        "src_field" = "data"
        "path" = "$.data.c_integer"
        "dest_field" = "c1_integer"
        "dest_type" = "int"
     },
     {
        "src_field" = "data"
        "path" = "$.data.c_float"
        "dest_field" = "c1_float"
        "dest_type" = "float"
     },
     {
        "src_field" = "data"
        "path" = "$.data.c_double"
        "dest_field" = "c1_double"
        "dest_type" = "double"
     },
      {
         "src_field" = "data"
         "path" = "$.data.c_decimal"
         "dest_field" = "c1_decimal"
         "dest_type" = "decimal(4,2)"
      },
      {
         "src_field" = "data"
         "path" = "$.data.c_date"
         "dest_field" = "c1_date"
         "dest_type" = "date"
      },
      {
         "src_field" = "data"
         "path" = "$.data.c_datetime"
         "dest_field" = "c1_datetime"
         "dest_type" = "time"
      },
            {
         "src_field" = "data"
         "path" = "$.data.c_array"
         "dest_field" = "c1_array"
         "dest_type" = "array<string>"        
      }
    ]
  }
}
Then the data result table fake1 will like this
| data | c1_string | c1_boolean | c1_integer | c1_float | c1_double | c1_decimal | c1_date | c1_datetime | c1_array | 
|---|---|---|---|---|---|---|---|---|---|
| too much content not to show | this is a string | true | 42 | 3.14 | 3.14 | 10.55 | 2023-10-29 | 16:12:43.459 | ["item1", "item2", "item3"] | 
Read SeatunnelRow Example
Suppose a column in a row of data is of type SeatunnelRow and that the name of the column is col
| SeatunnelRow(col) | other | |
|---|---|---|
| name | age | .... | 
| a | 18 | .... | 
The JsonPath transform converts the values of seatunnel into an array,
transform {
  JsonPath {
    source_table_name = "fake"
    result_table_name = "fake1"
  
    row_error_handle_way = FAIL
    columns = [
     {
        "src_field" = "col"
        "path" = "$[0]"
        "dest_field" = "name"
        "dest_type" = "string"
     },
     {
        "src_field" = "col"
        "path" = "$[1]"
        "dest_field" = "age"
        "dest_type" = "int"
     }
    ]
  }
}
Then the data result table fake1 will like this
| name | age | col | other | 
|---|---|---|---|
| a | 18 | ["a",18] | ... | 
Configure error data handle way
You can configure row_error_handle_way and column_error_handle_way to handle abnormal data. Both are optional.
row_error_handle_way is used to handle all data anomalies in the row data, while column_error_handle_way is used to handle data anomalies in a column. It has a higher priority than row_error_handle_way.
Skip error data rows
Configure to skip row data with exceptions in any column
transform {
  JsonPath {
    row_error_handle_way = SKIP
    
    columns = [
     {
        "src_field" = "json_data"
        "path" = "$.f1"
        "dest_field" = "json_data_f1"
     },
     {
        "src_field" = "json_data"
        "path" = "$.f2"
        "dest_field" = "json_data_f2"
     }
    ]
  }
}
Skip error data column
Configure only json_data_f1 column data exceptions to skip and fill in null values, other column data exceptions will continue to throw exception interrupt handlers
transform {
  JsonPath {
    row_error_handle_way = FAIL
    
    columns = [
     {
        "src_field" = "json_data"
        "path" = "$.f1"
        "dest_field" = "json_data_f1"
        
        "column_error_handle_way" = "SKIP"
     },
     {
        "src_field" = "json_data"
        "path" = "$.f2"
        "dest_field" = "json_data_f2"
     }
    ]
  }
}
Skip the row for specified column error
Configure to skip the row of data only for json_data_f1 column data exceptions, and continue to throw exceptions to interrupt the handler for other column data exceptions
transform {
  JsonPath {
    row_error_handle_way = FAIL
    
    columns = [
     {
        "src_field" = "json_data"
        "path" = "$.f1"
        "dest_field" = "json_data_f1"
        
        "column_error_handle_way" = "SKIP_ROW"
     },
     {
        "src_field" = "json_data"
        "path" = "$.f2"
        "dest_field" = "json_data_f2"
     }
    ]
  }
}
Changelog
- Add JsonPath Transform