Skip to main content
Version: 2.3.0-beta

FakeSource

FakeSource connector

Description​

The FakeSource is a virtual data source, which randomly generates the number of rows according to the data structure of the user-defined schema, just for some test cases such as type conversion or connector new feature testing

Key features​

Options​

nametyperequireddefault value
schemaconfigyes-
row.numintno5
split.numintno1
split.read-intervallongno1
map.sizeintno5
array.sizeintno5
bytes.lengthintno5
string.lengthintno5
common-optionsno-

schema [config]​

fields [Config]​

The schema of fake data that you want to generate

common options​

Source plugin common parameters, please refer to Source Common Options for details

Examples​

  schema = {
fields {
c_map = "map<string, array<int>>"
c_array = "array<int>"
c_string = string
c_boolean = boolean
c_tinyint = tinyint
c_smallint = smallint
c_int = int
c_bigint = bigint
c_float = float
c_double = double
c_decimal = "decimal(30, 8)"
c_null = "null"
c_bytes = bytes
c_date = date
c_timestamp = timestamp
c_row = {
c_map = "map<string, map<string, string>>"
c_array = "array<int>"
c_string = string
c_boolean = boolean
c_tinyint = tinyint
c_smallint = smallint
c_int = int
c_bigint = bigint
c_float = float
c_double = double
c_decimal = "decimal(30, 8)"
c_null = "null"
c_bytes = bytes
c_date = date
c_timestamp = timestamp
}
}
}

row.num​

The total number of data generated per degree of parallelism

split.num​

the number of splits generated by the enumerator for each degree of parallelism

split.read-interval​

The interval(mills) between two split reads in a reader

map.size​

The size of map type that connector generated

array.size​

The size of array type that connector generated

bytes.length​

The length of bytes type that connector generated

string.length​

The length of string type that connector generated

Example​

FakeSource {
row.num = 10
map.size = 10
array.size = 10
bytes.length = 10
string.length = 10
schema = {
fields {
c_map = "map<string, array<int>>"
c_array = "array<int>"
c_string = string
c_boolean = boolean
c_tinyint = tinyint
c_smallint = smallint
c_int = int
c_bigint = bigint
c_float = float
c_double = double
c_decimal = "decimal(30, 8)"
c_null = "null"
c_bytes = bytes
c_date = date
c_timestamp = timestamp
c_row = {
c_map = "map<string, map<string, string>>"
c_array = "array<int>"
c_string = string
c_boolean = boolean
c_tinyint = tinyint
c_smallint = smallint
c_int = int
c_bigint = bigint
c_float = float
c_double = double
c_decimal = "decimal(30, 8)"
c_null = "null"
c_bytes = bytes
c_date = date
c_timestamp = timestamp
}
}
}
}

Changelog​

2.2.0-beta 2022-09-26​

  • Add FakeSource Source Connector

2.3.0-beta 2022-10-20​

  • [Improve] Supports direct definition of data values(row) (2839)
  • [Improve] Improve fake source connector: (2944)
    • Support user-defined map size
    • Support user-defined array size
    • Support user-defined string length
    • Support user-defined bytes length
  • [Improve] Support multiple splits for fake source connector (2974)
  • [Improve] Supports setting the number of splits per parallelism and the reading interval between two splits (3098)