Introduction to Connector Basics
GeaFlow reads data from and writes data to external systems through connectors. Each connector is exposed as an external table, and its metadata is stored in the Catalog.
Syntax
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] table_name (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'file',
  geaflow.dsl.file.path = '/path/to/file',
  geaflow.dsl.window.size = 1000
);
Declaring a table binds it to a Connector. Whether the table acts as a Source or a Sink is determined by how it is used: reading from it makes it a Source, and writing to it makes it a Sink.
TEMPORARY is used to create a temporary table that is not stored in the Catalog.
Without IF NOT EXISTS, creating a table whose name already exists throws an error.
The WITH clause specifies the configuration of the Connector. The type field is mandatory and selects which Connector to use; for example, 'file' selects the file Connector.
Table parameters can also be set in the WITH clause. They have the highest priority and override the same settings supplied externally (in the SQL file or the job parameters).
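The rules above can be seen together in a short sketch. The table names user_source and user_sink and the age filter are hypothetical and chosen only for illustration; the 'file' and 'console' types and the option keys are the ones used elsewhere on this page.

```sql
-- TEMPORARY: this definition is not stored in the Catalog.
-- IF NOT EXISTS: no error is thrown if user_source already exists.
CREATE TEMPORARY TABLE IF NOT EXISTS user_source (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'file',
  geaflow.dsl.file.path = '/path/to/file',
  -- Set on the table, so it takes priority over external configuration.
  geaflow.dsl.window.size = 1000
);

CREATE TABLE IF NOT EXISTS user_sink (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'console'
);

-- user_source is read from, so it acts as a Source;
-- user_sink is written to, so it acts as a Sink.
INSERT INTO user_sink
SELECT id, name, age
FROM user_source
WHERE age > 18;
```

Neither table is declared as a Source or a Sink; the role follows from the INSERT INTO ... SELECT statement alone.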
Common Options
| Key | Required | Description |
|---|---|---|
| type | true | Specifies the type of Connector to use. |
| geaflow.dsl.window.size | false | Important. -1 reads all data as a single window, which is batch processing. A positive value reads that many records per window, which is stream processing. |
| geaflow.dsl.partitions.per.source.parallelism | false | Groups several Source shards together, which reduces the resources consumed by concurrency. |
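As a rough illustration of how geaflow.dsl.window.size switches between batch and stream processing, the sketch below declares the same file twice. The table names, the window size of 500, and the partition grouping value of 2 are assumptions made for this example, and whether shard grouping has any effect depends on the Connector in use.

```sql
-- Batch: -1 reads the whole input as a single window.
CREATE TABLE IF NOT EXISTS batch_source (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'file',
  geaflow.dsl.file.path = '/path/to/file',
  geaflow.dsl.window.size = -1
);

-- Stream: a positive value reads that many records per window.
CREATE TABLE IF NOT EXISTS stream_source (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'file',
  geaflow.dsl.file.path = '/path/to/file',
  geaflow.dsl.window.size = 500,
  -- Illustrative value: group 2 Source shards together.
  geaflow.dsl.partitions.per.source.parallelism = 2
);
```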
Example
CREATE TABLE console_sink (
  id BIGINT,
  name VARCHAR,
  age INT
) WITH (
  type = 'console'
);
-- Write one row to the log
INSERT INTO console_sink
SELECT 1, 'a', 2;