Protobuf Schema
Create A Schema
Proton can read and write messages in Protobuf format. To do so, first register the Protobuf definition as a format schema. For example:
CREATE OR REPLACE FORMAT SCHEMA schema_name AS '
syntax = "proto3";
message SearchRequest {
string query = 1;
int32 page_number = 2;
int32 results_per_page = 3;
}
' TYPE Protobuf
Then refer to this schema while creating an external stream for Confluent Cloud or Apache Kafka:
CREATE EXTERNAL STREAM stream_name(
query string,
page_number int32,
results_per_page int32)
SETTINGS type='kafka',
brokers='pkc-1234.us-west-2.aws.confluent.cloud:9092',
topic='topic_name',
security_protocol='SASL_SSL',
username='..',
password='..',
data_format='ProtobufSingle',
format_schema='schema_name:SearchRequest'
Please note:
- If you want to ensure there is only a single Protobuf message per Kafka message, set data_format to ProtobufSingle. If you set it to Protobuf, there could be multiple Protobuf messages in a single Kafka message.
- The format_schema setting contains two parts: the registered schema name (in this example: schema_name) and the message type (in this example: SearchRequest). Join them with a colon, as in schema_name:SearchRequest.
- You can use this external stream to read or write Protobuf messages in the target Kafka/Confluent topics.
- For more advanced use cases, please check the examples for complex schema.
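As a rough sketch of the read and write paths mentioned in the notes above, assuming the stream_name external stream from this section has been created. The inserted values are made up for illustration.

```sql
-- Read: query the external stream; each Kafka message is decoded
-- using the SearchRequest message type from schema_name.
SELECT query, page_number, results_per_page
FROM stream_name;

-- Write: insert a row, which should be encoded as a SearchRequest
-- Protobuf message and sent to the topic. Values are placeholders.
INSERT INTO stream_name (query, page_number, results_per_page)
VALUES ('proton', 1, 10);
```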
List Schemas
List schemas in the current Proton deployment:
SHOW FORMAT SCHEMAS
Show Details For A Schema
SHOW CREATE FORMAT SCHEMA schema_name
Drop A Schema
DROP FORMAT SCHEMA <IF EXISTS> schema_name;
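For instance, a concrete form of the statement above might look like the following. It reads the angle brackets as marking IF EXISTS as optional, which is an assumption based on the notation.

```sql
-- Drop the schema registered earlier in this page.
-- With IF EXISTS, the statement is expected not to fail when the schema is absent.
DROP FORMAT SCHEMA IF EXISTS schema_name;
```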
Examples For Complex Schema
Nested Schema
CREATE FORMAT SCHEMA simple_nested AS '
syntax = "proto3";
message Name {
string first = 1;
string last = 2;
}
message Person {
string email = 1;
Name name = 2;
int32 age = 3;
map<string, int32> skills = 4;
}
' TYPE Protobuf
CREATE EXTERNAL STREAM people(
email string,
name_first string,
name.last string,
skills map(string,int32),
age int32
)
SETTINGS type='kafka'.. data_format='ProtobufSingle',
format_schema='simple_nested:Person'
Please note:
- Person is the top-level message type. It refers to the Name message type.
- Use name as the prefix for the column names. Use either _ or . to connect the prefix with the nested field names.
- When you create an external stream to read the Protobuf messages, you don't have to define all possible columns. Only the columns you define are read. Other columns/fields are skipped.
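To illustrate the point about partial column lists, here is a minimal sketch of a second external stream over the same Person messages that declares only two columns. The stream name people_contacts is hypothetical, and the broker and topic settings are left elided as elsewhere on this page.

```sql
-- Hypothetical stream that reads only email and the nested first name.
-- The remaining Person fields (name.last, age, skills) are skipped.
CREATE EXTERNAL STREAM people_contacts(
  email string,
  name_first string
)
SETTINGS type='kafka',
  brokers='..',
  topic='..',
  data_format='ProtobufSingle',
  format_schema='simple_nested:Person'
```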
Enum
Say in your Protobuf definition, there is an enum type:
enum Level {
LevelOne = 0;
LevelTwo = 1;
}
You can use the enum type in Proton, e.g.
CREATE EXTERNAL STREAM ..(
..
level enum8('LevelOne'=0,'LevelTwo'=1),
..
)
Repeat
Say in your Protobuf definition, there is a repeated field:
repeated string Status
You can use the array type in Proton, e.g.
CREATE EXTERNAL STREAM ..(
..
status array(string),
..
)
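Putting the enum and repeated mappings together, a complete schema and stream could look like the sketch below. The names task_schema, Task and tasks are hypothetical and chosen only for this example. The broker and topic settings are elided.

```sql
-- Hypothetical schema with one enum field and one repeated field.
CREATE FORMAT SCHEMA task_schema AS '
syntax = "proto3";
enum Level {
  LevelOne = 0;
  LevelTwo = 1;
}
message Task {
  Level level = 1;
  repeated string status = 2;
}
' TYPE Protobuf;

-- The enum maps to enum8 with matching names and values;
-- the repeated string maps to array(string).
CREATE EXTERNAL STREAM tasks(
  level enum8('LevelOne'=0,'LevelTwo'=1),
  status array(string)
)
SETTINGS type='kafka',
  brokers='..',
  topic='..',
  data_format='ProtobufSingle',
  format_schema='task_schema:Task'
```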
Package
Say in your Protobuf definition, there is a package:
package demo;
message StockRecord {
..
}
If the Protobuf definition contains only one package, you don't have to include the package name in format_schema. For example:
CREATE EXTERNAL STREAM ..(
..
)
SETTINGS .. format_schema='schema_name:StockRecord'
If there are multiple packages, you can use the fully qualified name with package, e.g.
CREATE EXTERNAL STREAM ..(
..
)
SETTINGS .. format_schema='schema_name:demo.StockRecord'
Import Schemas
If you have used CREATE FORMAT SCHEMA to register a format schema, say schema_name, you can create another schema that imports it:
CREATE FORMAT SCHEMA import_example AS '
import "schema_name.proto";
message Test{
required string ID = 1;
optional Level TheLevel = 2;
}
' TYPE Protobuf
Please make sure to add .proto as the suffix to the schema name in the import statement.
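As an end-to-end sketch of the import mechanism, the following registers a schema that defines the Level enum and then a second schema that imports it. The names level_def and import_level_example are hypothetical. Both definitions use proto3 here for simplicity, which is a choice made for this example.

```sql
-- Hypothetical schema that defines the shared Level enum.
CREATE FORMAT SCHEMA level_def AS '
syntax = "proto3";
enum Level {
  LevelOne = 0;
  LevelTwo = 1;
}
' TYPE Protobuf;

-- The import refers to the registered schema name plus the .proto suffix.
CREATE FORMAT SCHEMA import_level_example AS '
syntax = "proto3";
import "level_def.proto";
message Test {
  string ID = 1;
  Level TheLevel = 2;
}
' TYPE Protobuf;
```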