Replicate
The REPLICATE statement must be the first statement, other than comments, in the configuration script and may occur only once. It indicates the nature and location of the source and the target of the replication. It also indicates the desired level of parallel processing.
The following combinations of Source and Target types are currently supported:
Source | Target
---|---
DB2 z/OS | Kafka, JSON format
DB2 z/OS | Kafka, AVRO Confluent
DB2 z/OS | HDFS, JSON format
DB2 z/OS | HDFS, AVRO Container
DB2 z/OS | File, JSON format
DB2 z/OS | File, AVRO Container
IMS z/OS | Kafka, CDC Raw Format, interim target with Apply Engine Consumer
Syntax
REPLICATE <source_nature> <source_url>
TO <target_nature> <target_url>
[WITH <number> WORKER[S]];
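As an illustration, a minimal statement replicating a Db2 z/OS source to Kafka in JSON format might look like the following sketch. The host, port, agent, engine, and topic names are hypothetical placeholders, not values from this documentation:

```
REPLICATE DB2 cdc://zos1:2626/DB2CDC/DB2TOKAF
TO JSON kafka://kafkabroker:9092/CDC_EMPLOYEE/key
WITH 1 WORKER;
```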
Keyword and Parameter Descriptions
<source_nature> Only DB2 and IMS sources are currently supported.
<source_url> Typical source URL: cdc|cdcs://[<host_name_or_address>[:<port_number>]]/<agent_name>/<engine_name>
[<host_name_or_address>[:<port_number>]] Host name and port number of the SQDAEMON servicing the source Capture/Publisher.
<agent_name> Specifies the alias name assigned to the Capture/Publisher agent in the sqdagents.cfg file of the SQDAEMON.
<engine_name> Specifies the name of the Engine subscribing to the Capture/Publisher, as specified in the capture/publisher configuration .CAB file (see the applicable source Capture Reference). Upper-case Engine names are recommended for z/OS compatibility.
cdc|cdcs:// Indicates a Change Data Capture source, where "cdcs" specifies that a secure TLS connection will be made to z/OS (ONLY) Daemon, Capture, and Publisher components already configured to run under IBM's Application Transparent Transport Layer Security (AT-TLS).
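For example, the plain and secure forms of the same source URL might look as follows. The host, port, agent, and engine names are hypothetical, and the comments assume the script's "--" comment convention:

```
-- Plain TCP connection to the SQDAEMON
REPLICATE DB2 cdc://zos1:2626/DB2CDC/DB2TOKAF ...

-- TLS connection via AT-TLS (z/OS components only)
REPLICATE DB2 cdcs://zos1:2626/DB2CDC/DB2TOKAF ...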
<target_nature> JSON, AVRO CONFLUENT, AVRO CONTAINER or CDC MESSAGE
JSON Each message written contains both JSON schema and data payload
AVRO CONFLUENT Requires OPTIONS CONFLUENT REPOSITORY <registry_url> for registration of AVRO schemas independent of the message data payload.
AVRO CONTAINER May only be used with file:// or hdfs:// "type" target URLs; requires OPTIONS CONFLUENT REPOSITORY <registry_url> for registration of AVRO schemas independent of the HDFS record data payload.
CDC MESSAGE Can only be used with kafka:// "type" target URLs, and only for Apply Engine Kafka Consumers.
<target_url> Target "type" URLs:
kafka://[<hostname>[:<port_number>]]/[<topic_name>|<prefix>_*_<suffix>|*]/[<partition>|key]
Note: CDC MESSAGE "nature" targets may NOT specify a partition or key.
hdfs://<hostname>[:<port_number>]/<hdfs_file_name>
file://[<relative_path> | <full_path>]<file_name>
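Concrete target URLs under these forms might look like the following sketch. All hosts, topics, and paths are hypothetical placeholders:

```
-- Kafka topic, partitioned by the source primary key
kafka://kafkabroker:9092/CDC_EMPLOYEE/key

-- HDFS file target
hdfs://namenode:8020/data/cdc/employee.avro

-- Local file target, full path
file:///var/sqdata/output/employee.json
```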
WITH <number> WORKER[S] Kafka ONLY. Number of Replicator worker threads. Precisely recommends using only one (1) worker until the Replicator report indicates that the workload would benefit from more; add workers only when the statistics produced by the Replicator show the existing worker busy 100 percent of the time. If adding a worker results in longer wait times, the work is merely being spread unnecessarily among the workers.
In practice, replication to JSON or AVRO is unlikely to benefit from more than one worker because a Kafka target can usually absorb the workload; extra workers do not improve bandwidth, they only increase overall CPU consumption. With a single worker, the Replicator also does not need to keep track of the UOW content, whereas multiple workers require an additional thread to manage scheduling, adding still more CPU consumption. Slow target datastores, such as DB2, may benefit from additional workers.
Notes:
1. The Kafka URL can be simplified by specifying the brokers in the sqdata_kafka_producer.conf file. When AVRO is used with the Confluent Schema Registry, the topic_id and partition will be automatically retrieved from the Schema Registry for each source table reference. The "key" option specifies that the primary key(s) defined in the source Db2 catalog, concatenated if necessary, will be used for topic partitioning. This ensures that all rows captured with the same key will be processed in the same partition.
2. Kafka with Confluent-managed AVRO schemas requires that the OPTIONS statement specify the Confluent Schema Registry's URL.
3. HDFS or file targets with AVRO Container require all source records to be sent to a single file, because the container will contain all possible schemas.
4. The WITH <number> WORKERS clause applies only to Kafka targets and is silently ignored for all others.
5. The CDC MESSAGE "nature" target supports only Connect CDC SQData Apply Engine Kafka Consumers.
6. The addition of a new table to the source Capture/Publisher configuration will, by default, automatically initiate its replication. See OPTIONS and MAPPINGS for more information regarding default topics, and ignoring or terminating replication under this condition.
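Combining these notes, a Confluent AVRO replication script might pair the REPLICATE statement with an OPTIONS statement along the following lines. This is a sketch: the host, agent, engine, and registry URL are hypothetical, brokers are assumed to be supplied via sqdata_kafka_producer.conf, and the "*" topic relies on automatic retrieval from the Schema Registry as described in note 1:

```
REPLICATE DB2 cdc://zos1:2626/DB2CDC/DB2TOAVR
TO AVRO CONFLUENT kafka://*
WITH 1 WORKER;
OPTIONS CONFLUENT REPOSITORY http://schema-registry.example.com:8081;
```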