Debezium 2.0.0.Final Release: New Features, Connector Enhancements, and Improvements

Debezium 2.0.0.Final introduces major enhancements such as Java 11 migration, improved incremental snapshot controls, multi‑partition support, new storage modules, pluggable topic naming, expanded connector capabilities for Cassandra, MongoDB, MySQL, Oracle, PostgreSQL and Vitess, plus ARM64 container images and community updates.

Big Data Technology Architecture

Oct 18, 2022

We are pleased to announce the official release of Debezium 2.0.0.Final. Since the first 1.0 release in December 2019, the community has built a comprehensive low‑latency change data capture (CDC) platform, adding a stable Oracle connector, a community‑led Vitess connector, incremental snapshots, multi‑partition support, and more.

Debezium Core Module Changes

Java 11 Dependency

Debezium now requires Java 11 at runtime, allowing the use of new language features and performance improvements. Users must ensure Java 11 is available before upgrading.

Improved Incremental Snapshot

Stop‑snapshot signal

A new stop-snapshot signal can halt an ongoing incremental snapshot. It is sent by inserting a row into the signal table:

INSERT INTO schema.signal_table (id, type, data)
VALUES ('unique-id', 'stop-snapshot', '<signal payload>');

Example payload:

{
  "data-collections": ["schema1.table1", "schema2.table2"],
  "type": "incremental"
}

If data-collections is omitted, the signal stops the entire snapshot:

{
  "type": "incremental"
}
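As a minimal sketch of sending this signal from application code, the Python snippet below builds a parameterized INSERT for the signal table. The helper name build_signal_insert and the default table name schema.signal_table are illustrative assumptions; the column names follow the INSERT shown above. The `?` placeholders follow the qmark style used by some Python drivers, and other drivers expect `%s` instead.

```python
import json
import uuid


def build_signal_insert(signal_type, payload, table="schema.signal_table"):
    """Return (sql, params) for inserting one signal row.

    `table` is a hypothetical name taken from the INSERT above. It is
    interpolated into the statement, so it must come from trusted config.
    """
    sql = f"INSERT INTO {table} (id, type, data) VALUES (?, ?, ?)"
    # A random UUID serves as the unique signal id; the payload is stored as JSON text.
    params = (str(uuid.uuid4()), signal_type, json.dumps(payload))
    return sql, params


# Stop the incremental snapshot for two tables only.
sql, params = build_signal_insert(
    "stop-snapshot",
    {"data-collections": ["schema1.table1", "schema2.table2"], "type": "incremental"},
)
```

Passing a payload without data-collections would produce the "stop everything" form of the signal.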

Two additional signals, pause-snapshot and resume-snapshot, pause and resume an incremental snapshot. They can be sent through the signal table or, for MySQL, through the Kafka signal topic.

Regular expressions are now supported in the data-collections field, e.g.:

{
  "data-collections": ["schema[1|2].table[1|2]"],
  "type": "incremental"
}
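It can help to check what such a pattern actually matches before sending it. The sketch below tests the pattern from the example against a few hypothetical table names. It uses Python's re module rather than Java's regex engine, which behaves the same way for this pattern, and it assumes the expression must match the whole identifier.

```python
import re

# Pattern copied from the example payload above.
pattern = re.compile(r"schema[1|2].table[1|2]")

# Hypothetical identifiers used only for illustration.
candidates = [
    "schema1.table1",
    "schema2.table2",
    "schema3.table1",
    "schema1xtable2",  # the unescaped "." matches any character
    "schema|.table|",  # "|" inside [...] is a literal, not alternation
]

matched = [name for name in candidates if pattern.fullmatch(name)]

# A stricter variant: plain character classes and an escaped dot.
strict = re.compile(r"schema[12]\.table[12]")
strict_matched = [name for name in candidates if strict.fullmatch(name)]
```

Here matched contains every candidate except schema3.table1, while strict_matched contains only the first two names. If only schema1/schema2 and table1/table2 are intended, the stricter form is the safer choice.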

A new additional-condition attribute lets users filter rows with an SQL predicate. Example limiting a snapshot to product_id = 12:

{
  "type": "execute-snapshot",
  "data": {
    "data-collections": ["inventory.products"],
    "type": "INCREMENTAL",
    "additional-condition": "product_id=12"
  }
}
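To illustrate the effect of the predicate, the sketch below applies the same condition to a small in-memory SQLite table. This is a conceptual illustration only: it assumes the condition ends up in the WHERE clause of the snapshot query, the table layout and rows are made up, and Debezium's real chunk queries contain more than this single predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, product_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO products (id, product_id, name) VALUES (?, ?, ?)",
    [(1, 12, "widget"), (2, 13, "gadget"), (3, 12, "sprocket")],
)

additional_condition = "product_id=12"  # copied from the signal payload above

# The predicate is SQL text supplied by the operator, so it is concatenated
# rather than bound as a parameter; it must come from a trusted source.
query = f"SELECT id, name FROM products WHERE {additional_condition} ORDER BY id"
rows = conn.execute(query).fetchall()
conn.close()
```

Only the two rows with product_id 12 are returned, which is the subset such a snapshot would cover.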

Debezium now automatically adds signal tables/collections to the connector’s table.include.list, removing the need for manual configuration.

Transaction Metadata Changes

BEGIN and END events now include a new ts_ms field with the database timestamp. Example:

{
  "status": "END",
  "id": "12345",
  "event_count": 2,
  "ts_ms": "1657033173441",
  "data_collections": [
    {"data_collection": "s1.a", "event_count": 1},
    {"data_collection": "s2.a", "event_count": 1}
  ]
}

Enable transaction metadata events by setting provide.transaction.metadata=true, and configure the transaction topic as needed.
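On the consumer side, the new field can be read like any other epoch‑millisecond timestamp. The following sketch parses the END event shown above, converts ts_ms to a UTC datetime, and checks that the per‑collection counts add up to the transaction total. The function names are hypothetical, and the code accepts ts_ms as either a number or a numeric string, since the example shows it quoted.

```python
import json
from datetime import datetime, timedelta, timezone

end_event = json.loads("""
{
  "status": "END",
  "id": "12345",
  "event_count": 2,
  "ts_ms": "1657033173441",
  "data_collections": [
    {"data_collection": "s1.a", "event_count": 1},
    {"data_collection": "s2.a", "event_count": 1}
  ]
}
""")


def commit_time(event):
    """Convert ts_ms (epoch milliseconds, int or numeric string) to UTC."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=int(event["ts_ms"]))


def counts_are_consistent(event):
    """Check that per-collection counts add up to the transaction total."""
    per_collection = sum(c["event_count"] for c in event["data_collections"])
    return per_collection == event["event_count"]


when = commit_time(end_event)
consistent = counts_are_consistent(end_event)
```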

Multi‑Partition Mode

In the SQL Server connector, the former database.dbname option is replaced by database.names, a comma‑separated list of database names, so a single connector can capture changes from multiple databases. The databases are spread across the connector’s tasks, up to the configured tasks.max.

JMX metric names now include a task tag, e.g.:

debezium.sql_server:type=connector-metrics,server=<sqlserver.server.name>,task=<task.id>,context=<context>

New Storage Modules

Debezium‑storage modules are introduced for file‑based and Kafka‑based schema history and offset storage, paving the way for future support of Amazon S3, Redis, JDBC, etc.

Pluggable Topic Naming Strategy

A new TopicNamingStrategy allows full customization of topic names. Users can specify a custom class via topic.naming.strategy (e.g., org.myorganization.MyCustomTopicNamingStrategy).

Unique Index Handling

Indexes that rely on hidden columns (e.g., PostgreSQL CTID, Oracle ROWID) or functions are no longer eligible as primary‑key substitutes.

New Configuration Namespace

Many connector properties have been renamed: the database.history prefix becomes schema.history.internal, pass‑through JDBC driver options now use the driver. prefix, database.server.name becomes topic.prefix, and MongoDB’s mongodb.name is likewise replaced by topic.prefix.
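When upgrading an existing connector configuration, the renames can be applied mechanically. The sketch below is a hypothetical helper, not a Debezium tool, and it covers only the renames listed above. Driver pass‑through options are left alone, because which database.* keys are driver‑specific depends on the connector. The sample values are placeholders.

```python
def migrate_properties(old):
    """Rename a subset of Debezium 1.x property keys to their 2.0 names."""
    new = {}
    for key, value in old.items():
        if key in ("database.server.name", "mongodb.name"):
            key = "topic.prefix"
        elif key == "database.history" or key.startswith("database.history."):
            # Keep whatever follows the old prefix, e.g. ".kafka.topic".
            key = "schema.history.internal" + key[len("database.history"):]
        new[key] = value
    return new


# Sample 1.x-style configuration with placeholder values.
old_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.server.name": "dbserver1",
    "database.history.kafka.topic": "schema-changes.inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
}
new_config = migrate_properties(old_config)
```

Keys that are not part of the rename list, such as connector.class, pass through unchanged.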

All Schemas Named and Versioned

Every schema definition now has an explicit name and version, improving compatibility with Schema Registry.

Default Skip Truncate Events

If a connector supports truncate events, they are now skipped by default unless skipped.operations=none is set.
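The effect of this default can be sketched as a simple filter on the event's op code. The helper names below are hypothetical, and the snippet assumes the usual Debezium op codes (c for create, u for update, d for delete, t for truncate) and a comma‑separated skipped.operations value.

```python
def parse_skipped_operations(value="t"):
    """Parse a skipped.operations value into a set of op codes.

    The default "t" reflects the 2.0 behavior described above;
    "none" means nothing is skipped.
    """
    value = value.strip()
    if value == "none":
        return set()
    return {op.strip() for op in value.split(",") if op.strip()}


def should_emit(op, skipped_operations="t"):
    """Return True if an event with this op code would be emitted."""
    return op not in parse_skipped_operations(skipped_operations)


truncate_by_default = should_emit("t")
truncate_with_none = should_emit("t", "none")
```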

Schema Name Adjustment Behavior

The schema.name.adjustment.mode property now defaults to none (previously avro), controlling how non‑Avro‑compatible characters are handled.
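As a rough illustration of what Avro‑compatible adjustment means, the sketch below sanitizes a single name component according to the Avro naming rule (first character in [A-Za-z_], remaining characters in [A-Za-z0-9_]). It is not Debezium's implementation: the function name is hypothetical, dotted full names are not handled, and Debezium may treat an invalid leading character differently from the prefixing used here.

```python
import re


def adjust_for_avro(name):
    """Replace characters that are not legal in an Avro name with "_"."""
    adjusted = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Avro names may not start with a digit (or be empty), so prefix "_".
    if not re.match(r"[A-Za-z_]", adjusted):
        adjusted = "_" + adjusted
    return adjusted


with_dash = adjust_for_avro("order-items")
leading_digit = adjust_for_avro("1st_table")
```

With the new default of none, names such as order-items are passed through unchanged, so users who serialize with Avro need to set the mode explicitly.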

Cassandra Connector Changes

Cassandra 4 Incremental Commit Log Support

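Debezium now uses Cassandra 4’s CDC index file to read commit log segments as they are written, rather than waiting for a segment to be complete, which removes much of the inherent delay in processing CDC events.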

MongoDB Connector Changes

Removal of Oplog Implementation

The older oplog‑based implementation has been removed, and MongoDB 3.x is no longer supported. Change streams are now the only capture mechanism.

Before‑State Support (MongoDB 6.0+)

New capture modes change_streams_with_pre_image and change_streams_update_full_with_pre_image include the document state before a change.

MySQL Connector Changes

Removal of Legacy MySQL Implementation

The legacy connector implementation has been removed; only the newer implementation, built on Debezium’s common connector framework, remains.

Binlog Compression Support

Debezium can now read ZSTD‑compressed binlog transaction payloads, which MySQL writes when the server variable binlog_transaction_compression is set to ON.

Oracle Connector Changes

Source Info Enhancements

The source.scn field now correctly reflects the originating LogMiner or XStream SCN. New fields rs_id, ssn, and redo_thread provide richer RAC context.

Offset Structure Update

Offsets now store a comma‑separated list of scn:rollback-segment-id:ssn:redo-thread tuples, e.g.:

{
  "scn": "1234567890:00124.234567890.1234:0:1,1234567891:42100.0987656432.4321:0:2",
  "commit_scn": "2345678901",
  "lcr_position": null,
  "txId": null
}
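The composite scn value can be split back into its parts with straightforward string handling. The sketch below assumes the scn:rollback-segment-id:ssn:redo-thread layout described above; the ScnEntry type and parse_scn_offset function are hypothetical names, and the rollback segment id is kept as an opaque string.

```python
from typing import NamedTuple


class ScnEntry(NamedTuple):
    scn: int
    rs_id: str
    ssn: int
    redo_thread: int


def parse_scn_offset(value):
    """Split a comma-separated list of scn:rs_id:ssn:redo_thread tuples."""
    entries = []
    for item in value.split(","):
        scn, rs_id, ssn, redo_thread = item.split(":")
        entries.append(ScnEntry(int(scn), rs_id, int(ssn), int(redo_thread)))
    return entries


entries = parse_scn_offset(
    "1234567890:00124.234567890.1234:0:1,1234567891:42100.0987656432.4321:0:2"
)
```

Each entry corresponds to one redo thread, so a two‑node RAC setup yields two entries as in the example.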

The Oracle source info block also gains a user_name field that records the database user who performed the change.

PostgreSQL Connector Changes

Removal of wal2json Support

Support for the wal2json decoder has been dropped; users must switch to pgoutput (built into PostgreSQL 10 and later) or decoderbufs.

Vitess Connector Changes

Multi‑Task Support

The connector now automatically discovers shards and distributes them across multiple tasks, enabling a single deployment to handle many shards.

Debezium Container Image Changes

ARM64 Support

Official ARM64 container images are now published alongside the traditional amd64 images, which removes the overhead of emulating amd64 images on ARM‑based platforms.

Debezium Community Spaces

New Zulip spaces are being added for database‑specific discussions, complementing the existing #users channel.

Other Fixes and Improvements

Debezium 2.0 contains 463 fixes and enhancements contributed by a large community of developers.

Future Roadmap

Work on Debezium 2.1 is underway, with planned features such as MySQL truncate event support, PostgreSQL 15 support, and JDBC history/offset storage.

