Consume Kafka topics using Azure Databricks (Spark), Confluent Cloud (Kafka) running on Azure, Schema Registry and AVRO format

Caio Moreno
1 min read · Oct 8, 2020


This post provides sample code (Python) for consuming Kafka topics using Azure Databricks (Spark) with Confluent Cloud (Kafka) running on Azure, Schema Registry, and the Avro format.

Reading the topic:

[Screenshot: Kafka topic data]
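The read from Confluent Cloud can be sketched as follows. This is a minimal outline, not the exact notebook from the linked repo: the bootstrap server, API key/secret, and topic name are placeholders, and the option keys follow the standard Spark Kafka source with Confluent Cloud's SASL_SSL/PLAIN authentication (note that on Databricks the Kafka security classes are shaded, hence the `kafkashaded.` prefix).

```python
def confluent_kafka_options(bootstrap_servers, api_key, api_secret, topic):
    """Spark Kafka source options for Confluent Cloud (SASL_SSL + PLAIN auth)."""
    # On Databricks the Kafka client classes are shaded, so the JAAS login
    # module must be referenced under the "kafkashaded." package prefix.
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="{api_key}" password="{api_secret}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": topic,
        "startingOffsets": "earliest",
    }


def read_topic(spark, options):
    """Return a streaming DataFrame of raw Kafka records (key/value are binary)."""
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()
```

Called as `read_topic(spark, confluent_kafka_options(...))` on a Databricks cluster, this yields the raw stream shown above, with the Avro-encoded message in the binary `value` column.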

Streamed data, formatted and stored in a Spark SQL table (view):

[Screenshot: topic curated data]
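Getting from raw bytes to the curated view hinges on Confluent's wire format: each serialized record starts with a magic byte (0x00) and a big-endian 4-byte schema ID, with the Avro body following from byte 6. The sketch below strips that header and decodes the value with Spark's `from_avro`; the helper names and the `avro_schema_json` argument (the writer schema fetched from Schema Registry as a JSON string) are illustrative assumptions, not the repo's exact code.

```python
import struct


def split_confluent_payload(raw: bytes):
    """Split a Confluent wire-format message into (schema_id, avro_bytes).

    Confluent-serialized records begin with a magic byte (0x00) followed by
    a big-endian 4-byte schema ID; the Avro body starts at byte offset 5.
    """
    if not raw or raw[0] != 0:
        raise ValueError("not a Confluent wire-format message")
    schema_id = struct.unpack(">I", raw[1:5])[0]
    return schema_id, raw[5:]


def curate_topic(kafka_df, avro_schema_json, view_name="topic_curated"):
    """Decode the Avro value column and expose it as a Spark SQL temp view.

    Assumes a Databricks/Spark cluster where `from_avro` is available;
    `avro_schema_json` is the writer schema retrieved from Schema Registry.
    """
    from pyspark.sql.avro.functions import from_avro
    from pyspark.sql.functions import col, expr

    decoded = (
        kafka_df
        # Drop the 5-byte Confluent header (SQL substring is 1-based).
        .withColumn("avro_value", expr("substring(value, 6, length(value) - 5)"))
        .select(from_avro(col("avro_value"), avro_schema_json).alias("record"))
        .select("record.*")
    )
    decoded.createOrReplaceTempView(view_name)
    return decoded
```

With the view registered, the curated stream can be queried directly with `spark.sql("SELECT * FROM topic_curated")`, which is what the result above shows.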

Source code:
https://github.com/caiomsouza/microsoft-big-data-scientist-and-ai/blob/master/samples/azure-ccloud-databricks/SampleCodeConfluentCloudKafkaAvroPython_CM_08102020_WithoutCred.py

Docs:
https://azure.microsoft.com/en-us/services/databricks/
https://docs.databricks.com/spark/latest/structured-streaming/avro-dataframe.html#example-with-schema-registry
https://docs.confluent.io/current/cloud/cloud-start.html#cloud-start

Special thanks and credits to Angela Chu, Gianluca Natali, Henning Kropp, Yatharth Gupta, Bhanu Prakash, Awez Syed, Nick Hill, Robin Davidson, Liping Huang, Chris Munyasya, Sid Rabindran, and many more people from the Databricks, Confluent, and Microsoft teams who helped make this integration work.


Written by Caio Moreno

Solutions Architect @databricks | Professor | PhD | Ex-Microsoft | Ex-Avanade/Accenture | Ex-Pentaho/Hitachi | Ex-AOL | Ex-IT4biz CEO. (Opinions are my own)
