osodevops/terraform-azure-confluent-platform

Confluent Platform on Azure

Terraform Module for deploying best practice HA Confluent Platform on Azure


Usage

Overview

This module deploys the entire Confluent suite on Azure with three simple commands. It uses Terraform to build the Azure infrastructure, which includes a container group running the Docker image osodevops/cp-ansible; that container provisions the Confluent virtual machines. This solution is not intended as a hardened production environment. Rather, it provides a way to get running with Confluent on Azure QUICKLY.

The code here consists of Terraform modules together with a set of Ansible roles, provided by Confluent, that install and configure Confluent Platform.

Diagram

[Image: solution diagram]

Examples

Getting Started

Requirements

  • Terraform (see the official Terraform installation instructions)
  • Azure CLI (see the official Azure CLI installation instructions)
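
Before starting, it can help to confirm that both tools are on your PATH. The following is a minimal sketch: missing_tools is a hypothetical helper name, not something shipped in this repository, and it assumes the binaries are named terraform and az.

```shell
#!/bin/sh
# Sketch: print the name of every requested CLI tool that is not on PATH.
# missing_tools is a hypothetical helper, not part of this repository.
missing_tools() {
  for tool in "$@"; do
    # command -v exits non-zero when the tool cannot be found
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running missing_tools terraform az prints nothing when both tools are installed; otherwise it prints one missing tool per line.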

Pre-Deployment Tasks

Generate SSH keys for virtual machines
  • From the root of the project, run ./ssh-generation.sh. This populates keys throughout the code base, which are used for remote access to the Confluent servers.
  • Keep hold of the newly created ./modules/resource_group/oso-confluent.ssh key; this is the key you will use to SSH onto the VMs.
Create storage account for Terraform state
  • Sign in with Azure CLI (az login)
  • Execute ./state_generation.sh to create a standalone resource group and storage account to be used for Terraform state. If you change any of the values in this script, you will need to update the backend.tf files accordingly.

Terraform Deployment

Terraform provisions all required Azure resources. The deployment is split into two parts:

Shared:

  - [x] Private Virtual network.
  - [x] Private and public subnets.
  - [x] NAT gateway.
  - [x] Private DNS zone.
  - [x] Bastion Server with public IP.
  - [x] Container service for cp-ansible provisioning.

Confluent:

  - [x] Zookeeper VM with network interface and data disk.
  - [x] Broker VM with network interface and data disk.
  - [x] Schema Registry VM with network interface and data disk.
  - [x] Kafka Connect VM with network interface and data disk.
  - [x] KSQL VM with network interface and data disk.
  - [x] Rest Proxy VM with network interface and data disk.
  - [x] Confluent Control Center VM with network interface and data disk.
  - [x] Public IPs for Control Center and Rest Proxy.

Shared Resource Deployments

To deploy from your local machine, navigate to ./examples/production/shared and run terraform init -backend-config=backend.hcl && terraform plan. If you are happy with the output, run terraform apply.

Confluent deployment

After the shared resources have deployed successfully, you can deploy the Confluent VMs. To do so, navigate to ./examples/production/confluent-platform and run terraform init -backend-config=backend.hcl && terraform plan. If you are happy with the output, run terraform apply.
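
The two deployment stages described above can be wrapped in a small helper. This is a sketch only: deploy_stack is a hypothetical function name, and it assumes each example folder contains the backend.hcl file referenced above. terraform apply is run without -auto-approve, so it still asks for confirmation before changing anything.

```shell
#!/bin/sh
# Sketch: init, plan and apply a single stack directory.
# deploy_stack is a hypothetical helper, not part of this repository.
# The body runs in a subshell so the caller's working directory is unchanged.
deploy_stack() (
  cd "$1" || exit 1
  # stop at the first failing step; apply prompts for confirmation
  terraform init -backend-config=backend.hcl &&
    terraform plan &&
    terraform apply
)
```

For example, run deploy_stack ./examples/production/shared first, then deploy_stack ./examples/production/confluent-platform once the shared stack has been applied.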

Platform configuration

All properties/configurations/hostnames for the cluster are stored in the file ./modules/resource_group/ansible-inventory.yml. To activate changes made to that file, perform the following operations:

  • Change ./modules/resource_group/ansible-inventory.yml as desired
  • Deploy inventory into the Azure storage account by navigating to ./production/resource_group, and running terraform apply

Ansible Deployment

Command Line

The Terraform deployment creates an Azure container group in a private subnet, which can provision the newly created VMs with cp-ansible. If you have made any alterations (prefix, environment, additional instances, etc.), you will need to update ./resource_group/ansible-inventory.yml to reflect them. The inventory currently assumes a single instance of each component. If, for example, you want 3 zookeeper instances, you need to add zookeeper-2.confluent.internal: and zookeeper-3.confluent.internal: to this file; otherwise cp-ansible will not attempt to provision those VMs.
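
As a rough aid when scaling a component, the host keys to add can be generated rather than typed by hand. This sketch assumes hosts follow the pattern <component>-<n>.confluent.internal, numbered from 1, which is inferred from the zookeeper-2 and zookeeper-3 example above; inventory_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the inventory host keys for N instances of one component.
# Assumes the naming pattern <component>-<n>.confluent.internal, starting at 1.
inventory_hosts() {
  component="$1"
  count="$2"
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s-%d.confluent.internal:\n' "$component" "$i"
    i=$((i + 1))
  done
}
```

inventory_hosts zookeeper 3 prints the three zookeeper host keys, one per line. They still need to be placed at the correct indentation under the matching group in the inventory file.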

To run this container:

  $ ./run-ansible.sh

This process should take approximately 25 mins to complete. When it finishes, it prints a timing summary similar to:

  Saturday 13 August 2022  07:39:24 +0000 (0:00:00.609)       0:14:48.993 *******
  ===============================================================================
  confluent.platform.control_center : Install the Control Center Packages - 131.84s
  confluent.platform.kafka_rest : Install the Kafka Rest Packages ------- 131.42s
  confluent.platform.schema_registry : Install the Schema Registry Packages -- 84.57s
  confluent.platform.kafka_broker : Install the Kafka Broker Packages ---- 79.60s
  confluent.platform.common : Install Java ------------------------------- 71.13s
  confluent.platform.zookeeper : Install the Zookeeper Packages ---------- 32.61s
  confluent.platform.control_center : Startup Delay ---------------------- 30.50s
  confluent.platform.kafka_broker : Startup Delay ------------------------ 20.42s
  confluent.platform.schema_registry : Startup Delay --------------------- 15.41s
  confluent.platform.kafka_rest : Startup Delay -------------------------- 15.41s
  confluent.platform.common : Add Max Size Properties --------------------- 8.08s
  confluent.platform.kafka_broker : Create Kafka Broker Config ------------ 6.91s
  confluent.platform.zookeeper : Startup Delay ---------------------------- 5.93s
  confluent.platform.kafka_broker : Set Permissions on Data Dir files ----- 5.30s
  confluent.platform.kafka_broker : Set Permissions on Data Dirs ---------- 5.21s
  confluent.platform.common : Gather OS Facts ----------------------------- 5.21s
  confluent.platform.control_center : Create Control Center Config -------- 4.41s
  confluent.platform.common : yum-clean-all ------------------------------- 4.32s
  confluent.platform.kafka_rest : Create Kafka Rest Config ---------------- 4.32s
  confluent.platform.common : Add Confluent Repo file --------------------- 4.17s

Azure Console (alternative deployment method)

Alternatively, via the Azure Console, find the container group named oso-devops-cp-ansible and click the 'Start' button.

[Image: container]

Testing the deployment

Once the Ansible installer has completed, you can test the deployment in the following ways:

Control Center

Access Control Center using its public IP. Print the IP using terraform output -raw control_center_ip, then open your browser and go to http://<<control_center_ip>>:9021. NOTE: If you have enabled SSL, use https.
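
The URL can also be assembled on the command line. A minimal sketch, where control_center_url is a hypothetical helper name and the optional second argument selects https for SSL-enabled deployments:

```shell
#!/bin/sh
# Sketch: build the Control Center URL from an IP address.
# $1: IP address, $2: optional scheme (defaults to http)
control_center_url() {
  scheme="${2:-http}"
  printf '%s://%s:9021\n' "$scheme" "$1"
}
```

For example, control_center_url "$(terraform output -raw control_center_ip)" prints the address to open in your browser.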

Rest Proxy

Using the publicly accessible Rest Proxy, we can post a test message with the following command. The request creates the topic and produces a message to it:

  # Produce a message using binary embedded data with value "Kafka" to the topic binarytest
  curl -X POST -H "Content-Type: application/vnd.kafka.binary.v2+json" \
    -H "Accept: application/vnd.kafka.v2+json" \
    --data '{"records":[{"value":"S2Fma2E="}]}' "http://<<rest-proxy-ip>>:8082/topics/binarytest"

  # Expected output from preceding command:
  {"offsets":[{"partition":0,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":null}
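
The value S2Fma2E= in the request above is the Base64 encoding of the text Kafka. As a sketch, assuming the base64 utility is available on your machine, a payload for another short text value could be built like this (binary_record is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: wrap a short text value in the binary record payload shown above.
binary_record() {
  # printf '%s' avoids the trailing newline that echo would add before encoding
  encoded=$(printf '%s' "$1" | base64)
  printf '{"records":[{"value":"%s"}]}\n' "$encoded"
}
```

The result can be passed to curl with --data "$(binary_record Kafka)". Note that some base64 implementations wrap long output across lines, so this sketch is only suitable for short values.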

SSH Connectivity to check the logs

All VMs were created with the SSH key generated at the beginning. You can now connect through the Azure Bastion Service using the .pem file located at ./modules/resource_group/<<key_name>>.pem

Debugging Ansible

Because Ansible runs from a container within the Azure network, we need a way to debug when things aren't working as expected. To enable this, uncomment the commands = ["sleep", "100000"] line on resource "azurerm_container_group" "ansible", found at ./modules/resource_group/ansible-container.tf, and then deploy the change. Once this is done, you will be able to exec onto this container from the Azure Console and run Ansible manually, or tweak configuration/code in place.
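
If you prefer the command line to the Azure Console, az container exec may work as an alternative. This is a sketch under assumptions: exec_into_ansible is a hypothetical helper, the resource group name is not stated in this README so it must be supplied, and /bin/sh is assumed to exist in the image.

```shell
#!/bin/sh
# Sketch: open a shell in the cp-ansible container group via the Azure CLI.
# $1: resource group that holds the container group (supply your own)
# $2: container group name (defaults to the name used in this README)
exec_into_ansible() {
  az container exec \
    --resource-group "$1" \
    --name "${2:-oso-devops-cp-ansible}" \
    --exec-command "/bin/sh"
}
```

The container must be running (for example, held open by the sleep command described above) for the exec session to attach.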

Additional Environments

By using terragrunt's DRY approach, creating additional environments is straightforward. Simply copy the entire production folder to a new folder (e.g. named staging), and you will be able to deploy in the same manner as production (the deployments are folder-name aware).
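
The copy step could be scripted as follows. This is a sketch: new_environment is a hypothetical helper, and the parent folder is passed in as an argument (./examples is assumed here, based on the paths used in the deployment steps).

```shell
#!/bin/sh
# Sketch: clone the production environment folder under a new name.
# $1: folder that contains production, $2: new environment name
new_environment() {
  src="$1/production"
  dest="$1/$2"
  # refuse to overwrite or nest inside an existing environment folder
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  cp -R "$src" "$dest"
}
```

For example, new_environment ./examples staging creates ./examples/staging. If the copied folders contain a .terraform directory from an earlier init, it is probably safest to delete it and run terraform init again in the new environment.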

Related Projects

Check out these related projects.

Need some help

File a GitHub issue, send us an email or tweet us.

The legals

Copyright © 2017-2022 OSO | See LICENCE for full details.

Who we are

We at OSO help teams to adopt emerging technologies and solutions to boost their competitiveness, operational excellence and introduce meaningful innovations that drive real business growth. Our developer-first culture, combined with our cross-industry experience and battle-tested delivery methods allow us to implement the most impactful solutions for your business.

Looking for support applying emerging technologies in your business? We’d love to hear from you; get in touch by email.

Start adopting new technologies by checking out our other projects, follow us on twitter, join our team of leaders and challengers, or contact us to find the right technology to support your business.