Deploying Azure Data Factory, Azure Databricks, Azure Data Lake Storage & Azure SQL Database using Terraform
Azure
Terraform
DevOps
Data Engineering
A comprehensive guide to deploying Azure data services using Terraform
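Here I am going to share some Terraform code to deploy Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Databricks (ADB), and several other necessary resources. Quoted values such as "DataFactory Name" or "VNET CIDR" are placeholders to replace with your own values.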
Table of Contents
- Deploying Azure Data Factory, Azure Databricks, Azure Data Lake Storage & Azure SQL Database using Terraform
- Table of Contents
- Resource Group
- Azure Data Factory:
- Azure Databricks:
- Virtual network:
- Network Security group for ADB:
- Public subnet for Databricks:
- Private subnet for Databricks:
- Network security group for Public subnet:
- Network security group for Private subnet:
- Data Lake storage account:
- Storage account container:
- SQL Admin password:
- SQL server:
- SQL Database:
Let’s start with a resource group where we will store all the resources required.
Resource Group
data "azurerm_client_config" "Current" {}
resource "azurerm_resource_group" "RG" {
  name     = var.ResourceGroup.Name
  location = var.ResourceGroup.Location
}
Note that the resources declared below fetch the resource group name and location from azurerm_resource_group.RG instead of repeating them.
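The snippets in this post assume a configured provider and a `ResourceGroup` variable with `Name` and `Location` attributes, neither of which is shown. A minimal sketch of what that setup could look like follows; the version constraints are assumptions, so pin whatever versions you have actually tested.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed constraint, adjust as needed
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0" # assumed constraint, adjust as needed
    }
  }
}

provider "azurerm" {
  features {}
}

# Object variable matching the var.ResourceGroup.Name / .Location references above.
variable "ResourceGroup" {
  type = object({
    Name     = string
    Location = string
  })
}
```

The `random` provider is listed because the SQL admin password further down is generated with it.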
Azure Data Factory:
resource "azurerm_data_factory" "DataFactory" {
  name                = "DataFactory Name"
  location            = azurerm_resource_group.RG.location
  resource_group_name = azurerm_resource_group.RG.name
  identity {
    type = "SystemAssigned"
  }
}
Azure Databricks:
resource "azurerm_databricks_workspace" "Databricks" {
  location                    = azurerm_resource_group.RG.location
  name                        = "Databricks Name"
  resource_group_name         = azurerm_resource_group.RG.name
  managed_resource_group_name = "Databricks Managed Resource Group"
  sku                         = "Databricks Sku"
  custom_parameters {
    no_public_ip        = true
    virtual_network_id  = azurerm_virtual_network.DatabricksVnet.id
    public_subnet_name  = azurerm_subnet.DatabricksSubnetPublic.name
    private_subnet_name = azurerm_subnet.DatabricksSubnetPrivate.name
  }
  depends_on = [
    azurerm_subnet_network_security_group_association.public,
    azurerm_subnet_network_security_group_association.private
  ]
}
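One thing to watch for: as far as I know, azurerm 3.x and later expect the NSG association IDs inside `custom_parameters` whenever `virtual_network_id` is set. If your provider version complains about missing arguments, a sketch of the extended block could look like this (it references the two association resources declared later in this post):

```hcl
custom_parameters {
  no_public_ip        = true
  virtual_network_id  = azurerm_virtual_network.DatabricksVnet.id
  public_subnet_name  = azurerm_subnet.DatabricksSubnetPublic.name
  private_subnet_name = azurerm_subnet.DatabricksSubnetPrivate.name

  # Assumed to be required on azurerm >= 3.0 when a VNet is injected.
  public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
  private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
}
```

Because these arguments reference the associations directly, Terraform infers the ordering on its own and the explicit `depends_on` becomes redundant, though keeping it does no harm.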
Virtual network:
resource "azurerm_virtual_network" "DatabricksVnet" {
  name                = "VNET NAME"
  resource_group_name = azurerm_resource_group.RG.name
  location            = azurerm_resource_group.RG.location
  address_space       = ["VNET CIDR"]
}
Network Security group for ADB:
resource "azurerm_network_security_group" "DatabricksNSG" {
name = "VirtualNetwork NSG Name"
resource_group_name = azurerm_resource_group.RG.name
location = azurerm_resource_group.RG.location
}
Public subnet for Databricks:
resource "azurerm_subnet" "DatabricksSubnetPublic" {
name = "VirtualNetwork PublicSubnet Name"
resource_group_name = azurerm_resource_group.RG.name
virtual_network_name = azurerm_virtual_network.DatabricksVnet.name
address_prefixes = ["VirtualNetwork PublicSubnet CIDR"]
service_endpoints = ["Microsoft.Storage"]
delegation {
name = "Microsoft.Databricks.workspaces"
service_delegation {
name = "Microsoft.Databricks/workspaces"
actions = [
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
"Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"]
}
}
}
Private subnet for Databricks:
resource "azurerm_subnet" "DatabricksSubnetPrivate" {
name = "VirtualNetwork PrivateSubnet Name"
resource_group_name = azurerm_resource_group.RG.name
virtual_network_name = azurerm_virtual_network.DatabricksVnet.name
address_prefixes = ["VirtualNetwork PrivateSubnet CIDR"]
delegation {
name = "Microsoft.Databricks.workspaces"
service_delegation {
name = "Microsoft.Databricks/workspaces"
actions = [
"Microsoft.Network/virtualNetworks/subnets/join/action",
"Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
"Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"]
}
}
}
Network security group for Public subnet:
resource "azurerm_subnet_network_security_group_association" "public" {
subnet_id = azurerm_subnet.DatabricksSubnetPublic.id
network_security_group_id = azurerm_network_security_group.DatabricksNSG.id
}
Network security group for Private subnet:
resource "azurerm_subnet_network_security_group_association" "private" {
subnet_id = azurerm_subnet.DatabricksSubnetPrivate.id
network_security_group_id = azurerm_network_security_group.DatabricksNSG.id
}
Now that all the associated network configuration is done, let’s move on to creating the Data Lake Storage account.
Data Lake storage account:
resource "azurerm_storage_account" "DataLake" {
name = "DataLake Name"
resource_group_name = azurerm_resource_group.RG.name
location = azurerm_resource_group.RG.location
account_tier = "DataLake Tier"
account_replication_type = "DataLake Replication"
is_hns_enabled = true
min_tls_version = "DataLake TLSVersion"
network_rules {
# bypass = "AzureServices"
default_action = "Allow"
}
}
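The block above leaves the account open to all networks (`default_action = "Allow"`). Since the public Databricks subnet already carries the `Microsoft.Storage` service endpoint, one possible tightening is to deny by default and allow only that subnet. This is a sketch of an alternative `network_rules` block, not what the original configuration does:

```hcl
network_rules {
  default_action             = "Deny"
  bypass                     = ["AzureServices"]
  # Only subnets with the Microsoft.Storage service endpoint can be listed here.
  virtual_network_subnet_ids = [azurerm_subnet.DatabricksSubnetPublic.id]
}
```

Be aware that with `Deny`, the machine running Terraform also needs access (for example through `ip_rules`), otherwise creating the storage container in the next step is likely to fail.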
Storage account container:
resource "azurerm_storage_container" "DataLakeContainer" {
  # for_each accepts a set or a map, not a bare string
  for_each              = toset(["DataLake Container"])
  name                  = each.key
  storage_account_name  = azurerm_storage_account.DataLake.name
  container_access_type = "private"
}
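`for_each` only accepts a map or a set of strings, so the container names have to be supplied as a collection. A minimal sketch of driving the block above from a variable follows; the variable name `DataLakeContainers` and the container names `raw` and `curated` are made up for illustration. It is a variant of the block above, meant to replace it rather than sit next to it.

```hcl
# Hypothetical variable holding the container names.
variable "DataLakeContainers" {
  type    = set(string)
  default = ["raw", "curated"]
}

resource "azurerm_storage_container" "DataLakeContainer" {
  for_each              = var.DataLakeContainers
  name                  = each.key
  storage_account_name  = azurerm_storage_account.DataLake.name
  container_access_type = "private"
}
```

Each container is then addressable by name, for example `azurerm_storage_container.DataLakeContainer["raw"]`.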
Now, let us create the SQL-related resources.
SQL Admin password:
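resource "random_string" "SQLAdminPassword" {
  # length must cover min_upper + min_numeric + min_special (6 here);
  # Azure SQL also requires passwords of at least 8 characters
  length      = 16
  special     = true
  min_upper   = 2
  min_numeric = 2
  min_special = 2
}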
SQL server:
resource "azurerm_mssql_server" "SQLServer" {
name = "SQLServer Name"
resource_group_name = azurerm_resource_group.RG.name
location = azurerm_resource_group.RG.location
version = "SQLServer Version"
administrator_login = "SQLServer AdministratorLogin"
administrator_login_password = random_string.SQLAdminPassword.result
minimum_tls_version = "SQLServer TLS Version"
}
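A side note on the password: `random_string` shows its result in plan and apply output. The random provider also offers `random_password`, which takes the same arguments and marks `result` as sensitive. A sketch of that swap follows; the length of 16 is my assumption, and in both cases the value still ends up in the state file.

```hcl
resource "random_password" "SQLAdminPassword" {
  length      = 16 # assumed; must be >= the sum of the min_* values
  special     = true
  min_upper   = 2
  min_numeric = 2
  min_special = 2
}

# The SQL server would then reference it as:
#   administrator_login_password = random_password.SQLAdminPassword.result
```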
SQL Database:
resource "azurerm_mssql_database" "SQLDatabase" {
name = "SQLDatabase Name"
server_id = azurerm_mssql_server.SQLServer.id
collation = "SQL_collation"
max_size_gb = "SQLDatabase MaxSizeGB"
sku_name = "SQLDatabase SKU"
zone_redundant = "SQLDatabase ZoneRedundant"
}
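Once everything is applied, a few outputs make it easier to find the endpoints you need to connect the pieces. This is an optional sketch; the output names are my own choice, while the attributes come from the resources declared above.

```hcl
output "DatabricksWorkspaceUrl" {
  value = azurerm_databricks_workspace.Databricks.workspace_url
}

output "DataLakeDfsEndpoint" {
  value = azurerm_storage_account.DataLake.primary_dfs_endpoint
}

output "SQLServerFqdn" {
  value = azurerm_mssql_server.SQLServer.fully_qualified_domain_name
}
```

Running `terraform output` after `terraform apply` prints these values.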
Taken together, these part-by-part snippets create a running ADB and ADF setup. Feel free to reach out if anything needs clarification!