Terraform - Azure HDInsight HBase accelerated disk write support
Originally published on an external platform.
The community has long asked for HBase Accelerated Writes support in terraform-provider-azurerm.
- GitHub Issue: Accelerated disk write #9142
According to the Azure API, the worker node role in the cluster's compute profile must declare a data disk group to enable accelerated writes:
{
  "properties": {
    "computeProfile": {
      "roles": [
        {
          "name": "workernode",
          "dataDisksGroups": [
            {
              "disksPerNode": 1
            }
          ]
        }
      ]
    }
  }
}
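As a rough illustration of that payload shape, here is a minimal Go sketch that builds the same JSON. The type and function names (clusterRequest, workerRole, requestJSON) are made up for this example and are not the Azure SDK's own types; only the JSON field names come from the payload above. The assumption that the disk group is simply left out when accelerated writes are off is also mine.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types that mirror only the fields shown in the payload above.
type dataDisksGroup struct {
	DisksPerNode int `json:"disksPerNode"`
}

type role struct {
	Name            string           `json:"name"`
	DataDisksGroups []dataDisksGroup `json:"dataDisksGroups,omitempty"`
}

type computeProfile struct {
	Roles []role `json:"roles"`
}

type clusterProperties struct {
	ComputeProfile computeProfile `json:"computeProfile"`
}

type clusterRequest struct {
	Properties clusterProperties `json:"properties"`
}

// workerRole attaches a single-disk group only when accelerated writes
// are requested (assumption: without it, the field is omitted entirely).
func workerRole(acceleratedWrites bool) role {
	r := role{Name: "workernode"}
	if acceleratedWrites {
		r.DataDisksGroups = []dataDisksGroup{{DisksPerNode: 1}}
	}
	return r
}

// requestJSON serializes a request that contains just the worker role.
func requestJSON(acceleratedWrites bool) (string, error) {
	req := clusterRequest{
		Properties: clusterProperties{
			ComputeProfile: computeProfile{
				Roles: []role{workerRole(acceleratedWrites)},
			},
		},
	}
	out, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	for _, enabled := range []bool{true, false} {
		s, err := requestJSON(enabled)
		if err != nil {
			panic(err)
		}
		fmt.Printf("acceleratedWrites=%v: %s\n", enabled, s)
	}
}
```

A real cluster request carries many more fields (head and Zookeeper roles, hardware profiles, storage, and so on); the sketch keeps only the part that matters for accelerated writes.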
The Problem
In the Terraform provider’s schema (specifically hdinsight/schema.go), the HDInsightNodeDefinition struct defines whether disks can be specified:
type HDInsightNodeDefinition struct {
	CanSpecifyInstanceCount  bool
	MinInstanceCount         int
	MaxInstanceCount         *int
	CanSpecifyDisks          bool
	MaxNumberOfDisksPerNode  *int
	FixedMinInstanceCount    *int32
	FixedTargetInstanceCount *int32
	CanAutoScaleByCapacity   bool
	CanAutoScaleOnSchedule   bool
}
However, the plugin code had a hardcoded definition for hdInsightHBaseClusterWorkerNodeDefinition that disabled disk specification:
var hdInsightHBaseClusterWorkerNodeDefinition = HDInsightNodeDefinition{
	CanSpecifyInstanceCount: true,
	MinInstanceCount:        1,
	CanSpecifyDisks:         false, // This was the blocker
	CanAutoScaleOnSchedule:  true,
}
The Solution
To enable this feature, I added a second worker node definition that allows disks and caps them at one per node, leaving the original definition untouched:
var hdInsightHBaseClusterWorkerNodeDefinitionWithAcceleratedWrites = HDInsightNodeDefinition{
	CanSpecifyInstanceCount: true,
	MinInstanceCount:        1,
	CanSpecifyDisks:         true, // Enabled
	CanAutoScaleOnSchedule:  true,
	MaxNumberOfDisksPerNode: utils.Int(1),
}
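To make the role of these two fields more concrete, here is a small self-contained sketch of how a definition could gate a requested disk count. The validateDisksPerNode helper and the trimmed nodeDefinition type are hypothetical and written only for this post; the provider's actual schema and validation code may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeDefinition is a trimmed, hypothetical stand-in for
// HDInsightNodeDefinition with only the disk-related fields.
type nodeDefinition struct {
	CanSpecifyDisks         bool
	MaxNumberOfDisksPerNode *int
}

func intPtr(v int) *int { return &v }

// validateDisksPerNode is a hypothetical helper: it rejects a disk count
// for node types that do not allow disks, or when the count exceeds the
// configured maximum. A count of 0 means "no disks requested".
func validateDisksPerNode(def nodeDefinition, requested int) error {
	if requested == 0 {
		return nil
	}
	if requested < 0 {
		return errors.New("disks per node must not be negative")
	}
	if !def.CanSpecifyDisks {
		return errors.New("this node type does not allow data disks")
	}
	if def.MaxNumberOfDisksPerNode != nil && requested > *def.MaxNumberOfDisksPerNode {
		return fmt.Errorf("at most %d disk(s) per node allowed, got %d",
			*def.MaxNumberOfDisksPerNode, requested)
	}
	return nil
}

func main() {
	plain := nodeDefinition{CanSpecifyDisks: false}
	accelerated := nodeDefinition{CanSpecifyDisks: true, MaxNumberOfDisksPerNode: intPtr(1)}

	fmt.Println("plain, 1 disk:", validateDisksPerNode(plain, 1))
	fmt.Println("accelerated, 1 disk:", validateDisksPerNode(accelerated, 1))
	fmt.Println("accelerated, 2 disks:", validateDisksPerNode(accelerated, 2))
}
```

With the original worker definition, any disk request would be refused; with the accelerated-writes definition, exactly one disk per node passes.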
I also added a new configuration parameter enable_accelerated_writes to the Terraform plugin schema to give users the choice:
"enable_accelerated_writes": {
	Type:     pluginsdk.TypeBool,
	Optional: true,
	ForceNew: true, // The resource will be re-created if this changes
	Default:  false,
},
Then, I implemented logic to decide which node definition to use based on that flag:
func decideHDInsightNodeDefinition(enableWrites bool) hdInsightRoleDefinition {
	var hbaseRoles hdInsightRoleDefinition
	if enableWrites {
		hbaseRoles = hdInsightRoleDefinition{
			HeadNodeDef:      hdInsightHBaseClusterHeadNodeDefinition,
			WorkerNodeDef:    hdInsightHBaseClusterWorkerNodeDefinitionWithAcceleratedWrites,
			ZookeeperNodeDef: hdInsightHBaseClusterZookeeperNodeDefinition,
		}
	} else {
		hbaseRoles = hdInsightRoleDefinition{
			HeadNodeDef:      hdInsightHBaseClusterHeadNodeDefinition,
			WorkerNodeDef:    hdInsightHBaseClusterWorkerNodeDefinition,
			ZookeeperNodeDef: hdInsightHBaseClusterZookeeperNodeDefinition,
		}
	}
	return hbaseRoles
}
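Since the two branches differ only in the worker node definition, the same selection can be written more compactly. The following is a self-contained sketch with stub types and shortened names (nodeDefinition, roleDefinition, decideRoles) that I made up so it compiles outside the provider; it is not the code from my fork.

```go
package main

import "fmt"

// Trimmed, hypothetical stand-ins for the provider's types.
type nodeDefinition struct {
	Label           string // only used for printing in this sketch
	CanSpecifyDisks bool
}

type roleDefinition struct {
	HeadNodeDef      nodeDefinition
	WorkerNodeDef    nodeDefinition
	ZookeeperNodeDef nodeDefinition
}

var (
	headNode              = nodeDefinition{Label: "head"}
	zookeeperNode         = nodeDefinition{Label: "zookeeper"}
	workerNode            = nodeDefinition{Label: "worker"}
	workerNodeAccelerated = nodeDefinition{Label: "worker (accelerated writes)", CanSpecifyDisks: true}
)

// decideRoles starts from the default roles and swaps in the
// disk-enabled worker definition only when the flag is set.
func decideRoles(enableWrites bool) roleDefinition {
	roles := roleDefinition{
		HeadNodeDef:      headNode,
		WorkerNodeDef:    workerNode,
		ZookeeperNodeDef: zookeeperNode,
	}
	if enableWrites {
		roles.WorkerNodeDef = workerNodeAccelerated
	}
	return roles
}

func main() {
	for _, enabled := range []bool{false, true} {
		roles := decideRoles(enabled)
		fmt.Printf("enable_accelerated_writes=%v -> worker: %s\n", enabled, roles.WorkerNodeDef.Label)
	}
}
```

Both forms behave the same; the compact one just makes it easier to see that the head and Zookeeper definitions never change.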
Usage
After these changes, I was able to create an HBase cluster with accelerated disk writes using standard Terraform HCL (the contents of the nested blocks are omitted here for brevity):
resource "azurerm_hdinsight_hbase_cluster" "example" {
  name                      = "example-hdicluster"
  resource_group_name       = azurerm_resource_group.example.name
  location                  = azurerm_resource_group.example.location
  cluster_version           = "3.6"
  enable_accelerated_writes = true
  tier                      = "Standard"

  component_version {}

  gateway {}

  storage_account {}

  roles {
    head_node {}

    worker_node {}

    zookeeper_node {}
  }
}
You can find my forked version of the Azure provider here: tfproviders/terraform-provider-azurerm.