Deploying Azure SQL Database Bacpac and Terraform

This post will take a look at how to deploy a bacpac to Azure SQL Database with Terraform. This post builds on my previous post “Deploying Azure SQL Database with Terraform“, it it probably worth giving that a quick read before going through this post.

Assumptions

I’m going to be making the following assumptions about your setup so that if you are following along you won’t wonder why something might not be working quite right.

  1. You already have an Azure subscription with sufficient permissions to deploy a resource group and resources within it.
  2. You have a scripting IDE for writing the HCL (I’m using VSCode with the Azure Terraform, and Hashicorp Terraform extensions).
  3. You have downloaded and configured Terraform so that it is in the PATH environment variable.
  4. You have installed the Azure CLIlogged in, and set your default subscription.
  5. You have already created a bacpac file from your source database, staged it in Azure blob storage, and generated a SAS token.

Code

All of the code for this post can be found in this GitHub repository.

What are we doing today?

It’s all well and good deploying Azure SQL Database resources as we did in the previous post. However, databases tend to work a little better with a schema and some data in them. One of the options for getting data from an on-premises SQL Server database into Azure SQL Database is via a bacpac. This is, at its core, an export of the schema and data into a single file which is then created and loaded to Azure SQL Database. Much the same as a MySQL dump operates.

There are two main routes to achieving this end goal. One is to deploy and empty database and then deploy the bacpac via SqlPackage, the other is to include the bacpac as the source for the database created in Azure. It is the latter process we will be looking at in this post.

Terraform Project Layout

The code for this project is largely based on that used for the previous post. However, we will be making some modifications to account for the activity we are now performing. For the purposes of this post and making it easier to follow I am breaking the database creation out into its own HCL file and removing it from the main.tf.

The key difference between the prior deployment and this one is that we need to setup the firewall on the Azure SQL Server. If we don’t then the deployment from the bacpac in blob storage will fail. The configuration that we need to enable is “Allow Azure services and resources to access this server”, this allows the two Azure services to communicate.

Setting the Azure SQL Server Firewall

Within the AzureRM provider for Terraform there is a resource type of azurerm_sql_firewall_rule which we can use to make this configuration. Normally for the firewall configuration there we need to specify the start and end IP addresses in a range. However, for this configuration we set them both to 0.0.0.0. This is then interpreted by Azure to mean that we want a firewall rule to allow Azure services.

resource "azurerm_sql_firewall_rule" "allowAzureServices" {
  name                = "Allow_Azure_Services"
  resource_group_name = azurerm_sql_server.sql01.resource_group_name
  server_name         = azurerm_sql_server.sql01.name
  start_ip_address    = "0.0.0.0"
  end_ip_address      = "0.0.0.0"
}

While I normally use variables to set most resource properties, in this case it is a special situation and I am going to provide values in the resource definition as they are specific. Because we need to have an Azure SQL Server in place before we can create this rule we are going to get the values for the resource group name and server name from the azurerm_sql_server resource. By doing this we are creating the implicit dependency between the resources so that Terraform can schedule creation of them appropriately.

Defining the Database Resource

In the previous post we used the azurerm_mssql_database resource type, today we will be using the azurerm_sql_database resource. This is because the deployment of a bacpac is only supported through this resource type. If we wanted to continue using the azurerm_mssql_database then we would need to look at the two stage deployment model.

The resource definition here is comprised of two main sections, the first being the details around where the database needs to go. The second part being a sub-block which defines the bacpac source details. This is where we need to put in the URI for the bacpac file and the storage key, in this case we are using the SAS token for the key to allow access to the bacpac.

We also need to provide the username and password for the server we are creating to allow the import to work because it needs to have authorisation to the Azure SQL Server to work. In this situation we can use the same variables used to define those when the server is created.

resource "azurerm_sql_database" "appdb01" {
  depends_on                       = [azurerm_sql_firewall_rule.allowAzureServices]
  name                             = var.AzSqlDbName
  resource_group_name              = azurerm_sql_server.sql01.resource_group_name
  location                         = azurerm_sql_server.sql01.location
  server_name                      = azurerm_sql_server.sql01.name
  collation                        = var.AzSqlDbCollation
  max_size_gb                      = var.AzSqlDbMaxSizeGb
  requested_service_objective_name = var.AzSqlDbSkuName

  create_mode = "Default"
  import {
    storage_uri                  = var.BacpacStorageUri
    storage_key                  = var.BacpacStorageKey
    storage_key_type             = var.BacpackStorageKeyType
    administrator_login          = var.AzSqlServerAdminAccountName
    administrator_login_password = var.AzSqlServerAdminAccountPassword
    authentication_type          = var.ServerAuthAccessType
    operation_mode               = "Import"
  }

  tags = local.tags
}

One key difference here to other resource definitions is that we are making an explicit dependency between the database resource and the firewall resource. In situations like this where there is a functional dependency between two objects that are not intrinsically linked I prefer an explicit dependency. This is mainly so that the order of resource creation is clear so that if one of our colleagues has to troubleshoot this that they have key information to hand. The result of this is that the dependency graph looks as follows.

Resource graph showing dependency between defined resources.
Figure 1. Project Resource Graph.

This clearly shows that the server resource is created and that the firewall depends on it. The database then depends on that firewall rule, as well as the variables, before it can be deployed. This is a great example of how the graph output in Terraform can be used to visualise potential deployment bottlenecks in our projects.

Deployment

Now that we have our resources defined we can deploy them to azure. Here we are going to use the Terraform Apply command, however to streamline things a little we are going to append the -auto-approve switch so that we are not prompted to confirm. This is really useful if we are looking to automate the deployment via Azure DevOps or similar.

Animated gif showing the creation of resources defined in this post in Microsoft Azure.
Figure 2. Deployment of bacpac to Azure SQL Database.

You will notice from Figure 2. above that the deployment process takes a lot longer than just creating an empty Azure SQL Database. In this example I am using the AdventureWorks2019 database as the source. This represents ~1GB of data which needs to be deployed into the Azure SQL Database. The time take here for deployment is constrained by the size of the database which we are deploying and the performance configuration of the database. We deployed this to an S0 database for this example, had we picked a higher performance tier then it would have been faster. However, if we over-sized the database for deployment optimisation then we would need to do a post-deployment update to the system to reduce the scale of the system to where we needed it.

Summary

So, here we have looked at one of the options for getting our database into Azure using Terraform and a bacpac. It is a useful way to deploy a database with a schema and data all ready to go and no need for subsequent deployment activity to get going. Would I use this as a migration methodology? It depends, mainly on how large the database is because of the way that it loads the data and the potential time taken for larger databases. For smaller ones, I would probably consider it and test it out.

The real benefit of this process is for non-production development and test environments where a replica of live is needed. It is possible to automate a regular refresh of the bacpac file with an obfuscated and/or cut down copy of live so that development teams have something solid to work with.

I would be really interested to see what people come up with to use this, feel free to post any thoughts and feedback in the comments below.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.