Skip to content

In previous post I used Terraform to provision a managed version of MariaDB (AWS RDS for MariaDB). There exist various managed version of MariaDB on the major cloud providers : AWS, Azure, Alibaba Cloud. All of these versions offer a simplification to rapidly deploy and operate MariaDB. You benefit from easy setup including High availability and backup policies. This is nice But what if you want to fully control your MariaDB setup, configuration, and operation procedures.

A pragmatic approach is to use a Docker containerized version of MariaDB and to run it on a kubernetes cluster. Using Kubernetes as an orchestrator and MariaDB Docker images as building blocks is a good choice to remain cloud agnostic and to be prepared for multicloud architectures.

But Kubernetes can be a complex beast. Installing kubernetes, keeping it highly available and maintaining it up to date can consume a lot of time/ressources.

It makes a lot of sense to use a managed version of kubernetes like Azure Kubernetes Service(AKS). It is economically interesting as with AKS we do not pay for the kubernetes control plane. Kubernetes itself is a standard building block that offers the same features on all cloud providers. Kubernetes is a graduated project of the Cloud Native computing foundation and this is a strong guaranty of interoperability across different cloud providers.

First let us login to Azure and set the SUBSCRIPTION_ID to the subscription we want to use to create the principal.

$ az login
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code XXX to authenticate.
[
  {
    "cloudName": "AzureCloud",
    "id": "07ea84f6-0ca4-4f66-ad4c-f145fdb9fc39",
    "isDefault": true,
    "name": "Visual Studio Ultimate",
    "state": "Enabled",
    "tenantId": "e1be9ce1-c3be-436c-af1d-46ffecf854cd",
    "user": {
      "name": "serge@gmail.com",
      "type": "user"
    }
  }
]
$ az account show --query "{subscriptionId:id, tenantId:tenantId}"
{
  "subscriptionId": "07ea84f6-0ca4-4f66-ad4c-f145fdb9fc39",
  "tenantId": "e1be9ce1-c3be-436c-af1d-46feecf854cd"
}
$ export SUBSCRIPTION_ID="07ea84f6-0ca4-4f66-ad4c-f145fdb9fc39"

We first need to create an Azure AD service principal. This is the service account that will be used by Terraform to create resources in Azure.

$ az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${SUBSCRIPTION_ID}"
{
  "appId": "7b6c7430-1d17-49b8-80be-31b087a3b75c",
  "displayName": "azure-cli-2018-11-09-12-15-59",
  "name": "http://azure-cli-2018-11-09-12-15-59",
  "password": "7957f6ca-d33f-4fa3-993a-8fb018c10fe2",
  "tenant": "e1be9ce1-c3be-436c-af1d-46feecf854cd"
}

We also create an azure storage container to store the Terraform state file. This is required to have multiple devops working on the same infrastructure through a shared storage configuration.

az storage container create -n tfstate --account-name=‘csb07ea84f60ca4x4f66xad4’ --account-key=‘Wo27SNUV3Bkanod424uHStXjlch97lQUMssKpIENytbYsUjU3jscz9SrVogO1gClUZd3qvZs/UwBPqX0gJOU8Q==‘

We now define a set of variables that will be used by terraform to provision our resources.

# variables.tf
 
variable "client_id" {}
variable "client_secret" {}
variable "agent_count" {
    default = 1
}
variable "ssh_public_key" {
    default = "~/.ssh/id_rsa.pub"
}
variable "dns_prefix" {
    default = "k8stest"
}
variable cluster_name {
    default = "k8stest"
}
variable resource_group_name {
    default = "azure-k8stest"
}
variable location {
    default = "Central US"
}

We can now define the terraform main.tf file.

# main.tf
 
provider "azurerm" {
    version = "~>1.5"
}
 
terraform {
#    backend "azurerm" {}
}
 
provider "kubernetes" {
  host                   = "${azurerm_kubernetes_cluster.k8s.kube_config.0.host}"
  client_certificate     = "${base64decode(azurerm_kubernetes_cluster.k8s.kube_config.0.client_certificate)}"
  client_key             = "${base64decode(azurerm_kubernetes_cluster.k8s.kube_config.0.client_key)}"
  cluster_ca_certificate = "${base64decode(azurerm_kubernetes_cluster.k8s.kube_config.0.cluster_ca_certificate)}"
}

we can now describe the managed kubernetes cluster we want to create

# k8s.tf
 
resource "azurerm_resource_group" "k8s" {
    name     = "${var.resource_group_name}"
    location = "${var.location}"
}
 
resource "azurerm_kubernetes_cluster" "k8s" {
    name                = "${var.cluster_name}"
    location            = "${azurerm_resource_group.k8s.location}"
    resource_group_name = "${azurerm_resource_group.k8s.name}"
    dns_prefix          = "${var.dns_prefix}"
 
    linux_profile {
        admin_username = "ubuntu"
 
        ssh_key {
        key_data = "${file("${var.ssh_public_key}")}"
        }
    }
 
    agent_pool_profile {
        name            = "default"
        count           = "${var.agent_count}"
        vm_size         = "Standard_DS2_v2"
        os_type         = "Linux"
        os_disk_size_gb = 30
    }
 
    service_principal {
        client_id     = "${var.client_id}"
        client_secret = "${var.client_secret}"
    }
 
    tags {
        Environment = "Development"
    }
}

We also describe the variables that will be output by the execution of our terraform script.

# output.tf
 
output "client_key" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.client_key}"
}
output "client_certificate" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.client_certificate}"
}
output "cluster_ca_certificate" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.cluster_ca_certificate}"
}
output "cluster_username" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.username}"
}
output "cluster_password" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.password}"
}
output "kube_config" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config_raw}"
}
output "host" {
    value = "${azurerm_kubernetes_cluster.k8s.kube_config.0.host}"
}

We first initialize the terraform state.We then check the plan before it get executed.We now ask terraform to create all the resources described in the .tf files including the AKS Kubernetes cluster.

$ terraform init
$ terraform plan
$ terraform apply

We know have our running managed Azure Kubernetes cluster(AKS). We define a shell script file containing the set of variables required to run kubectl (the kubernetes client)

more setvars.sh 
#!/bin/sh
echo "Setting environment variables for Terraform"
export ARM_SUBSCRIPTION_ID="07ea84f6-0ca4-4f66-ad4c-f145fdb9fc37"
export ARM_CLIENT_ID="7b6c7430-1d17-49b8-80be-31b087a3c75c"
export ARM_CLIENT_SECRET="7957f6ca-d33f-4fa3-993a-8fb018c11fe1"
export ARM_TENANT_ID="e1be9ce1-c3be-436c-af1d-46feecf854cd"
export TF_VAR_client_id="7b6c7430-1d17-49b8-80be-31b087a3b76c"
export TF_VAR_client_secret="7957f6ca-d33f-4fa3-993a-8fb019c10fe1"
export KUBECONFIG=./azurek8s
$ . ./setvars.sh 
$ echo "$(terraform output kube_config)" > ./azurek8s

This will basically export the kubernetes configuration file required to be able to use the kubectl command. We can now start to use the kubectl kubernetes client. Let us create a kubernetes secret containing the mariadb password.

$ kubectl create secret generic mariadb-password --from-literal=password=password1
 
$ az aks get-credentials --resource-group azure-k8stest --name k8stest
$ az aks browse --resource-group azure-k8stest --name k8stest

We now create a persistent volume claim. This is very important for MariaDB as MariaDB is a stateful application and we need to define the storage used for persisting the database. We then deploy the mariadb docker container which will use this physical volume for the database storage. we then create a service to expose the database.

$ kubectl apply -f persistent-volume-claim.yaml 
$ kubectl apply -f mariadb.yaml 
$ kubectl apply -f mariadb-service.yaml

Once the service is created the service discovery can be used. Kubernetes offers an internal dns and a ClusterIP associated with the the service. This model make a lot of sense as it abstract us from the pod IP address. Once you have created a kubernetes MariaDB service it can be accessed from another container running inside this same cluster. In that case a mysql client.

$ kubectl expose deployment mariadb --type=NodePort --name=mariadb-service
$ kubectl run -it --rm --image=mariadb --restart=Never mysql-client --  mysql -h mariadbservice -ppassword1

It is also possible to expose the mariadb service through a public IP. This is of course something usually not done for security reasons. kubernetes asks Azure to create a load balancer and a public IP address that can then be used to access the mariaDB server.

$ kubectl expose service mariadbservice --type=LoadBalancer --name=mariadb-service1 --port=3306 --target-port=3306
$ mysql -h 104.43.217.67 -u root -ppassword1

In the scenario we presented the infrastructure is described through code (terraform hcl language). We then use the kubernetes client and yaml kubernetes configuration file to declaratively describe software objects configuration.
It makes a lot of sense to use a managed version of kubernetes as Kubernetes itself is a standard building block that offers the same features on all cloud providers. Kubernetes being standardize with this approach we keep the freedom to run the same open source software stack (here MariaDB) across different cloud providers .

How to rapidly provision a MariaDB in the cloud ? Various option are available.
A very effective approach is to provision MariaDB with Terraform. Terraform is a powerful tool to deploy infrastructure as code. Terraform is developed by Hashicorp that started their business with the very successful Vagrant deployment tool. Terraform allows you to describe through HCL langage all the components of you infrastructure. It is aimed to make it easier to work with multiple cloud providers by abstracting the resources description.
Keep on reading!

Oracle has done a great technical work with MySQL. Specifically a nice job has been done around security. There is one useful feature that exists in Oracle MySQL and that currently does not exist in MariaDB.
Oracle MySQL offers the possibility from within the server to generate asymetric key pairs. It is then possible use them from within the MySQL server to encrypt, decrypt or sign data without exiting the MySQL server. This is a great feature. This is defined as a set of UDF (User Defined Function : CREATE FUNCTION asymmetric_decrypt, asymmetric_encrypt, asymmetric_pub_key … SONAME 'openssl_udf.so';).
Keep on reading!

Oracle MySQL 8.0 has been declared GA but a critical piece is missing …
MySQL 8 is a fantastic release embedding the work of brilliant Oracle engineering.
I will not detail all the great features of MySQL 8 as there are a lot of great presentations around it. https://mysqlserverteam.com/whats-new-in-mysql-8-0-generally-available/
Keep on reading!

Comparing Oracle MySQL Group Replication and Galera Cluster through a probability perpective seems quite interesting.

At commit time both use a group certification process that requires network round trips. The required time for these network roundtrips is what will mainly determined the cost of a transaction. Let us try to compute an estimate of the certification process cost. The duration of these network roundtrips duration can be model by random variable with an associated probability distribution law.
Keep on reading!

MariaDB 10.1 introduced Data at Rest Encryption. By default we provide a file_key_management plugin. This is a basic plugin storing keys in a file that can be itself encrypted. This file can come from a usb stick removed once keys have been brought into memory. But this remains a basic solution not suitable for security compliance rules.
Keep on reading!

A question raised by my previous post is : What about MariaDB and native JSON support ? In my previous post I mentioned the possibility to use the MariaDB CONNECT storage Engine to store and access JSON content in normal text field. Of course having a native JSON datatype brings more value. It introduces JSON validation, a more efficient access to attribute through an optimized storage format.
Keep on reading!

It is not new that we can store a JSON content in a normal table text field. This has always been the case in the past. But two key features were missing : filtering based on JSON content attributes and indexing of the JSON content. With MariaDB 10.1 CONNECT storage Engine we offer support for external content including JSON files. The MariaDB CONNECT storage Engine also comes with a set of JSON related UDFs. This allows us to do the following thing :
Keep on reading!

The JSON format includes the concept of array. A JSON object cant contain an attribute of array type. We have seen that we can use the MariaDB CONNECT Storage Engine provided UDFs (user defined functions) to implement dynamic columns.
Keep on reading!