My aim here is to experiment a new approach with Azure database for MySQL/MariaDB and ProxySQL.
When a connection is initiated with Azure database for MySQL/MariaDB it goes through a shared gateway. This gateway knows where the target database server is. This level of indirection is very useful to offer High Availability in a way transparent to the client application. But by default all subsequent traffic (queries) will also go through the gateway. This extra hop create some performance penalty.
The redirection mechanism for Azure Database for MySQL
Is there a way to bypass the gateway and to directly connect to the target database and to send queries directly to the target ? : Yes. This optimization mechanism is called connection redirection in the Azure world. It needs to be activated on the PaaS Azure database to be usable by the client. Once activated the gateway will fill the last packet of the connection/authentication protocol with some extra information : the hostname of the target database the port and the username. This packet is called the OK packet in the MySQL protocol.
The OK packet that contains the address of the target database server can be analyzed by the client. This means that there can be a smart way to use this information to directly connect and send subsequent connections and queries directly to the target database.
Redirection implemented in Azure MySQL/MariaDB connectors
This is exactly what has been implemented with 2 Azure MySQL connectors that implement connection redirection : the PHP connector and the Connector J for java. To make this efficient the connector cache this information in a hashmap and direct all subsequent connection directly to the target database. A connection would only be sent to the gateway if the direct connection fails. This could happen if the HA mechanism had restarted the database on a different server.
Connection redirection implemented at the connector level is great. The only issue with this approach is that you have to use a Azure specific connectors. Currently the modifications made have not been integrated upstream and is available for a very little set of connectors. For many users that use package manager to get their software they will not get the connector version with this redirection feature. The same problem occurs if you are using software prepackaged as container images that embed a predefined version of the connector.
Making ProxySQL redirection aware ?
How can we solve this issue in an elegant way 🙂 ? Is there another approach to this problem that would not require the use of specific connectors ? Why not include this redirection feature directly in ProxySQL. In this way any application whatever the connector used will benefit from the redirection mechanism if the connection goes through ProxySQL. Many customers already use proxy SQL for connection pooling/multiplexing, load balancing, caching.
Using ProxySQL for connection redirection seems an interesting approach to explore. ProxySQL relies on MariaDB connector C library to connect to MySQL or MariaDB. During a connection to the server the gateway once the connection is correct returns an OK packet that contains all the information required to connect directly to the back-end server. Once a connection has been established through the gateway the caching of the target database is very effective as ProxySQL is a middleware used by many clients. This is less true for a connector based approach.
What we tested
This hack works but was it worth the effort ? How can we benchmark this feature to measure the benefit ? We will use sysbench with 3 different scenarii :
- normal connection through the gateway. All queries will also go through the gateway. This is the behavior you get by default.
- connection through ProxySQL with redirection mechanism implemented and activated
- connection through Proxysql without redirection mechanism. This is the behavior you get with standard ProxySQL
The Azure Database for MySQL Server is General purpose 2 vCores 100G
I used sysbench 1.1 to get the reconnect option that force re-connection. I used "reconnect=1" which force reconnect after every query. This is to represent a PHP application for which you get connection at each page. This is an extreme case not representative of a real workload (even though we never know 😉 )
I used the script 'oltp_point_select.lua' not to have too much stress on the database.
The first test was run by injecting from my Windows10 laptop WSL linux subsystem running sysbench and ProxySQL. It goes through the internet to reach the Azure Database for MySQL. This is not production practice but it exacerbate the network impact.
Test with proxySQL redirection
When looking at these results we notice important differences :
We get a much higher throughput when going through ProxySQL
We get a much lower latency 95 percentile which means that most queries end faster.
We get a very stable execution time with small outliers. On the opposite with access through the gateway we get big outliers even under small workload.
Test with ProxySQL and no redirection
But do these results come from the redirection or from ProxySQL. Let us look at sysbench result for the same workload :
In fact most of the benefit comes from ProxySQL. The throughput are almost the same.
The only significant difference is the stability of the response time. ProxySQL with redirection gives more stable results with a smaller standard deviation.
ProxySQL and Azure Database for MySQL/MariaDB are a great fit. The connection pooling implementation gives a great benefit. This covers a wide range of languages/frameworks and applications that do not offer a connection pooling mechanism and that generate a lot of connections ( PHP, WordPress, Drupal, Magento ...). In fact in many cases ProxySQL is a mandatory requirement to get great performance and fully use the power of your Azure Database for MySQL/MariaDB.
Implementing connection redirection at the ProxySQL level is something that is certainly worth exploring for execution time stability.
other tests :
Some valuable links by Amol Bhatnagar ( @ambhatnaif ) if you want to implement ProxySQL with Azure Database for MySQL/MariaDB.
Load balance read replicas using ProxySQL in Azure Database for MySQL
Deploy ProxySQL as a service on Kubernetes using Azure Database for MySQL
Scaling an Azure Database for MySQL workload running on Kubernetes with read replicas and ProxySQL
Connecting efficiently to Azure Database for MySQL with ProxySQL
Set up ProxySQL clustering to connect efficiently to Azure Database for MySQL
Ingest ProxySQL Metrics into the Azure Monitor Log Analytics Workspace