MySQL Fabric is a very promising sharding framework. If I take Ulf Wendel definition of MySQL Fabric :
MySQL Fabric is an administration tool to build large “farms” of MySQL servers. In its most basic form, a farm is a collection of MySQL Replication clusters. In its most advanced form, a farm is a collection of MySQL Replication clusters with sharding on top.
So MySQL Fabric takes care of two very orthogonal features :
- High availability of servers
- Sharding of data
Let us forget about sharding and look at the High availability infrastructure.
Servers are included in groups, called "High Availability Groups" when we talk about HA.
Each Server has an associated Status (or Role): primary secondary, spare
Each Server has also a mode : Offline, Read-only, and Read-Write.
The implementation has been made to allow various HA implementation patterns.
The most common HA pattern is the Master/Slave HA group ( in that case we should call it a "replica set" which is the terminology used in MongoDB or Facebook MySQL Pool Scanner (MPS).
Mats Kindahl in his blog post on MySQL Fabric High Availability Groups mentioned that other HA solutions are possible for an availability group :
- Shared Storage with SAN or NAS
- Replicated storage like DRBD
- MySQL Cluster shared nothing cluster
In the case of a HA group based on MySQL Cluster the group is self-managing regarding HA and MySQL Fabric does not handles the failover. With the "Shared Storage" and "Replicated storage" availability groups the secondary servers will be offline.
So one of my ideas that I hope is feasible would be to use MariaDB Galera Cluster as another HA solution with MySQL Fabric. The main advantage of this solution relates to the characteristics of MariaDB Galera Cluster. MariaDB Galera Cluster is an Active-active multi-master topology with synchronous replication. MariaDB Galera Cluster being innoDB based does not carry all the usage limitations associated with MySQL Cluster (main one being limited join capabilities).
Regarding to MySQL fabric the behavior of an availability group based on MariaDB Galera Cluster is identical to MySQL Cluster. It is a self-managing availability group.
MariaDB Galera Cluster
MariaDB Galera Cluster
Getting Started with MariaDB Galera Cluster
MySQL Fabric: A new kid in the MySQL sharding world 2013-10-09 Serge Frezefond
MySQL Fabric: High Availability Groups 2013-10-21 Mats Kindahl
A Brief Introduction to MySQL Fabric 2013-09-21 Mats Kindahl
MySQL Fabric - Sharding - Introduction 2013-09-21 VN (Narayanan Venkateswaran)
MySQL Fabric - Sharding - Simple Example 2013-09-22 VN (Narayanan Venkateswaran)
MySQL Fabric - Sharding - Shard Maintenance 2013-09-27 VN
MySQL Fabric - Sharding - Migrating From an Unsharded to a Sharded Setup 2013-09-22 VN
Installing MySQL Fabric on Windows 2013-10-03 Todd Farmer
MySQL 5.7 Fabric: any good? 2013-09-23 Ulf Wendel
Writing a Fault-tolerant Database Application using MySQL Fabric 2013-09-21 Alfranio Junior
Sharding PHP with MySQL Fabric 2013-10-09 Johannes Schlüter
MySQL Fabric support in Connector/Python 2013-09-22 Geert Vanderkelen
MySQL Connector/J with Fabric Support 2013-09-21 Jess Balint
As alway with new technology there is always different approaches regarding the adoption.
You can try to use the bleeding edge features or start with a very standard configuration.
My personal advise to new users is to start with the most basic configuration.
This allow you to get familiar with the fundamentals :
- how to install
- how to operate
- how to monitor
For MariaDB Galera Cluster he most basic configuration is a 3 nodes cluster. You can chose to use it :
as an HA solution
Galera Cluster is currently the easiest way to solve the HA problem.
when you think HA think Galera Cluster it is so much simpler. Failover is totally transparent and you have nothing to do like you would have with standard replication. if you have a lodbalancer in front you just have to push out from the configuration THE FAILED NODE and that is done.
as a scale out solution
A usual MySQL scale out architecture is based on master/slaves architecture. This solution incurs to the application the choice of where to send the read (Master or most up to date slave…). Scale out of read with Galera cluster synchronous replication is much simpler. Nothing need to be done at the application layer.
You have a synchronous data up to data available on all nodes . You do not have the risk to read stale data when the replication lags. Nothing need to be done at the application level like taking care of reading were you write to have correct data.
Contrary to usual HA solutions or compare to MySQL cluster Galera Cluster is very simple to setup and operate.
Getting Started with MariaDB Galera Cluster
Then of course next step is to push Galera usage a little bit further. One main area is of course write scalability.
A few benchmark have been produced but we do not have yet much experience and you have to be more careful. With write/write configurations you have to be careful about hot spots in the database. This can lead to deadlock and the behavior has to be correctly understood.
A good distribution for download to do your testing is : MariaDB Galera Cluster 5.5.29 Stable
Some useful pointers to understand various behaviours of Galera Cluster.
------- upcoming events :
Tomorrow Seppo Jaakola, Codership, will present Galera Cluster for MySQL & MariaDB at the Meetup SkySQL & MariaDB - Paris
Henrik Ingo, Max Mether and Colin Charles will present "MariaDB Galera Cluster Overview" at the free MySQL & Cloud Solutions Day taking place in Santa Clara the 26th of April. You can register for free.
Last week Solutions Linux / Open Source event was held in Paris.
Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.
Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.
As is normal now, some subjects are getting stronger :
The cloud was a hot topic with a good keynote by UBUNTU. Virtualization related to cloud architecture was also presented by hosting providers.
Extreme scaleout with Hadoop (Yahoo implementation of Google Map reduce) get a lot of interest.
NoSQL databases solutions was also a hot subject. I must confess that I was skeptical about broad usage of NoSQL beside the top web actors.
But this times I met my first customers talking seriously about it and doing serious testing with Cassandra and Voldemort.
So I decided to follow a tutorial on using Hadoop. very interesting.
Microsoft was gold sponsor of the event. Microsoft presented themselves as now as a good friendly contributor to the Linux kernel. Beside that they try to optimize all open source solutions running on Windows Server. And of course they want to play a key role in the new cloud economy. In their Azure cloud framework they can handle Linux, PHP, MySQL . They seem to sincerely want to be active and friendly in the open source segment.
There was a full set of tutorials on database technology. Postgresql, NoSQL technologies, Ingres, PostGIS were there.
I presented a tutorial on MySQL. Our MySQL users and customers are always as passionate. They are eager to use our next production releases. Many of them have started using InnoDB pluggin and its advanced capabilities (compression, fast index rebuild ...). Some have started testing MySQL 5.5 and are very impressed by all the performance gains + extra functionalities. MySQL is really on it way to become "the fasted open source commodity database on the market". I of course got some questions on when we will release GA version. MySQL proxy GA was also mentioned and asked for.