The new CONNECT Storage Engine with MariaDB 10.0.2

SkySQL and MariaDB are very pleased to announce a new baby in the storage Engine family : The “CONNECT Storage Engine”
I have had the pleasure to work with Olivier Bertrand creator of the CONNECT storage Engine. Olivier has over 50 years of experience in database development as an ex-IBM database researcher. Olivier was very interested in adding the result of his experience around BI data access to MariaDB. He wanted a more versatile way to access external data sources in various formats. The CONNECT storage engine enables MariaDB to directly use external data as if it were tables in the server without actually being obliged to load them into the physical database.
Olivier has been welcomed by the MariaDB team and has been working with them to integrate his work, build test cases and fix remaining bugs. The CONNECT storage engine is 100% open source and released under the GPL license. 

The integration of CONNECT to MariaDB is yet another example of the innovation going on around the MariaDB server. The “MariaDB  CONNECT Storage Engine” adds one more innovative differentiation to the MariaDB ecosystem (Cassandra, LevelDB, TokuDB, Spider being other nice examples). All these extra features are available under the GPL and they will come with the standard MariaDB release.

So yes the storage engine arena is very lively. Stewart was worried in his post Where are they now: MySQL Storage Engines . I think that storage engines is where innovation is taking place to turn MariaDB into a database platform.

The CONNECT storage engine give you access to various external data sources either local or remote. This includes files in various formats such as DBase, CSV, DOS, FMT and XML.
It also includes access to remote MySQL tables and almost any other data-source through an ODBC supported target.

You can also build a table based on a list of tables (equivalent of the MERGE storage engine). These table list (TBL) can be built on any type of table including any CONNECT table. This means that you can build a table whose data is distributed across multiple remote servers.

You can also define a table with per column storage (VEC type). This can greatly benefit tables with a lot of columns where only a limited subset is queried.
The CONNECT storage engine offers a major enhancement of what was previously offered by a combination of the CSV, MERGE and FEDERATED storage engines.

This storage engine is smart :
it can auto discover the structure of your data based on the metadata contained within the file.
it can index your data, optimizing for read only access.
it can work on compressed data.

It is a storage engine. It is built into MariaDB 10 which means that it can be used once the server is up and running. Just start creating tables with the “engine=connect” option and specify the files or the remote tables you want to access. This is documented with various examples.
As with any storage engine it is built using the MySQL Plugin API. It is a smart storage engine that uses condition push down to filter at the target data source. This can greatly increase performance by limiting data transfer. It also implements the limit clause of select. This storage engine has been designed with Business Intelligence (BI) queries in mind . It is not designed to be used with Online Transaction Processing (OLTP) activity. This choice allows for the optimization of code by leaving out a lot of costly internal mechanisms needed only for OLTP. As with any storage engine your external data source data can be joined with your local data.

When should you use the CONNECT storage engine ? When you have data in various file formats or remotely available or both and you would like to query this data. The main benefit of this approach to BI is that there is no need to go through a lengthy ETL (Extract, Transform, Load) process. The data can be directly accessed in place.

If you want to talk about it you are welcomed at the free SkySQL/MariaDB event immediately following Percona Live 2013 : “MySQL & Cloud Database Solutions Day”
To try it
CONNECT storage Engine documentation in MariaDB knowledge base
SkySQL support the CONNECT storage Engine
Get the white paper

4 comments to The new CONNECT Storage Engine with MariaDB 10.0.2

  • Would it be possible to make it somewhat compliant with the SQL/MED standard?

    • Serge

      That is an excellent suggestion. We are looking at the SQL/MED (Management of External Data, extension to the SQL standard).

  • Shlomi

    Hi

    We’re currently testing usage of MySQL Federated engine, used for read-only cached tables.

    The Connect engine looks like a very interesting option as it will allow us to use the index/condition push down, thus reducing the amount of network & total query time.

    Do you have any experience with using Connect as a MySQL engine? migrating to MariaDB is not possible at current time frames, though it is on our roadmap

    Thanks!

    Shlomi

    • Serge

      MariaDB Connect Storage Engine takes benefit from some MariaDB specific features :
      It uses the autodiscovery of tables which avoid to describe the table structure when it can be guessed from the target (ODBC, MySQL …)
      It also uses the ability for a storage engine to add parameter to the create table. This is more clean than to use comment.

Leave a Reply

  

  

  


4 × nine =

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="" highlight="">