MariaDB CONNECT Storage Engine and JSONPath

The MariaDB CONNECT Storage Engine lets you access a JSON file as if it were a local table.
The mapping between a hierarchical structure and a relational structure is not 'natural' and needs to be defined. To achieve that, a specific syntax is used: each column gets a 'field_format' attribute that defines which property in the JSON hierarchy is mapped to that MariaDB table column.

With field_format='AUTHOR:*', the AUTHOR column, which is an array of objects, will remain in JSON format (we use the same JSON file example as in the previous post).

MariaDB > create table bibop1 (
     ISBN char(15),
     AUTHOR char(200) field_format='AUTHOR:*',
     PUBLISHER char(20) field_format='PUBLISHER:NAME')
     engine=CONNECT table_type=JSON file_name='/var/lib/mysql/json/biblio3.jsn';
 
MariaDB > select * from bibop1;
+-------------------------------------------------------------------------------------|
| ISBN          | AUTHOR                                            | PUBLISHER       |
+-------------------------------------------------------------------------------------|
| 9782212090819 | [{"FIRSTNAME":"Jean-Christophe","LASTNAME":"Bernadac"},
…

With field_format='AUTHOR:[]' the AUTHOR column will be represented as a comma-separated list. Each JSON object is represented as a concatenated string of its properties.

+---------------+------------------------------------------------------------------------|
| ISBN          | AUTHOR                                               | PUBLISHER       |
+---------------+------------------------------------------------------------------------|
| 9782212090819 | Jean-Christophe Bernadac, John Knab, Charles Dickens | Eyrolles        |

To obtain a normalized view of the authors' first and last names, we associate the AuthorFN column with field_format='AUTHOR:[X]:FIRSTNAME' and obtain a presentation with each author split onto its own line.

+---------------+-----------------+----------+
| ISBN          | AuthorFN        | AuthorLN |
+---------------+-----------------+----------+
| 9782212090819 | Jean-Christophe | Bernadac |

Here is a more exhaustive view of the various possibilities offered for the mapping.
We will see later that we can also use this syntax to modify a JSON document, including arrays embedded in objects.

Specification  Array Type  Description
[n]            All         Take the nth value of the array. Ignore it if n is 0.
[X] or [x]     All         Expand. Generate one row for each array value.
["string"]     String      Concatenate all values separated by the specified string.
[+]            Numeric     Make the sum of all the array values.
[*]            Numeric     Make the product of all the array values.
[!]            Numeric     Make the average of all the array values.
[>] or [<]     All         Return the greatest or least value of the array.
[#]            All         Return the number of values in the array.
[]             All         Expand if under an expanded object. Otherwise sum if numeric, else concatenation separated by ", ".
(none)         All         If an array, expand it if under an expanded object, or take its first value.
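To make the numeric and string specifications concrete, here is a small plain-Python sketch (illustration only, not CONNECT code; the function names are mine) of what each one computes over an array:

```python
from functools import reduce

# Plain-Python sketch of what the CONNECT array specifications compute
# (illustration only; CONNECT evaluates these inside the storage engine).
def spec_n(arr, n):          # [n]  take the nth value (1-based)
    return arr[n - 1]

def spec_concat(arr, sep):   # ["string"]  concatenate, separated by sep
    return sep.join(str(v) for v in arr)

def spec_sum(arr):           # [+]  sum of all the array values
    return sum(arr)

def spec_product(arr):       # [*]  product of all the array values
    return reduce(lambda a, b: a * b, arr)

def spec_avg(arr):           # [!]  average of all the array values
    return sum(arr) / len(arr)

def spec_minmax(arr):        # [>] or [<]  greatest / least value
    return max(arr), min(arr)

def spec_count(arr):         # [#]  number of values in the array
    return len(arr)

qty = [2, 3, 4]
print(spec_sum(qty))           # 9
print(spec_product(qty))       # 24
print(spec_concat(qty, ", "))  # 2, 3, 4
```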

We all know how XPath works for XML documents. There has been some work done to define a JSONPath syntax for accessing JSON document content. The scope of what has been defined by JSONPath is broader than what the MariaDB CONNECT Storage Engine needs to define the mapping between table columns and JSON object properties. Here is the syntax available with JSONPath (with the equivalent XPath syntax):

XPath  JSONPath          Description
/      $                 the root object/element
.      @                 the current object/element
/      . or []           child operator
..     n/a               parent operator
//     ..                recursive descent (JSONPath borrows this syntax from E4X)
*      *                 wildcard: all objects/elements regardless of their names
@      n/a               attribute access (JSON structures don't have attributes)
[]     []                subscript operator; XPath uses it to iterate over element collections and for predicates, while in JavaScript and JSON it is the native array operator
|      [,]               union operator; in XPath it combines node sets, while JSONPath allows alternate names or array indices as a set
n/a    [start:end:step]  array slice operator borrowed from ES4
[]     ?()               applies a filter (script) expression
n/a    ()                script expression, using the underlying script engine
()     n/a               grouping in XPath

A more complete description of what has been proposed for JSONPath is available here.
The MySQL JSON UDF functions (version 0.4.0) use part of this syntax for functions like JSON_EXTRACT or JSON_SEARCH.
With the MySQL JSON UDFs you can also express a path as a variable list of keyparts, each keypart being a JSON property or an index into a JSON array: select json_extract(doc, keypart1, keypart2 ...)
PostgreSQL, with its native JSON type, also uses a variable list of path elements to define a JSON path. For example,
json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4', 'f6'), where the keypath is 'f4', 'f6', returns foo.
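The keypart-list style of path can be emulated in a few lines of Python. The helper name below is mine, not a library function; it only mimics the spirit of PostgreSQL's json_extract_path_text:

```python
import json

# Hypothetical helper (the name is mine, not a library function) that
# mimics the keypart-list style of path used by the MySQL JSON UDFs
# and PostgreSQL's json_extract_path_text.
def extract_path_text(doc, *keyparts):
    node = json.loads(doc) if isinstance(doc, str) else doc
    for key in keyparts:
        if isinstance(node, dict):
            node = node.get(key)          # keypart is an object property
        elif isinstance(node, list) and isinstance(key, int):
            node = node[key] if 0 <= key < len(node) else None
        else:
            return None
        if node is None:
            return None
    # return scalars as text, like the _text variants do
    return node if isinstance(node, str) else json.dumps(node)

doc = '{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}'
print(extract_path_text(doc, 'f4', 'f6'))  # foo
```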

We will discover at Percona Live what syntax and set of helper functions have been chosen by Facebook and Oracle for their respective MySQL native JSON type implementations. In fact, I do not know whether the two implementations differ. We will see.

The JSONPath syntax also allows defining queries on JSON data. The following JSONPath query

$..book[?(@.price<10)]

will filter all books cheaper than 10 when applied to the following JSON document:

{ "store": {
    "book": [
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}
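Expressed in plain Python (without a JSONPath library), the filter above amounts to:

```python
import json

doc = json.loads("""
{ "store": {
    "book": [
      {"category": "reference", "author": "Nigel Rees",
       "title": "Sayings of the Century", "price": 8.95},
      {"category": "fiction", "author": "Evelyn Waugh",
       "title": "Sword of Honour", "price": 12.99},
      {"category": "fiction", "author": "Herman Melville",
       "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
      {"category": "fiction", "author": "J. R. R. Tolkien",
       "title": "The Lord of the Rings", "isbn": "0-395-19395-8",
       "price": 22.99}
    ],
    "bicycle": {"color": "red", "price": 19.95}
  }
}
""")

# $..book[?(@.price<10)] : keep every book whose price is below 10
cheap = [b for b in doc["store"]["book"] if b["price"] < 10]
print([b["title"] for b in cheap])
# ['Sayings of the Century', 'Moby Dick']
```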

With CONNECT, we first map the JSON properties to table columns and then use plain SQL to express the query.

As the MariaDB CONNECT Storage Engine JSON functionality is still experimental, your feedback on bugs, improvements and features is welcome. To file bugs: https://mariadb.atlassian.net/projects/MDEV

Additional info:
MariaDB Knowledge Base documentation for the CONNECT Storage Engine: CONNECT
GitHub repository for MariaDB 10.0, including the CONNECT Storage Engine
Presentation at the upcoming Percona Live 2015 Santa Clara conference:
MariaDB CONNECT Storage Engine to simplify heterogeneous data access (JSON focus)

MariaDB CONNECT storage engine now offers access to JSON

The MariaDB CONNECT storage engine now offers access to JSON files and allows you to see an external JSON file as a MariaDB table. JSON (JavaScript Object Notation) is a lightweight data-interchange format widely used on the Internet.
JSON, like XML, represents data hierarchically. The mapping from hierarchical data to tabular data needs to be specified.

To use the CONNECT storage engine's JSON capabilities, the 'create table' statement describes what should be retrieved from the file and how it should be processed. Let us start from the file "biblio3.json". This file consists of an array of objects, and some of the objects' properties have object or array values:

[
{
  "ISBN": "9782212090819",
  "LANG": "fr",
  "SUBJECT": "applications",
  "AUTHOR": [
    {
      "FIRSTNAME": "Jean-Christophe",
      "LASTNAME": "Bernadac"
    },
    {
      "FIRSTNAME": "François",
      "LASTNAME": "Knab"
    }
  ],
  "TITLE": "Construire une application XML",
  "PUBLISHER": {
    "NAME": "Eyrolles",
    "PLACE": "Paris"
  },
  "DATEPUB": 1999
  },
  {
    "ISBN": "9782840825685",
    "LANG": "fr",
    "SUBJECT": "applications",
    "AUTHOR": [
      {
        "FIRSTNAME": "William J.",
        "LASTNAME": "Pardi"
      }
    ],
    "TITLE": "XML en Action",
    "TRANSLATED": {
      "PREFIX": "adapté de l'anglais par",
      "TRANSLATOR": {
        "FIRSTNAME": "James",
        "LASTNAME": "Guerin"
      }
    },
    "PUBLISHER": {
      "NAME": "Microsoft Press",
      "PLACE": "Paris"
    },
    "DATEPUB": 1999
  }
]

We can see this hierarchical JSON collection of objects this way:

                      |                  …                             |
            <BOOK:ISBN,LANG,SUBJECT>                     <BOOK:ISBN,LANG,SUBJECT>
        ______________|_______________          ________________________|__________________
       |        |         |          |          |            |         |        |         |
   <AUTHOR> <TITLE> <PUBLISHER> <DATEPUB>   <AUTHOR> <TRANSLATOR> <TITLE> <PUBLISHER> <DATEPUB>
    ____|____            ___|____          _____|_        ___|___            ___|____ 
   |    |    |          |        |        |       |      |       |          |        | 
<FIRST> | <LAST>     <NAME>   <PLACE>   <FIRST> <LAST> <FIRST> <LAST>     <NAME> <PLACE>
        |
     <AUTHOR> 
    ____|____   
   |         |
<FIRST>   <LAST>

The create table options allow specifying how the JSON file's properties will be mapped to table columns:

MariaDB > create table bibop1 (
     ISBN char(15),
     AUTHOR char(200) field_format='AUTHOR:*',
     PUBLISHER char(20) field_format='PUBLISHER:NAME')
     engine=CONNECT table_type=JSON file_name='/var/lib/mysql/json/biblio3.jsn';
 
MariaDB > select * from bibop1;
+-------------------------------------------------------------------------------------|
| ISBN          | AUTHOR                                            | PUBLISHER       |
+-------------------------------------------------------------------------------------|
| 9782212090819 | [{"FIRSTNAME":"Jean-Christophe","LASTNAME":"Bernadac"},
                   {"FIRSTNAME":"John","LASTNAME":"Knab"},
                   {"FIRSTNAME":"Charles","LASTNAME":"Dickens"}]
                                                                    | Eyrolles        |
| 9782840825685 | [{"FIRSTNAME":"William J.","LASTNAME":"Pardi"}]
                                                                    | Microsoft Press |
+-------------------------------------------------------------------------------------|

In that table, part of the JSON format is preserved as JSON text. This happens because we defined the AUTHOR column with field_format='AUTHOR:*'.

MariaDB > create table bibop2 (
     ISBN char(15),
     AUTHOR char(200) field_format='AUTHOR:[]',
     PUBLISHER char(20) field_format='PUBLISHER:NAME')
     engine=CONNECT table_type=JSON file_name='/var/lib/mysql/json/biblio3.jsn';

In this case the array is represented as comma-separated values (we could choose a different separator, such as the word 'AND'). This happens because we defined the AUTHOR column with field_format='AUTHOR:[]'.

MariaDB > select * from bibop2;
+---------------+------------------------------------------------------------------------|
| ISBN          | AUTHOR                                               | PUBLISHER       |
+---------------+------------------------------------------------------------------------|
| 9782212090819 | Jean-Christophe Bernadac, John Knab, Charles Dickens | Eyrolles        |
| 9782840825685 | William J. Pardi                                     | Microsoft Press |
+---------------+------------------------------------------------------------------------|

We can also choose to normalize the presentation, with one author name per row, by defining the AuthorFN column with field_format='AUTHOR:[X]:FIRSTNAME'.

MariaDB > create table bibop3 (
     ISBN char(15),
     AuthorFN char(128) field_format='AUTHOR:[X]:FIRSTNAME',
     AuthorLN char(128) field_format='AUTHOR:[X]:LASTNAME')
     engine=CONNECT table_type=JSON file_name='/var/lib/mysql/json/biblio3.jsn';
 
MariaDB > select * from bibop3;
+---------------+-----------------+----------+
| ISBN          | AuthorFN        | AuthorLN |
+---------------+-----------------+----------+
| 9782212090819 | Jean-Christophe | Bernadac |
| 9782212090819 | John            | Knab     |
| 9782212090819 | Charles         | Dickens  |
| 9782840825685 | William J.      | Pardi    |
+---------------+-----------------+----------+
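The row multiplication that [X] performs can be sketched in plain Python (illustration only, not CONNECT code): each JSON document contributes one output row per element of its AUTHOR array.

```python
# Plain-Python sketch of the [X] expansion performed by
# field_format='AUTHOR:[X]:FIRSTNAME': each document contributes
# one output row per element of its AUTHOR array.
books = [
    {"ISBN": "9782212090819",
     "AUTHOR": [{"FIRSTNAME": "Jean-Christophe", "LASTNAME": "Bernadac"},
                {"FIRSTNAME": "François", "LASTNAME": "Knab"}]},
    {"ISBN": "9782840825685",
     "AUTHOR": [{"FIRSTNAME": "William J.", "LASTNAME": "Pardi"}]},
]

rows = [(b["ISBN"], a["FIRSTNAME"], a["LASTNAME"])
        for b in books
        for a in b["AUTHOR"]]

for r in rows:
    print(r)
```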

All these CONNECT tables can exist simultaneously. They can be considered as views on the same underlying raw JSON file. With JSON, the CONNECT storage engine keeps the same principle it applies to other data sources: the data remains fully external.

Is MySQL going to challenge MongoDB with native JSON?

When looking at the Percona Live Santa Clara 2015 agenda I saw two intriguing talks:
- JSON support in MySQL 5.7
- Docstore: document database for MySQL at Facebook

Both talks are about supporting a native JSON type in MySQL! This sounds really great. I had never seen any blog post or GitHub repository around this subject, so this is a good surprise. Having in MySQL a native JSON datatype with a set of functions for handling JSON documents, with indexing capabilities, and with good underlying storage sounds very cool!

MongoDB and Couchbase, as document stores, have opened the road to JSON usage by a database. They built their databases on a JSON storage format. PostgreSQL has also introduced a native JSON type with associated functions.

But MongoDB's storage foundations were not that solid. Memory-mapped files were the critical piece of MongoDB's storage engine. Locking was done at the database level, and the atomicity of a commit was at the document level. All this limited the concurrency capabilities of the data store.
The NoSQL paradigm of MongoDB also means no join capabilities, which has led to highly denormalized data models. Excessive denormalization can hurt, as it incurs high space consumption. Efficient indexing required padding to avoid too much index-rebuilding work when data changed. The schema-less nature of the database, where the schema lives in the code, is problematic for application evolution.

Then came Tokutek, which built TokuMX to solve some of the flaws of MongoDB. The nice idea was to reuse the efficient storage capabilities of TokuDB: the storage was more solid, with true recoverability, transactions, excellent compression and online operations.

MongoDB, maybe inspired by what TokuMX achieved, has recently decided to change its product's architecture. MongoDB can now use the WiredTiger storage engine, which provides document-level locking and compression. But as we all know, changing the foundations of a house while you live in it is not that easy.

The MySQL approach is different. The foundations are very solid. The InnoDB transactional storage engine has huge qualities: high concurrency, reliability, safe crash recovery, online operations, reliable replication capabilities …

Building a document store on these foundations might lead to a good end result. We will have to wait for feedback from the field, and benchmarks need to be run to show how much one can expect from the MySQL/JSON approach. With the newly added JSON support in MySQL, you can combine the flexibility of NoSQL with the strength of a relational database. I am a big fan of this hybrid approach: mixing relational content with the flexibility of schemaless JSON might turn out to be very attractive from a developer's perspective.


From now on, GitHub is the place for new MariaDB development!

Today, while browsing through my emails, I was very happy to read this email from Sergei Golubchik:

Hi!
I'm happy to announce that MariaDB-10.1 tree has been completely migrated to github. Since now on
 we'll use github for the new development in MariaDB. It's https://github.com/MariaDB/server, 
go on, fork it, hack around, submit pull requests. Have fun!
Older trees (10.0, 5.5, 5.3, 5.2, 5.1) are not on github - we do hope to migrate 
them too eventually, but at the moment they are still on launchpad.
If you're a maria-captain on launchpad - for you to get write access to the mariadb repository 
on github you need to tell me your github account name.
Regards,
Sergei
P.S.: Don't forget that 10.1 is still *pre-alpha*. You've been warned.
_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp

So if you want to embark on the MariaDB adventure,
it is here: https://github.com/MariaDB/server

Go on,
Fork it,
Hack around,
Submit pull requests.
Have fun!

MaxScale now has its own public IRC channel

MaxScale is a proxy for the MySQL protocol built with a modular architecture.
The underlying concept of modules makes it possible to extend the MaxScale proxy services. The current version implements read/write splitting and connection load balancing. Internally, MySQL queries go through a SQL parsing phase, which gives MaxScale great capabilities regarding query routing.

So if you are interested in MaxScale, do not hesitate to enter the chat room #maxscale on irc.freenode.net. Its developers will be there. Bring your ideas and problems. You are welcome.

The MaxScale source code is on GitHub

There is a MaxScale Google group available

And a good starting point on how to install and configure MaxScale is Ivan Zoratti's blog post

MaxScale, ProxySQL and MySQL Proxy

At FOSDEM 2014, ProxySQL and MaxScale were both presented. Both are proxies that can help build sophisticated MariaDB/MySQL architectures.
But what is currently the most used proxy with MySQL? It is HAProxy. HAProxy is a layer 4 proxy that has no knowledge of the MySQL protocol. Being low level makes it very fast, but it cannot accomplish any advanced proxy task. In many cases this is enough for pure load balancing, but it cannot handle filtering, routing or query rewriting. Those tasks require understanding the MySQL protocol and looking at the queries passing through the proxy.

ProxySQL and MaxScale, contrary to HAProxy, are layer 7 proxies. This means that they both understand the MySQL network protocol and have access to the query payload.

At FOSDEM, ProxySQL was presented by its developer René Cannaò. ProxySQL has interesting features like a very efficient query cache, query rewriting capabilities and a request queuing mechanism for when the backend is not available. ProxySQL uses regular expressions to do query filtering and rewriting; it does not provide full query parsing.

MaxScale was presented by two of its developers, Massimiliano Pinto and Vilho Raatikka. MaxScale is a more ambitious project. It supports a modular architecture: the underlying concept of modules makes it possible to extend the MaxScale proxy services. The current version implements read/write splitting and load balancing. Internally, MySQL queries go through a SQL parsing phase, which gives MaxScale great capabilities regarding query routing.

Talking about proxy technology without mentioning MySQL Proxy would not be complete, and MySQL Proxy was indeed mentioned during the presentations. When MySQL Proxy was born it attracted a lot of users, as it was a very promising technology. Its extensibility through the Lua scripting language made it very flexible. MySQL Proxy does not embed a SQL parser; basic tokenization was done through Lua. Unfortunately, MySQL Proxy stayed forever in alpha status, with no further improvement and no visibility on its future.

So if you are interested in proxy technology and want to know more about MaxScale, do not miss tomorrow's webinar "MariaDB Enterprise and MaxScale – why all the buzz?" delivered by Ivan Zoratti.

To get the MaxScale sources.
To file bugs.

MariaDB CONNECT Storage Engine as an ETL (or ELT) ?

The MariaDB CONNECT Storage Engine allows access to heterogeneous data sources. In my previous post I showed you how to use the MariaDB CONNECT Storage Engine to access an Oracle database. This is quite easy through the CONNECT Storage Engine ODBC table type.

For most architectures where heterogeneous databases are involved, an ETL (Extract-Transform-Load) tool is used to move data across the databases. Most ETL tools, like Informatica, DataStage, Talend, GoldenGate or Tungsten, perform complex data transformations using middle-tier ETL engines. They require describing the transformation process in a proprietary manner. This approach requires an extra product with extra expertise, and data is generally duplicated when it is moved around.

In many cases this extra product and expertise requirement can be avoided. Data can be moved directly from database to database, and transformations can be applied directly inside the source or target database through standard SQL commands.
This concept was successfully brought to the market by products like Sunopsis (bought by Oracle). The approach is known as ELT (Extract-Load-Transform): data is moved directly from database to database, and the database itself is used to transform data through standard SQL commands. This makes sense, as some databases have very strong data transformation capabilities through powerful SQL extensions (analytical SQL) and parallel execution capabilities.

With the MariaDB CONNECT Storage Engine, linking the two databases directly through an ODBC connection makes it possible to implement this ELT pattern. Doing data transformation at the database level is very beneficial: all the advanced features of the source or target databases can be used.

In our case, transformations will be applied both at the source and at the target database. Let us consider the following ODBC table created with the MariaDB CONNECT Storage Engine:

create Table lineitem2   ENGINE=CONNECT TABLE_TYPE=ODBC
SRCDEF='select l_suppkey, sum(l_quantity) qt from dbt3.lineitem3 group by l_suppkey'
CONNECTION='DSN=orcl;UID=scott;PWD=manager1';

We can now query this table. All the aggregation specified in the SRCDEF argument will be done on the remote Oracle RDBMS node and not on the MariaDB node. Some extra computation can then be done on the local MariaDB table. Here we compute the sum of the 10 biggest aggregated quantities:

MariaDB [dbt3]> select sum(qt) from (select qt  from lineitem2  order by qt DESC limit 10) as derivedT;
+-----------+
| sum(qt)   |
+-----------+
|     31444 |
+-----------+

This example is very simplistic, but it gives you an idea of what can be achieved with this pattern. So if you want to move data from an Oracle (or any ODBC-accessible) database to MariaDB, you should give the MariaDB CONNECT storage engine a try.

We already have a few customers using the MariaDB CONNECT Storage Engine to simplify their data integration processes (currently with Oracle and Sybase databases).

If you come to the FOSDEM 2014 MySQL devroom on Sunday, Feb 2nd, I will present and will be available to answer your questions. If you are there, do not hesitate to register for the MySQL Community dinner on Saturday, Feb 1st.

Slides on the MariaDB CONNECT Storage Engine

Let us know about your tests. You just need to download MariaDB 10.0.7, which includes the CONNECT storage engine. Give us your feedback: positive, negative, improvement requests, bugs, documentation problems, …

Download the latest MariaDB 10.0.7 Beta release
CONNECT Storage Engine Documentation

MariaDB CONNECT Storage Engine and non-MySQL syntax selects

When you pass a SQL command through MySQL, it has to comply with MySQL syntax, as it has to go through the MySQL parser. Can we pass selects that are not MySQL-syntax compatible to a target ODBC database? The MariaDB CONNECT Storage Engine allows access through ODBC to heterogeneous data sources like Oracle or Microsoft SQL Server.
In my previous post I showed you how to use the CONNECT Storage Engine ODBC table type to access an Oracle database. It was possible to run select commands against these remote ODBC tables. In many cases it would be very interesting to run the SQL command directly on the target database.
The first reason is efficiency: this avoids a lot of network round trips between the Oracle server and the MariaDB server. The second reason is to avoid the restriction to MySQL-compliant syntax.

With the MariaDB CONNECT Storage Engine it is possible to define a remote ODBC table and to associate a SQL request with this table definition. This SQL request will be executed on the remote server. For a remote Oracle database, for example, it is possible to pass a request using proprietary syntax. Suppose we want to use Oracle's proprietary "CONNECT BY" syntax, which does not exist in MySQL; we do it this way:

create table emp_hierarchy  
ENGINE=CONNECT 
TABLE_TYPE=ODBC tabname='EMP' CONNECTION='DSN=orcl;UID=scott;PWD=manager1' 
srcdef='SELECT empno, ename, mgr, LEVEL FROM emp CONNECT BY PRIOR empno = mgr;';

The table definition includes a srcdef argument that defines the SQL request that will be executed on the Oracle target database. When we query this table on MariaDB, everything is in fact done on the Oracle server.

MariaDB [test]> select * from emp_hierarchy;
+-------+--------+------+-------+
| EMPNO | ENAME  | MGR  | LEVEL |
+-------+--------+------+-------+
|  7788 | SCOTT  | 7566 |     1 |
|  7876 | ADAMS  | 7788 |     2 |
…
|  7839 | KING   | NULL |     1 |
…
|  7521 | WARD   | 7698 |     3 |
+-------+--------+------+-------+
43 rows in set (2.54 sec)
MariaDB [test]>
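To see what happens on the Oracle side: CONNECT BY PRIOR empno = mgr walks the manager-to-employee edges, assigning each visited row a LEVEL equal to its depth, and without a START WITH clause every row acts as a root, which is why 14 employees can yield 43 rows. A rough plain-Python sketch of that traversal (the sample rows are illustrative, not the full EMP table):

```python
# Rough sketch of Oracle's CONNECT BY PRIOR empno = mgr traversal:
# from each starting row, descend the manager -> employee edges,
# assigning LEVEL = depth. Sample rows are illustrative only.
emp = [
    (7839, "KING",  None),   # (empno, ename, mgr)
    (7566, "JONES", 7839),
    (7788, "SCOTT", 7566),
    (7876, "ADAMS", 7788),
]

def hierarchy(rows):
    children = {}
    for empno, ename, mgr in rows:
        children.setdefault(mgr, []).append((empno, ename))
    out = []
    def walk(empno, ename, level):
        out.append((empno, ename, level))
        for c_no, c_name in children.get(empno, []):
            walk(c_no, c_name, level + 1)
    # without a START WITH clause, every row acts as a root
    for empno, ename, _ in rows:
        walk(empno, ename, 1)
    return out

for row in hierarchy(emp):
    print(row)
```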

Beyond this basic example, this opens the door to executing any select on the target database. It allows us, for example, to benefit from Oracle's in-database SQL analytics, which include a lot of powerful functions (ranking, windowing, reporting aggregates, LAG/LEAD, FIRST/LAST, inverse percentile, hypothetical rank and distribution, pattern matching, modeling, advanced aggregations…). The same idea applies to powerful spatial or full-text capabilities.
And do not forget that with the MariaDB CONNECT Storage Engine you can aggregate many of these tables located on different servers so that they are seen as a single table. For that you use the CONNECT TBL table type, which also gives the benefit of parallel execution on the remote servers.

What would be great would be for someone to write a MySQL stored procedure wrapper that transparently creates the CONNECT table to run the query against (something like "callRemoteSQL") ;-)

MariaDB CONNECT Storage Engine access to Oracle 11GR2

The MariaDB CONNECT Storage Engine allows access to heterogeneous data sources. This includes various file formats, but it also includes ODBC-accessible data sources.
The CONNECT Storage Engine ODBC table type allows access to SQLite, Excel, SQL Server or Oracle databases. Some nice features of the ODBC CONNECT table type are:
- Auto discovery of the table structure: you do not need to specify the columns of the target table.
- Condition push-down: an optimization that pushes filtering conditions down to the target database, which can avoid a lot of network round trips.

Let us have a look and test Oracle access through the CONNECT Storage Engine. I downloaded, unzipped and installed Oracle 11gR2 in a VirtualBox VM.
Let us start Oracle 11gR2:

[oracle@centos1 ~]$ sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on Fri Nov 29 15:15:59 2013
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
SQL> connect sys/manager1 as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  617975808 bytes
Fixed Size		    2215864 bytes
Variable Size		  377487432 bytes
Database Buffers	  230686720 bytes
Redo Buffers		    7585792 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Now let us start the Oracle listener (this concept does not exist in MySQL):

[oracle@centos1 ~]$ lsnrctl start
LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 29-NOV-2013 20:08:20
Copyright (c) 1991, 2009, Oracle.  All rights reserved.
Starting /u01/app/oracle/product/11.2.0/dbhome_1/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 11.2.0.1.0 - Production
System parameter file is /u01/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/diag/tnslsnr/centos1/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=centos1)(PORT=1521)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.1.0 - Production
Start Date                29-NOV-2013 20:08:20
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/centos1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=centos1)(PORT=1521)))
Services Summary...
Service "orcl" has 1 instance(s).
  Instance "orcl", status UNKNOWN, has 1 handler(s) for this service...
Service "plsextproc" has 1 instance(s).
  Instance "plsextproc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@centos1 ~]$

On the machine with the MariaDB CONNECT Storage Engine we need to install the Oracle ODBC driver. We obtain it through the installation of the Oracle Instant Client. Let us download and unzip it; installation is just unzipping the Instant Client zip files:

unzip instantclient-basic-linux.x64-12.1.0.1.0.zip
unzip instantclient-odbc-linux.x64-12.1.0.1.0.zip
unzip instantclient-sdk-linux.x64-12.1.0.1.0.zip

We need to add the path of the ODBC driver shared library. Let us 'vi .bash_profile' and add the required LD_LIBRARY_PATH:

CLIENT_HOME=/home/oraclec/instantclient_12_1 ; export CLIENT_HOME
LD_LIBRARY_PATH=$CLIENT_HOME:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH

On Linux we also need to install the unixODBC open source ODBC driver manager:

yum install unixODBC
yum install unixODBC-devel

The configuration of the unixODBC driver manager is simple and just requires updating two files.

/etc/odbcinst.ini, which points the driver name to the installed shared library:

[Oracle 12c ODBC driver]
Description     = Oracle ODBC driver for Oracle 12c
Driver          = /home/oraclec/instantclient_12_1/libsqora.so.12.1

/etc/odbc.ini, which defines the DSN that will be used for the ODBC connection:

[orcl]
Driver       = Oracle 12c ODBC driver
ServerName   = //centos1.localdomain:1521/oracle
DSN          = orcl
UserName = scott
Password = manager1

How do we test that unixODBC plus the Oracle ODBC driver is correctly installed?
unixODBC includes an ODBC client tool, 'isql'. Let us use it to test our installation and the DSN we just created:

root@centos1 ~]# isql orcl scott manager1
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select * from emp;
+-------+-----------+----------+-------+--------------------+----------
| EMPNO | ENAME     | JOB      | MGR   | HIREDATE           | SAL
+-------+-----------+----------+-------+--------------------+----------
| 7369  | SMITH     | CLERK    | 7902  | 1980-12-17 00:00:00| 800
…
+-------+-----------+----------+-------+--------------------+----------
SQLRowCount returns -1
15 rows fetched
SQL>

Now let us create a CONNECT table of ODBC type. We will make this MariaDB ODBC table point to the Oracle 11g 'scott.emp' table.

[root@centos1 ~]# mysql -u root -pmanager1 -S /usr/local/mariadb10data/mariadb10.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.6-MariaDB-log MariaDB Server
Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
MariaDB [test]> create table emp  ENGINE=CONNECT TABLE_TYPE=ODBC tabname='emp' CONNECTION='DSN=orcl;UID=scott;PWD=manager1';
ERROR 1105 (HY000): Unsupported table type ODBC

Come on! What happened? I just installed the latest 10.0.6 MariaDB, and the CONNECT storage engine does not support the ODBC table type. That is true, and it is an issue caused by an old machine, lacking unixODBC, being used to build the binary. This problem is not present in the MariaDB 10.0.6 rpm packages. But I have multiple tar-based MariaDB versions installed on my CentOS box (in fact a VirtualBox VM) and I do not want any rpm-based installation. Fortunately, a storage engine is just a shared library.

Let us download MariaDB-10.0.6-centos6-x86_64-connect-engine.rpm and unpack the rpm without installing it:

[root@centos1 ~]# rpm2cpio ../MariaDB-10.0.6-centos6-x86_64-connect-engine.rpm | cpio --extract  --make-directories
8216 blocks
[root@centos1 plugin]# cp ./usr/lib64/mysql/plugin/ha_connect.so /usr/local/mariadb10/lib/plugin/

We now have the CONNECT storage engine with support for the ODBC table type. Our second try:

[root@centos1 local]# mysql -u root -pmanager1 -S /usr/local/mariadb10data/mariadb10.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.0.6-MariaDB-log MariaDB Server
Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> use test
Database changed
MariaDB [test]> create table toto  ENGINE=CONNECT TABLE_TYPE=ODBC tabname='emp' CONNECTION='DSN=orcl;UID=scott;PWD=manager1';
ERROR 1105 (HY000): [unixODBC][Driver Manager]Can't open lib '/home/oraclec/instantclient_12_1/libsqora.so.12.1' : file not found

What happened this time? I had forgotten that the mysql service runs under the mysql Linux user. This user also needs access to the Oracle ODBC driver (the libsqora.so.12.1 shared library provided by the Oracle Instant Client).

CLIENT_HOME=/home/oraclec/instantclient_12_1 ; export CLIENT_HOME
LD_LIBRARY_PATH=$CLIENT_HOME:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH
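Since the mysql user's ‘.bash_profile’ is not necessarily sourced when the service starts, an alternative (sketched here, assuming root access and a standard glibc setup) is to register the Instant Client directory system-wide through the dynamic linker configuration:

```shell
# Register the Instant Client directory in the dynamic linker configuration
# so that any process, including mysqld, can resolve libsqora.so.12.1.
echo /home/oraclec/instantclient_12_1 > /etc/ld.so.conf.d/oracle-instantclient.conf
# Rebuild the linker cache so the new directory is taken into account
ldconfig
```

With this in place, no per-user LD_LIBRARY_PATH tweaking is needed for the server process.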

Once done, everything goes better:

MariaDB [test]> create table toto  ENGINE=CONNECT TABLE_TYPE=ODBC tabname='EMP' CONNECTION='DSN=orcl;UID=scott;PWD=manager1';
Query OK, 0 rows affected (0.79 sec)
MariaDB [test]>
MariaDB [test]> select * from toto;
+-------+--------+-----------+------+------------+---------+---------+--------+
| EMPNO | ENAME  | JOB       | MGR  | HIREDATE   | SAL     | COMM    | DEPTNO |
+-------+--------+-----------+------+------------+---------+---------+--------+
|  7369 | SMITH  | CLERK     | 7902 | 1980-12-17 |  800.00 |    NULL |     20 |
…
|  7934 | MILLER | CLERK     | 7782 | 1982-01-23 | 1300.00 |    NULL |     10 |
+-------+--------+-----------+------+------------+---------+---------+--------+
14 rows in set (0.09 sec)
 
MariaDB [test]> insert into toto values ( 8000,'serge','peintre',7902,'1980-12-17',123,234,20);
Query OK, 1 row affected (0.15 sec)
 
MariaDB [test]> select * from toto;
+-------+--------+-----------+------+------------+---------+---------+--------+
| EMPNO | ENAME  | JOB       | MGR  | HIREDATE   | SAL     | COMM    | DEPTNO |
+-------+--------+-----------+------+------------+---------+---------+--------+
|  8000 | serge  | peintre   | 7902 | 1980-12-17 |  123.00 |  234.00 |     20 |
|  7369 | SMITH  | CLERK     | 7902 | 1980-12-17 |  800.00 |    NULL |     20 |
…
|  7934 | MILLER | CLERK     | 7782 | 1982-01-23 | 1300.00 |    NULL |     10 |
+-------+--------+-----------+------+------------+---------+---------+--------+
15 rows in set (0.03 sec)

That looks nice. I have created an ODBC table pointing to a remote Oracle 11g table. That was quite simple, as the auto discovery mechanism automatically discovered the structure of the Oracle table and converted the Oracle data types to compatible MySQL column types.
Let us try to update Oracle data through MySQL:

MariaDB [test]> update toto set sal=999 where empno=8000;
ERROR 1296 (HY000): Got error 174 'No DELETE/UPDATE of ODBC tables' from CONNECT

This is not supported! That is right. The ODBC table update feature has been developed and is currently only available in the launchpad repository of the CONNECT Storage Engine:
CONNECT Storage Engine Sources on launchpad

MariaDB CONNECT Storage Engine replay & slides available

The slides and replay of yesterday’s webinar on the MariaDB CONNECT storage engine have just been posted. First, I want to thank the numerous attendees.
You have shown great interest in the parallel execution of queries on distributed MySQL servers. I agree, this is cool.
The ODBC capabilities also seem to generate interest. They make it simple to access an ODBC data source (SQL Server, Oracle …) from plain MySQL syntax.

Here to view the replay
Here to get the slides

Let us know about your tests. You just need to download MariaDB 10.0.5, which includes the CONNECT storage engine. Give us your feedback: positive, negative, improvement requests, bugs, documentation problems, …

Here to download the latest MariaDB 10.0.5 Beta release
Here to get the CONNECT Storage Engine Documentation