Skip to content

In my previous post "InnoDB : Why not use the same isolation level as ORACLE" I suggested that like ORACLE it might now be a good idea to use READ COMMITTED as the default isolation level for InnoDB.

Marc Callaghan made interesting measures that illustrate a case where READ COMMITTED mode degraded performance.
("Repeatable read versus read committed for InnoDB ")

Can we draw conclusions from what has been measured ?

Let us look in more detail to what has been measured. The workload is generated by sysbench :
...continue reading "InnoDB : Any real performance improvement when using READ COMMITED isolation level ?"

By default InnoDB uses REPEATABLE READ as its isolation level. So this is the isolation level used with innoDB by almost all MySQL users.

The default isolation level of Oracle is READ COMMITTED. READ COMMITTED is the mode widely used by Oracle users. This mode incurs less penalty on the server scalability by allowing to support more concurrency.

The reason why InnoDB use REPEATABLE READ as its default is historical. This is a related to the way MySQL replication was developed . MySQL replication until 5.1 functioned with a statement based replication mechanism. This means that statements that occurs on the master server are replayed on the slave server. The statement base replication mode does not permit to use the READ COMMITTED isolation level. In that case replication will not guaranty consistency between the slave and the master.

The MySQL widely used statement based replication has some major drawbacks. The first one is that to guaranty the same effect of statements on slave as on master innoDB need to be careful when generating id through the autoincrement mechanism. The second impact is that InnoDB also needs to avoid phantoms to guaranty the same effect of a statement on the master and slave.

To generate consecutive autoincrement id during a multi rows insert a table level lock is necessary.

To handle phantom issue InnoDB has implemented a mechanism that consist in locking gaps.

UPDATE ... WHERE or a DELETE ... FROM ... WHERE requires InnoDB to set an exclusive next-key lock on every record the search encounters. This can dramatically impact concurrency on the server by forbidding inserts into the gaps.
INSERT ... SELECT will put a shared lock on all the selected rows. This basically prevent any update on the selected rows. This also seriously impact concurrency. This case is very common when reports are run on an OLTP server.
SELECT ... FROM ... FOR UPDATE sets exclusive next-key locks on all index records the search encounters and also on the corresponding clustered index records if a secondary index is used in the search.
SELECT ... FROM ... LOCK IN SHARE MODE sets shared next-key locks on all index records the search encounters

Oracle does not behave that way. The Oracle replication mechanism is a row based mechanism. None of the blocking locks mentioned above with InnoDB will happen with Oracle.

Can we have with MySQL the same behavior as Oracle users ?

Yes we can ! Since version 5.1 MySQL offers a row based replication mode closed to what ORACLE is doing. This basically allow to run MySQL in READ COMMITTED isolation level. This can really benefit to MySQL performance and scalability. To achieve that it is mandatory to use Row based replication. This suppress most restrictions related to MySQL statement based replication. The READ COMMITTED isolation level relax most of the locking issues impaired by the REPEATABLE READ MODE. This seems anyway to be used rarely by MySQL 5.1 users !

I have no figures to prove performance improvement impacts of READ COMMITED isolation level with InnoDB,
so you should give it a try ! 🙂

My colleagues Luca Olivari and Anders Karlsson mentioned the importance of an ecosystem, open source or not.
I could not agree more.

One of the fastest growing open source ecosystem is currently the Android ecosystem.
Android also has a market place. Semantically, by the way there is a difference between a store (like Apple App Store) and a market 🙂
The Mobile O/S Percentage Gain for Nov/Dec is : +54,8% for Android, +20.1% for iPhone, and +19.0% for Symbian.
So Open Source definitely Rocks ... and as for benchmarks it is always better to have numbers to prove it.
The access to the Internet through smart phones is now 1,3% of all the accesses when it was 0.57 ten month ago, which might lead to an annual growth of more than 180%.
And if you can NOT view Flash-based pages with an iPhone on Android you can (almost, promised for first half 2010)!

On the Android, applications are developed in Java ans run through a specific Virtual Machine. I heard our jdbc driver does not work on Android because of the partial implementation of J2SE on Android. Anyone tried ? I know it does not make much sense to do client/server from a phone to a remote database.
Android uses Sqllite as its embedded database which is nice. But when it comes to central servers main usage is MySQL.
This move toward smart phone access to the Internet is certainly going to dramatically increase the use of relational databases.

All the numbers come from http://marketshare.hitslink.com/report.aspx?qprid=61&sample=13

ezNcrypt offers a table level transparent data encryption solution for MySQL. This technology is purely declarative which mean you declare tables or database you want to encrypt. You then have nohing more to care about. What is nice with ezncrypt is that only the mysqld process can encrypt/decrypt the data. A set o UDF functions have been added to handle that. The key management allows to store the key locally or remotely.

Critotech trough ezNcrypt offers a table level transparent data encryption solution for MySQL.

This linux technology is purely declarative which mean you declare tables or database you want to encrypt.
You then have nothing more to care about.

What is nice with ezncrypt is that only the mysqld process can encrypt/decrypt the data. A set o UDF functions have been added to handle that. The key management allows to store the key locally or remotely.
Using table level transparent data encryption can be an interesting solution for many websites. Encrypting a few tables is often enough and less costly than encrypting the whole database. Most of the time partial encryption makes the whole data totally unusable. Many websites are hosted and data asset protection is a key worry.

Of course there exists many other solutions like file system level encryption or trigger based encryption.
File system level encryption put more burden on the CPU as all data is encrypted. A trigger based encryption solution is quite intrusive and does not solve all security issues.
When implementing encryption it is mandatory to also care about encryption of innodb recovery logs and and mysql binary logs.
It is also necessary to be careful about temporary tables pushed to disk. Swap space encryption should also be taken into account.

Having tested ezncrypt I must confess it is quite simple to setup. It still needs some improvement to be more flexible when dealing with complex configuration : multiple BASEDIR and DATADIR for people having multiple versions and instances of MySQL on the same platform.

Example of database encryption : encrypting the eztest database

[root@serge mysql]# db_encrypt eztest
ezncrypt | Checking system dependencies
mysql | getting information about database file location
...........|> using /var/lib/mysql
...........| Please provide a MySQL username & password
...........| Enter username: root
...........| Enter password:
...........| Looking if database exists
...........| done!
keymgr | Retrieving passphrase from KSS
...........|> Encryption password retrieved from KSS
...........| generating keys
...........| done!
ezncrypt | checking database encryption status
...........| done!
ezncrypt | checking disk space
...........| done!
...........| WARNING: MySQL will be stopped while encrypting data. Continue? (Y/n)
backup | backing up data
...........| This can take a while. Please be patient
...........| Executing /opt/mysql/bin/mysqlhotcopy to backup 'eztest'
...........|> /backup/2009-12-23/eztest
...........| done!
mysql | stopping mysql service
...........| done!
ezncrypt | begin database protection
...........| moving data to encryption directory
...........| This can take a while. Please be patient
...........|> /var/lib/ezncrypt/ezncrypted/mysql/eztest
...........| done!
mysql | starting mysql service
...........| done!
mysql | adding key to mysql process
...........| done!
ezncrypt | congratulations. you have a database ready for ez-encryption!
...........Log Information: /var/log/ezncrypt/db_encrypt.log

ezNcrypth is based on very standard linux technology like ecryptfs

What more frustrating than full table scan wiping out the buffer pool. You have a nicely tuned busy OLTP server with a warm buffer pool containing the current working set. Then someone submit a report needing to access a table through a full table scan. The normal and current MySQL behavior is to wipe out the content of the cache. if the table is never reused this is pure loss. The server will have to go through a full new warm up phase that can last quite long with a big buffer pool. This issue is now solved with MySQL 5.1.41.

What more frustrating than full table scan wiping out the buffer pool. You have a nicely tuned busy OLTP server with a warm buffer pool containing the current working set. Then someone submit a report needing to access a table through a full table scan. The normal and current MySQL behavior is to wipe out the content of the cache. if the table is never reused this is pure loss. The server will have to go through a full new warm up phase that can last quite long with a big buffer pool. This issue is now solved with MySQL 5.1.41.

The latest innodb plugin 1.0.5 included in MySQL 5.1.41 introduced the settable global variables innodb_old_blocks_pct and innodb_old_blocks_time. These two variables allow to control the buffer pool eviction policy. This makes it possible to tune the buffer pool LRU eviction policy to be more resistant against full scans. This 1.0.5 innoDB plugin is the latest version of the innodb plugin. It is included in MySQL 5.1.41 and more up to date than the plugin downloadable from innobase.
To test this new functionality we first have modified our .cnf file. This is necessary to enable the InnoDB plugin and the InnoDB Information Schema tables implemented in ha_innodb.so. Currently this is not done by default in MySQL 5.1.41 which by default use an old embedded version of innoDB not including all the fancy functionalities of the plugin. We have to wait for MySQL 5.4 to have the embedded innoDB use the new innoDB plugin code.

my.cnf :

ignore_builtin_innodb
plugin-load=innodb=ha_innodb.so;innodb_trx=ha_innodb.so;
innodb_locks=ha_innodb.so;innodb_lock_waits=ha_innodb.so;
innodb_cmp=ha_innodb.so;innodb_cmp_reset=ha_innodb.so;
innodb_cmpmem=ha_innodb.so;innodb_cmpmem_reset=ha_innodb.so

Let us consider a MySQL instance with a rather small buffer pool :

innodb_buffer_pool_size = 20M
innodb_flush_method = O_DIRECT # this is to avoid filesystem cache
innodb_file_per_table

First we issue a request for which all data fit in the buffer pool(dbt2 9M new_order table) :

mysql> select sum(no_w_id) from new_order;
+--------------+
| sum(no_w_id) |
+--------------+
| 54000 |
+--------------+
1 row in set (0.18 sec)

We reissue it and now benefit from the buffer pool :

mysql> select sum(no_w_id) from new_order;
...
1 row in set (0.02 sec)

We then issue a query that do a full table scan on a quite big table (dbt2 125M stock table) that does not fit in the buffer pool. To allow a full scan to fill the buffer pool we set :

set global innodb_old_blocks_time=0;
set global innodb_old_blocks_pct=90;

mysql> select sum(s_quantity) from stock where s_data like 'a%';
+-----------------+
| sum(s_quantity) |
+-----------------+
|769958 |
+-----------------+
1 row in set (4.58 sec)

We then reissue the initial query which gets slow as the data has been wiped out of the buffer pool :

mysql> select sum(no_w_id) from new_order;
...
1 row in set (0.20 sec) <=========== back to slow time

The second run of course is faster as the data is back in the cache :

mysql> select sum(no_w_id) from new_order;
...
1 row in set (0.03 sec) <=========== back fast time

If we want to keep data in the buffer pool and avoid full scan to wipe out the buffer pool we set :

set global innodb_old_blocks_time=5;
set global innodb_old_blocks_pct=5;

When we issue the small query it is fast :

mysql> select sum(no_w_id) from new_order;
...
1 row in set (0.03 sec) <========== fast

We then issue the request that does a full table scan and that previously wiped out the small table out of the buffer pool :

mysql> select sum(s_quantity) from stock where s_data like 'a%';
...
1 row in set (4.34 sec)

We then reissue the initial small query and it remains fast. The data has not been wiped out of the buffer pool :

mysql> select sum(no_w_id) from new_order;
...
1 row in set (0.02 sec) <=========== REMAINS FAST / DATA NOT WIPED OUT OF INNODB BUFFER POOL:-)

How does it now works ?

InnoDB Plugin 1.0.5 does now provides two system variables that enable LRU algorithm tuning:
The buffer pool is divided into new blocks and old blocks.
innodb_old_blocks_pct : Specifies the approximate percentage of the buffer pool used for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool).
innodb_old_blocks_time : Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. The default value is 0: A block inserted into the old sublist moves immediately to the new sublist the first time it is accessed, no matter how soon after insertion the access occurs. If the value is greater than 0, blocks remain in the old sublist until an access occurs at least that many ms after the first access. For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.

Beware that giving a too high value could also be inefficient as the block will certainly be wiped out of the old sublist before having a chance to enter the new sublist. There is an equilibrium to find between a cache that get wiped out and one that never change ! This needs some monitoring.

How to monitor ?

To monitor and get a better insight of what happens in the buffer pool the show innodb status command will provide some extra info :

Database pages 1279
Old database pages 62

Pages made young 39508, not young 185363
121.26 youngs/s, 2942.48 non-youngs/s
Pages read 64769, created 0, written 1
467.03 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 889 / 1000, young-making rate 29 / 1000 not 704 / 1000

Pages made young is the total number of old pages that have been made young.
youngs/s is the rate at which page accesses to the old pages have resulted in making such pages young since the last invocation of the command.
young-making rate provides the same rate but in terms of overall buffer cache accesses instead of accesses just to the old pages.
These information give an idea at what rhythm the bufferpool is renewed and help tune innodb_old_blocks_pct and innodb_old_blocks_time to reach a target goal.

As noticed by Mark Callaghan this functionality is a nice feature to have when you occasionally use mysqldump to copy a table from a busy OLTP server.

InnoDB plugin definitely gets better and better. It is a good reason to upgrade to MySQL Server 5.1.41 and activate the InnoDDB plugin.

France FORUM PHP 2009 was held last week in Paris (12th and 13th nov) at Cite des sciences et de l'industrie Thanks to LeMuG (the french MySQL User Group) MySQL has had a nice set of presentations : Michael Widenius aka Monty ( Monty Program Ab ) presented "MariaDB, the future of MySQL". Johannes Schluter (MySQL connectors dev team + release manager of PHP 5.3) has presented "PHP and MySQL - A good match" This was the closing presentation for the 2 days but audience stayed to hear about the nice functionalities of the mysqlnd driver. A lot of interest for the new features (some experimental) : persistent connections, performance statistics, asynchronous request, client side cache, streams + filter, statistics collection, PDO + mysqlnd good match.

France FORUM PHP 2009 was held last week in Paris (12th and 13th nov) at Cite des sciences et de l'industrie
Thanks to LeMuG (the french MySQL User Group) MySQL has had a nice set of presentations :

Johannes Schluter (MySQL connectors dev team + release manager of PHP 5.3) has presented "PHP and MySQL - A good match"
This was the closing presentation for the 2 days but audience stayed to hear about the nice functionalities of the mysqlnd driver. A lot of interest for the new features (some experimental) : persistent connections, performance statistics, asynchronous request, client side cache, streams + filter, statistics collection, PDO + mysqlnd good match.

Michael Widenius aka Monty ( Monty Program Ab ) presented "MariaDB, the future of MySQL".

Slattery Zoe spoke about "TestFest - a new model for open source development ? "
Couzy Pierre (Microsoft France) presented "Best support, Cloud, and Open Source : status of PHP and Microsoft"
I presented MySQL and PHP evolution ( mysqli / myslnd and all the tips and tricks to develop in PHP+MySQL )
Olivier Dasini (LeMug) presented a "Return of experience for MySQL and PHP at Orange Business Services"
Seguy Damien (Alter Way Consulting) talked about "The future of LAMP"
Raphael Rougeron (Linagora) presented "RESTful web Services in PHP"
Noel Olivier & Bostral Ludovic (M6) presented "Migration from J2EE technology to PHP at M6Web"
Patrick Allaert ( eZ Systems) / Davide Mendolia (Ausy) presented "PHP applications monitoring: APM and Pinba"
Ganem Charles and Julien Zamor (Digiteka) presented "Ultimedia & JukeBo 2.0"
Hoareau Olivier ( PHPPRO) & Damien Seguy (Alter Way Consulting) presented "Yes ! PHP is industrial !"

...

These 2 days were very interactive with a nice part for MySQL. In particular the usage of frameworks like Zend Framework or symfony in conjunction with MySQL was positioned to produce more industrial and maintainable PHP+MySQL code. Long live the LAMP stack !

Other presentations : full agenda