Category: mysql

FOSDEM 2013 : All the MySQL ecosystem together

SergeFebruary 6, 2013Leave a comment

This was nice to see all the MySQL ecosystem in the same very crowded room :
Oracle, Percona, MariaDB, SkySQL. We definitely need this kind of open event. MySQL Connect (owned by Oracle) or Percona Live (owned by Percona) are 2 great events but they do not gather the same variety of speakers and attendees. Thanks to the FOSDEM MySQL Dev room review committee the agenda reflected this openness.

The "MySQL and friends dev room" was crowded which was for me a surprise as I expected most people to be at NoSQL or Cloud topics rooms ;-). This reflect the strong attraction of MySQL with its constant innovation momentum. The challenge for presenters was the short 30 mns format. This has a big advantage that speakers have to get rid of marketing bullshit and concentrate on key points.

Collin Charles gave a pres on MHA solution. It nice to hear that it is a very active project. It is a proven technology used at Facebook. It allows very versatile usage from a manual switch-over to fully automatic failover. It can be combine with a clusterware like pacemaker. This is a technology we fully support at SkySQL and that we have deployed successfully at many customers.

Peter Boros gave a very interesting talk on an ongoing project at the Groupon company. The solution address the automatic warmup of a slave server during binlog apply. it is mix of a tool that read and stream the slow query log with the percona playback tool to warmup slave. I definitely would like to know more. I really like when people address an old problem with a new creative approach. The part streaming the slow query log should be open sourced soon.

Lars Thalmann presented a full picture of Oracle innovation around MySQL. There was a nice question about why Oracle which has released a migration toolkit does not address Oracle migration. Lars was surprised too 🙂 But I think it is fair to leave this part be developed by the community ;-). Lars presented all the progress made around windows platform. They are impressive. This is a good point as Windows remain important as the entry point for adoption, development and testing.

Luis Soares who manage the replication at Oracle described in detailed the new features of 5.6. He gave a workaround on how to implement multi master replication.
MariaDB will offer natively this feature (Thanks to TAOBAO contribution).

Stephane Combaudon had the task to chose the 5 tools he considered the most useful in the percona toolkit. The nominees were :
pt-query-digest, pt-online-schema-change, pt-table-checksum, pt-stalk, pt-archiver
Stephan insisted on the fact that it is really important to understand how the tools work to avoid damaging mistakes. This is in particular true for pt-archiver which is a nice tool to purge old data in MySQL.

Maciej Dobzranski gave a talk on MySQL and security. Tis subject is a hot one. The key take away is "MySQL is not secure out of the box! Many users just leave it at that.". A lot of tips and valuable examples of security challenges faced with MySQL.

When I read the schedule I saw there was no pause for lunch for the MySQL Dev Room ! Unacceptable for a french guy 😉 ! I saw Justin Swanhart pres title "Divide and conquer in the cloud" I thought I might use this slot to go as I expected classic marketing about how the cloud is going to save our life. I bought my sandwich and run back to the MySQL Dev Room as I wanted to have for my money (kidding : FOSDEM is free with no need for registration 😉 and the beer is cheap).

Justin Swanhart was presenting "Shard-Query" which is a framework that add parallelism and improve query performance based on SQL language constructs. This is a great tool. Some nice ideas mentioned like using Shard-Key-Mapper in conjunction with mysqlnd plugins (PHP connector) . Justin gave example of query speedup for a SUM aggregate. He mention that MEAN and STDDEV were not candidate to parallelization. I think it can be parallelized in a multi pass approach. I will blog on it later.

Other interesting pres by Andrew Morgan, Sveta Smirnova, Oystein Grovlen, Giuseppe Maxia, Henrik Ingo, Raghavendra Prabhu.

I am sure that next year FOSDEM MySQL Dev Room will have more people as this is a really great event. MySQL remain a hot topic with a lot of innovation to come.
FOSDEM is definitely the right place for all the MySQL community to gather !

Next week MySQL Innovation Day has a great agenda

SergeMay 31, 2012Leave a comment

MySQL developers will cover upcoming features. The MySQL engineering team is driving MySQL forward and users feedback is welcomed. For those having tested the Development Milestone releases this is a good opportunity to share your feedback and ideas.

The agenda includes :

InnoDB online operations

Replication :
Global transaction IDs for replication and failover
Multi-Threaded Slaves and Group Commit

NoSQL interfaces to InnoDB and NDB

Performance :
PERFORMANCE_SCHEMA improvements
Optimizer enhancements

Best Practices for Using MySQL in the Cloud

It is a physical event in Redwood Shores on Tuesday, June 05, 2012 8:00 AM – 5:30 PM
But there is also a live webcast option ! Register
I will definitely follow it.

MySQL for a Massively Parallel Database ?

SergeOctober 15, 2010Leave a comment

Last week in Paris I wen to Microsoft days show. It was a big event with steve Balmer, mainly focused on Cloud strategy, Window Phone 7 launch and other stuff. But beside that marketing part there was some interesting technical sessions. I attended one on the Microsoft Parallel Datawarehouse Solution (result of the madison project based on Datallegro acquisition).
This is a shared nothing architecture. Big tables data is distributed across node, smaller tables are replicated across node to avoid traffic network during join. The query is first issue on a coordinating node that then push pieces of it to the other nodes.
This is the same architecture that Postgres has with the greenplum solution.
For MySQL I do no know of similar architecture.
...continue reading "MySQL for a Massively Parallel Database ?"

Using ioprofile with MySQL is cool !

SergeOctober 12, 2010Leave a comment

Tracing per file IOs is a mandatory task for anyone doing performance issue analysis on a MySQL database. This could help to find candidates for innodb table compression, to decide to move data around on disks to optimize IOs, to decide what needs caching ...
There is no direct command to do it on linux. iostat gives you IOs per device or logical volumes.(I once thought of having loopback device mounted on my innodb datafile but it does not work).

ioprofile is a smart shell script developed by Baron Schwartz with the following principle :
...continue reading "Using ioprofile with MySQL is cool !"

When should I use Oracle 11g and when should I use MySQL ?

SergeSeptember 29, 2010

This is a real question asked by Ivan Zoratti in his blog post.
Another frequently heard question is : Should I still use MySQL now that my business is turning into a very serious business or should I switch to Oracle 11G.

To illustrate this question let me talked about MySQL users I met last week.
Last week I went to a big e-commerce exhibition in Paris ( 3 days 27.500 visitors)

The e-commerce | m-commerce business is a booming market segment. It involves various activities : digital marketing, price comparators, web analytics, social commerce ...
But it also includes very critical areas like logistics, payment, monetization ... This business is now representing more than 100 Billions both in US and EU.

So I questioned many company in various area. I must confess that all of the one I met relied on MySQL except one that was using Microsoft + SAS ( the statistics tool to do market segmentation). None of them had really chosen MySQL vs Oracle. They chose MySQL when there were small start-ups companies and could not afford Oracle.
In fact many of them still use MySQL community (GPL product) for free.

But now they realized that their business is becoming a serious one generating nice turnover with profits and excellent growth perspective.
As volume, traffic, criticity increase many are worried about the ability of MySQL to fit with their growing business.
Will MySQL be technically able to handle the volume, the high transaction level, the high availability ? Will it technically allow to maintain a 100% on-line business.

They have been fully reassured by the commitment of Oracle to invest in MySQL and make it a better database.
The latest announces about MySQL 5.5 improvement in performance and scalability made them confident they made the right choice.
The improvement on replication reliability, the availability of online backup with MySQL, the improvement on partitioning, all this made them feel Oracle is truly making MySQL better.

They have been also totally convinced by the very lively and innovative MySQL community : spider storage engine , flexview materialized views, Qgraph storage engine, column based storage engine, smart patches or improvement developed by key users. They got the feeling that this is the place where things are happening and innovation is taking place openly.

The capacity to replicate from Oracle to MySQL through Goldengate made them confident they could integrate with more classic IT if necessary.

So for me the main question that was answered by Oracle for these customers was : Should I still rely on MySQL now that I am doing big business ?
and the answer is obviously : YES you can rely on MySQL.

Nice BTRFS webinar by Oracle

Last week I followed an very interesting ORACLE webinar delivered by Chris Mason : The State of Btrfs File System for Linux
BTRFS was initiated by Chris Mason who used to be responsible for Reiserfs at Suse and now works for Oracle. The first release started in 2007. BTRFS has been merged into Linux kernel in 2009. Now there are developers from REDHAT, INTEL SUSE, IBM, HP ... storage vendors. The project is very active. Ubuntu is considering to use it soon as its default filesystem. BTRFS is licensed under the GPL license. An interesting to read short summary of the life of BTRFS : A short history of BTRFS
...

SergeAugust 31, 2010Leave a comment

Last week I followed a very interesting ORACLE webinar delivered by Chris Mason : The State of the BTRFS File System for Linux

BTRFS was initiated by Chris Mason who used to be responsible for Reiserfs at Suse and now works for Oracle. The first release started in 2007. BTRFS has been merged into Linux kernel in 2009. Now there are developers from REDHAT, INTEL SUSE, IBM, HP ... storage vendors. The project is very active. Ubuntu is considering to use it soon as its default filesystem. BTRFS is licensed under the GPL license. An interesting to read short summary of the life of BTRFS : A short history of BTRFS
...continue reading "Nice BTRFS webinar by Oracle"

Solutions Linux / Open Source 2010

Last week Solutions Linux / Open Source event was held in Paris.

Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.

Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.

...

SergeMarch 23, 2010Leave a comment

Last week Solutions Linux / Open Source event was held in Paris.

Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.

Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.

As is normal now, some subjects are getting stronger :
The cloud was a hot topic with a good keynote by UBUNTU. Virtualization related to cloud architecture was also presented by hosting providers.
Extreme scaleout with Hadoop (Yahoo implementation of Google Map reduce) get a lot of interest.
NoSQL databases solutions was also a hot subject. I must confess that I was skeptical about broad usage of NoSQL beside the top web actors.
But this times I met my first customers talking seriously about it and doing serious testing with Cassandra and Voldemort.
So I decided to follow a tutorial on using Hadoop. very interesting.

Microsoft was gold sponsor of the event. Microsoft presented themselves as now as a good friendly contributor to the Linux kernel. Beside that they try to optimize all open source solutions running on Windows Server. And of course they want to play a key role in the new cloud economy. In their Azure cloud framework they can handle Linux, PHP, MySQL . They seem to sincerely want to be active and friendly in the open source segment.

There was a full set of tutorials on database technology. Postgresql, NoSQL technologies, Ingres, PostGIS were there.
I presented a tutorial on MySQL. Our MySQL users and customers are always as passionate. They are eager to use our next production releases. Many of them have started using InnoDB pluggin and its advanced capabilities (compression, fast index rebuild ...). Some have started testing MySQL 5.5 and are very impressed by all the performance gains + extra functionalities. MySQL is really on it way to become "the fasted open source commodity database on the market". I of course got some questions on when we will release GA version. MySQL proxy GA was also mentioned and asked for.

Why do we need InnoDB per table IO statistics ?

InnoDB plugin offers an adaptative compression that is very interesting for performance.
This compression works at a table level. Like most optimization techniques it is not black or white.
You have to decide depending on IO patterns / CPU usage / Memory usage / disks constraints what tables are good candidates for compression.
I was trying to define a methodology to help decide which tables to compress to reach performance gains.
I thought I could use MySQL "show status" command or information_schema tables. Unfortunately MySQL does not offer any per tables IO statistics.

SergeMarch 1, 2010Leave a comment

InnoDB plugin offers an adaptative compression that is very interesting for performance.
This compression works at a table level. Like most optimization techniques this is not black or white.
You have to decide depending on IO patterns / CPU usage / Memory usage / disks constraints what tables are good candidates for compression.

I was trying to define a methodology to help decide which tables to compress to reach performance gains.

I thought I could use MySQL "show status" command or information_schema tables. Unfortunately MySQL does not offer any per tables IO statistics.
I thought maybe I could use some linux command. Unfortunately on linux you can only get statistics at a block device level with iostat command.
As a last ressort I thought maybe I could use MySQL Enterprise Monitor / Query Analyzer.
The MySQL Query Analyzer gives you nice aggregated statistics per queries : nb execs / mean time ...
All of these statistics are at a query level. Most queries are joins, and many tables are accessed through multiple different queries. So you get no information at a table level.
So none of the above techniques solve the issue.

Mark Callaghan had packaged a set of patch for MySQL 5.0.
This set of patches includes IO statistics per InnoDB tablespace. In conjunction with the use of the innodb_file_per_table option this gives InnoDB per tables IO statistics.

A good reason to use inodb_file_per_table -- per-table IO statistics

More changes for performance monitoring

The SHOW INNODB FILE STATUS command reports IO statistics per InnoDB tablespace and thus per table When innodb_file_per_table is used.

The InnoDB plugin that offers compression is part of MySQL 5.1. and this patch was for MySQL 5.0 only.

But a very good news came last friday :
Marc Callaghan has just released the Facebook patch for MySQL 5.1 .
This gives us a efficient tool to decide which tables are good candidate for compression.

If anyone has time to write a script that would based on these statistics and on a server workload pinpoint the tables best canditates for compression.
This would be nice. I agree this would be easier if these statistics were part of the information schema. So if someone could do that too this would be definitely a great
contribution.

What will I remember about this week ?

I went to Microsoft techdays : huge 17000 visitors, 300 confs , .... C#, F#, Sqlserver 2008, cloud ...
I took part to the "Android Developer Labs World Tour" in a place called "La cantine" ( a french cafe + some chairs)
A few geeks there, a very nice event . Each participant went back home with a nexus one !
I delivered a MySQL and java pres. I had fun demoing JPA, JSF, RestFull web services on Glassfish.
But what I will definitely remember about this week is this news :
"Ken Jacobs leaving Oracle"
You will be missed by Oracle
You will be missed by MySQL community
You have been a key people of Oracle helping design the product that made all the rest possible for Oracle. You have always played fair with the MySQL community. In my previous life at Oracle I remember 7.0, 7.1 , 8.0, 8i, 9i, 10g, 11g .... You sometimes came to Paris and gave us not just the lengthy list of new functionalities but the reason why it was build that way, the internals and the interest it represents. When you took InnoDB I was surprised by the dedication you put in this wonderful piece of software. But you were right. Technology is just about details and elegance. You brought a lot of insight to the MySQL community. And do not forget that the MySQL community is not a company. You cannot resign. You can take your distance, do whatever please you, come back. You will always be welcomed.

SergeFebruary 12, 2010

But what I will definitely remember about this week is this news :

"Ken Jacobs leaving Oracle"

You will be missed by Oracle.
You will be missed by MySQL community.

You have been a key people of Oracle helping design the product that made all the rest possible for Oracle.
You have always played fair with the MySQL community.

In my previous life at Oracle I remember 7.0, 7.1 , 8.0, 8i, 9i, 10g, 11g ....
You sometimes came to Paris and gave us not just the lengthy list of new functionalities but the reason why it was build that way, the internals and the interest it represents.
When you took InnoDB I was surprised by the dedication you put in this wonderful piece of software.
But you were right. Technology is just about details and elegance.
You brought a lot of insight to the MySQL community.

And do not forget that the MySQL community is not a company. You cannot resign.
You can take your distance, do whatever please you, come back.

You will always be welcomed.

Using UNIX_TIMESTAMP as a partitioning function for MySQL 5.1.43

It is now possible to use the UNIX_TIMESTAMP() function in partitioning expressions using TIMESTAMP columns.
MySQL 5.1.43 can be downloaded from
http://dev.mysql.com/downloads/
For example, it now possible to create a partitioned table such as this one :

SergeFebruary 1, 2010Leave a comment

It is now possible to use the UNIX_TIMESTAMP() function in partitioning expressions using TIMESTAMP columns.

MySQL 5.1.43 can be downloaded from

http://dev.mysql.com/downloads/

For example, it now possible to create a partitioned table such as this one :

CREATE TABLE t (id INTEGER, ts TIMESTAMP, col1 VARCHAR(64), ... )
PARTITION BY RANGE ( UNIX_TIMESTAMP(ts) ) (
PARTITION p0 VALUES LESS THAN (631148400),
PARTITION p1 VALUES LESS THAN (946681200),
PARTITION p2 VALUES LESS THAN (MAXVALUE)
);

All other expressions involving TIMESTAMP values are now rejected with an error when attempting to create a new partitioned table or to alter an existing partitioned table.

To go from the unix timestamp to a more readable value it is possible to use the FROM_UNIXTIME(ts) function :

mysql> SELECT FROM_UNIXTIME(631148400);
+--------------------------------------+
| FROM_UNIXTIME(631148400) |
+--------------------------------------+
| 1990-01-01 00:00:00
+--------------------------------------+

With MySQL 5.1.43 it is important to note that partitioning can only be done on an integer value column or through the following restricted set of functions that return integer :

ABS() MOD() CEILING() FLOOR() EXTRACT() DATEDIFF()
DAY() DAYOFMONTH() DAYOFWEEK() DAYOFYEAR() TO_DAYS() WEEKDAY()
HOUR() MINUTE() SECOND() TIME_TO_SEC() MICROSECOND()
YEAR() YEARWEEK() QUARTER() MONTH()

To help Giuseppe Maxia has proposed some tricks that make partitioning by date in 5.1 a bit less painful :
A quick usability hack with partitioning

The partition helper - Improving usability with MySQL 5.1 partitioning

And to partition on a non integer column :
Partitioning with non integer values using triggers

And do not forget that in 5.1 :
All the columns used in the table's partitioning expression mus t be used in every unique key on the table.
This also includes the table's primary key, since it is by definition a unique key.
So in the above example with the UNIX_TIMESTAMP function the timestamp column must be part of the primary key and also part of all the unique keys existing on the table.
An important restriction also to remember is that the number of partitions is limited to 1024 (including sub partitions).

MySQL 5.5. partitioning offers more functionalities :
With RANGE and LIST partitioning it will be possible to partition by non-integer columns like dates or strings (using the new COLUMNS keyword).
MySQL 5.5 Partitioning with the he COLUMNS keyword also allow to use multiple columns to define a partition.
The new TRUNCATE PARTITION allow to empty a partition in a more efficient way than delete or drop partition + recreation of the partition.
The new function TO_SECONDS is available to partition on time intervals smaller than one day. It is taken into account by partition pruning.

To explain 5.5 partitioning new features there is the interesting article by Giuseppe Maxia :
A deep look at MySQL 5.5 partitioning enhancements