MySQL 8.0 new features in real life applications: roles and recursive CTEs

I am happy that the MySQL team has, in the last few years, been blogging about each major feature that MySQL Server is getting; for example, the series on Recursive Common Table Expressions. Being extremely busy myself, I appreciate them taking the time to share details, with the advantage of an "insider" point of view.

However, first-party guides and examples can seem not terribly useful at first, as they are normally built around synthetic, artificial examples rather than real-life ones. I want to share with you two examples of how two of the upcoming features of MySQL 8.0 will be useful to us, with examples that we already use or plan to use for the Wikimedia Foundation databases: roles and recursive CTEs.

Roles

Giuseppe "data charmer" Maxia recently presented at FOSDEM 2018 how the newly introduced roles are going to work (you can watch and read his presentation at the FOSDEM website), and he seemed quite skeptical about the operational side of it. I have to agree- some of the details are not as straightforward as how users and roles work in other environments, but I have to say that we have been using database roles with MariaDB for the last year without any problems.

User roles were introduced in MariaDB 10.0, although for us they were unusable until MariaDB 10.1, and their implementation seems to be similar, if not exactly the same, as that of MySQL 8.0 (side note: I do not know why that is. Is it an SQL standard? Was the implementation inspired by Oracle or another famous database? Do they have a common origin? Please tell me in the comments if you know the answer).

If you use MySQL for a single large web-like application, roles may not be very useful to you. For example, for the Mediawiki installation that supports Wikipedia, only a few accounts are set up per database server- one or a few for the application, plus those needed for monitoring, backups and administration (please do not use a single root account for everything in yours!).

labsdb replication diagram
LabsDB databases provided a vital service by giving the community direct access to database replicas of production data.

However, we also provide free hosting and other IT services to all developers (including volunteers) so they can create websites, applications and gadgets that could be useful to the whole Wikimedia community. Among those services, we have a data service where we provide a replica of a sanitized database with most of the data of the production mediawiki installation, plus additional database accounts where they can store their own application data. This requires a highly multi-tiered MySQL/MariaDB installation, with thousands of accounts per server.

In the past, each account had to be managed separately, with its own grants and account limits. This was a nightmare, and the lack of accountability could easily lead to security issues. Wildcards were used to assign grants, which was a really bad idea- wildcards provide grants not only on current databases, but also on any future database that matches the pattern, and that is very dangerous: wrong data sets could accidentally end up on the wrong servers and all users would automatically get access. Also, every time a certain grant had to be added to or revoked from all users (not frequent, but it could happen), a script had to be run to do that in a loop for each account. Finally, there are other accounts aside from user accounts (the administration and monitoring ones), but aside from a specific string pattern, there was no way to differentiate user accounts from administration or monitoring ones.
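To illustrate the wildcard problem, this is the kind of pattern-based grant we used to rely on (the account and database pattern here are made up):

-- this matches every existing database whose name starts with 's12345__',
-- but also any database created in the future that matches the pattern
GRANT ALL PRIVILEGES ON `s12345\_\_%`.* TO 's12345'@'%';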

Our cloud databases were the first we upgraded to MariaDB 10.1, exclusively to get a transparent role implementation (SET DEFAULT ROLE). We created a template role (e.g. ‘labsdbuser‘), to which all normal users are granted (you can see the actual code on our puppet repo):

GRANT USAGE ON *.* TO <user> ...;
GRANT labsdbuser TO <user>;
SET DEFAULT ROLE labsdbuser FOR <user>;

And because it is so compact, we can give a detailed, non-wildcard-based selection of grants to the labsdbuser generic role.

If we had to add a new grant (or revoke it) to all users, we just have to run:

GRANT/REVOKE <privilege> TO/FROM labsdbuser;

only once per server, and all affected users will get it automatically (well, they have to log off and on again, but that was true of any grant).

Could the syntax be simpler, and can the underlying tables be confusing? Sure, but in practice (at least for us) we rarely have to handle role changes- they do not tend to be very dynamic. However, roles greatly simplified the creation, accountability, monitoring and administration of user accounts for us.
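For reference, this is a minimal sketch of how the same workflow looks with MySQL 8.0 syntax (the role, user and database names here are illustrative, and note that MySQL 8.0 uses TO where MariaDB uses FOR in SET DEFAULT ROLE):

-- create the template role once per server
CREATE ROLE labsdbuser;
-- grant a detailed, non-wildcard set of privileges to the role
GRANT SELECT ON `enwiki_p`.* TO labsdbuser;
-- every new account only needs the role
CREATE USER 's12345'@'%' IDENTIFIED BY 'a_long_random_password';
GRANT labsdbuser TO 's12345'@'%';
SET DEFAULT ROLE labsdbuser TO 's12345'@'%';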

Recursive CTEs

CTEs are another large piece of syntactic sugar added in 8.0 (they were also added in MariaDB 10.2). Unlike roles, however, this is something that application developers will benefit from- roles are mostly helpful for Database Administrators.

As I mentioned before, we have upgraded our servers at most to MariaDB 10.1 (stability is very important for us), so I have not yet been able to play with them in production. Also, because Mediawiki has to support older database installations, it might take years to see them used on web requests for the wikis. I see them, however, being interesting first for analytics and for the cloud community applications I mentioned previously.

Mediawiki has some hierarchical data stored in its tables; probably the most infamous one is the category tree. As of 2018 (this may change in the future), the main way to group wiki pages and files is by using categories. However, unlike the concept of tags, categories may contain other, more specialized categories.

Diagram of category relationships
Mediawiki/Wikipedia category system can become quite complex.
Image License: CC-BY-SA-3.0 Author: Gurch at English Wikipedia

For example, the [[MySQL]] article belongs to a number of categories, which you can see listed at the bottom of the article (you can obtain a real-time list with a Mediawiki API call, too).

However, if you go to the category Database management systems, you will not find it there, because you have to browse first through Database management systems > Relational database management systems > MySQL; or maybe Database management systems > Database-related software for Linux > RDBMS software for Linux > MySQL.

The relationship between pages and categories is handled in the classical "(parent, child) table" fashion by the categorylinks table, which has the following structure:

CREATE TABLE /*_*/categorylinks (
  -- Key to page_id of the page defined as a category member.
  cl_from int unsigned NOT NULL default 0,

  -- Name of the category.
  -- This is also the page_title of the category's description page;
  -- all such pages are in namespace 14 (NS_CATEGORY).
  cl_to varchar(255) binary NOT NULL default '',
...
  PRIMARY KEY (cl_from,cl_to)
) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/cl_sortkey ON /*_*/categorylinks (cl_to,cl_type,cl_sortkey,cl_from);

The table has been simplified; we will not get into collation issues in this post- that is material for a separate discussion. Also ignore the fact that cl_to links to a binary string rather than an id; there are reasons for all of that, but I cannot go into details now.

If you want to play with the real thing, you can download it from here; as of February 2018, it is only 2GB compressed: https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-categorylinks.sql.gz

This table structure works nicely to answer questions like:

  • “Which articles and subcategories are directly in a given category (Database_management_systems)?”
    SELECT IF(page_namespace = 14, 'category', 'article') as type, page_title 
    FROM categorylinks 
    JOIN page 
    ON page_id = cl_from 
    WHERE cl_to = 'Database_management_systems'
    
    

    You can see the live results of this query.

  • “Which categories does a particular article (MySQL) belong to?”
    SELECT cl_to
    FROM categorylinks
    JOIN page
    ON page_id = cl_from
    WHERE page_namespace = 0 AND page_title = 'MySQL';
    
    

    You can see the live results of this query.

However, this structure is not well suited to answering the question "Give me all articles in a particular category or its subcategories". Or at least it didn't use to be, on older versions of MySQL and MariaDB. Alternative table designs have been proposed in the past (both for Mediawiki and for the general problem) to handle this commonly found structure, but not all of them are applicable, as the wiki category system is very flexible, and not even strictly a tree- it can contain, for example, loops.
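As an example of those alternative designs, a classic one is the "closure table", which materializes every ancestor-descendant pair so that subtree reads need no recursion. This is only a sketch of the general idea (not Mediawiki's actual schema), and keeping it up to date on writes- and handling loops- is exactly the hard part:

-- every (ancestor, descendant) pair is stored explicitly, trading
-- storage and write complexity for trivial subtree queries
CREATE TABLE category_closure (
  cc_ancestor   varbinary(255) NOT NULL,
  cc_descendant varbinary(255) NOT NULL,
  PRIMARY KEY (cc_ancestor, cc_descendant),
  KEY (cc_descendant)
);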

Up to now, custom solutions, such as this answer on Stack Overflow, or specially-built applications, such as Petscan, had to be developed to provide lists of articles queried recursively.

Again, enter recursive CTEs. Instead of having to use external code to apply the recursion, or an awkward stored procedure, we can just send the following SQL:

WITH RECURSIVE cte (cl_from, cl_type) AS
(
    SELECT cl_from, cl_type FROM categorylinks WHERE cl_to = 'Database_management_systems' -- starting category
    UNION
    SELECT categorylinks.cl_from, categorylinks.cl_type FROM cte JOIN page ON cl_from = page_id JOIN categorylinks ON page_title = cl_to WHERE cte.cl_type = 'subcat' -- subcat addition on each iteration
)
SELECT page_title FROM cte JOIN page ON cl_from = page_id WHERE page_namespace = 0 ORDER BY page_title; -- printing only articles in the end, ordered by title

With results:

+-------------------------------------------------------------+
| page_title                                                  |
+-------------------------------------------------------------+
| 4th_Dimension_(software)                                    |
| A+_(programming_language)                                   |
| ABC_(programming_language)                                  |
| ACID                                                        |
| ADABAS                                                      |
| ADO.NET                                                     |
| ADOdb                                                       |
| ADOdb_Lite                                                  |
| ADSO                                                        |
| ANSI-SPARC_Architecture                                     |
| Adabas_D                                                    |
| Adaptive_Server_Enterprise                                  |
...
| Yellowfin_Business_Intelligence                             |
| Zope_Object_Database                                        |
+-------------------------------------------------------------+
511 rows in set (0.02 sec)

It looks more complicated than it should, because we need to join with the page table in order to get page ids from titles; but otherwise the idea is simple: get a list of pages; if they are subcategories, query them recursively; otherwise, add them to the list. By using UNION instead of UNION ALL we make sure we do not explore the same subcategory twice. Note how we end up with a list much larger than the 143 items that are directly in the category.

If we fall into an infinite loop, or a very deep chain of relationships, the variable limiting the number of iterations, cte_max_recursion_depth, can be tuned. By default, the query fails once it exceeds 1000 iterations.
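For example, to allow deeper category trees within the current session only (the value here is arbitrary; MariaDB's equivalent variable is max_recursive_iterations):

-- raise the limit before running an unusually deep recursive CTE
SET SESSION cte_max_recursion_depth = 10000;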

I don’t intend to provide an in-depth guide about CTEs; check the 8.0 manual for more information, as well as the above-mentioned MySQL Team blog posts.

So tell me, do you think these or other recently-added features will be as useful for you as they seem to be for us? Which is your favourite one recently added to MySQL or MariaDB? Send me a comment or a message to @jynus.

Finding out the MySQL performance regression due to kernel mitigation for Meltdown CPU vulnerability

Update: I included the results for when PCID is disabled, for comparison, as a worst-case scenario.

After learning about Meltdown and Spectre, I waited patiently to get a fix from my OS vendor. However, there were several reports of performance impact due to the kernel mitigation- for example, on the PostgreSQL developers mailing list there were reports of up to 23% throughput loss; Red Hat engineers reported a regression range of 1-20%, with OLTP systems being the worst type of workload. As it will be highly dependent on the hardware and workload, I decided to do some tests myself for the use cases I need.

My setup

It is similar to that of my previous tests:

Hardware (desktop grade, no Xeon or proper RAID):

  • Intel(R) Core(TM) i7-4790K CPU @ 4.0GHz (x86_64 Quad-core with hyperthreading) with PCID support (disabling pcid with “nopcid” kernel command line will also be tested)
  • 32 GB of RAM
  • Single, desktop-grade, Samsung SSD 850 PRO 512GB

OS and configuration:

  • Debian GNU/Linux 9.3 “Stretch”, comparing kernels:
    • 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (no mitigation)
    • 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (latest kernel with security updates backported, including pti enabled according to security-announces)
  • datadir formatted as xfs, mounted with noatime option, all on top of LVM
  • MariaDB Server 10.1.30 compiled from source, queried locally through unix socket

The tests performed:

  • The single-thread write with LOAD DATA
  • A read-only sysbench with 8 and 64 threads

The results

LOAD DATA (single thread)

We have been measuring LOAD DATA performance of a single OpenStreetMap table (CSV file) in several previous tests, as we detected a regression in some MySQL versions with single-thread write load. I believe it could be an interesting place to start. I tested both the default configuration and another one more similar to the WMF production configuration:

                                                Load time    rows/s
Unpatched Kernel, default configuration         229.4±1s     203754
Patched Kernel, default configuration           227.8±2.5s   205185
Patched Kernel, nopcid, default configuration   227.9±1.6s   205099
Unpatched Kernel, WMF configuration             163.5±1s     285878
Patched Kernel, WMF configuration               163.3±1s     286229
Patched Kernel, nopcid, WMF configuration       165.1±1.3s   283108

No meaningful regressions are observed in this case between the default patched and unpatched kernels- the variability is within the measurement error. The nopcid runs could be showing some overhead, but that overhead (around 1%) is barely above the measurement error. The nopcid option is interesting not because of the hardware support, but because of the kernel support- backporting it may not be an option for older distro versions, as Moritz says in the comments.

It is interesting to notice, although off-topic, that while the results with the WMF "optimized" configuration have become better compared to previous years' results (due, most likely, to improved CPU and memory resources), the defaults have become worse- a reminder that defaults are not a good baseline for comparison.

This is not a surprising result: a single thread is not a real OLTP workload, and more time will be spent on I/O waits than on the extra syscalls.

RO-OLTP

Let’s try a different workload: let’s use a proper benchmarking tool, create a table and perform point selects on it, with 2 different levels of concurrency- 8 threads and 64 threads:

sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=test --mysql-user=test prepare
sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=test --mysql-user=test --max-time=120 --oltp-read-only=on --max-requests=0 --num-threads={8, 64} run
                                     TPS        SELECTs/s     95 percentile of latency (ms)
Unpatched Kernel, 8 threads          7333±30    100953±1000   1.15±0.05
Patched Kernel, 8 threads            6867±150   96140±2000    1.20±0.01
Patched Kernel, nopcid, 8 threads    6637±20    92915±200     1.27±0.05
Unpatched kernel, 64 threads         7298±50    102176±1000   43.21±0.15
Patched Kernel, 64 threads           6768±40    94747±1000    43.66±0.15
Patched Kernel, nopcid, 64 threads   6648±10    93073±100     43.96±0.10

In this case we can observe around a 4-7% regression in throughput if PCID is enabled. If PCID is disabled, the regression increases to 9-10%- bad, but not as bad as the "up to 20%" some warned about. If you are in my situation, an upgrade to Stretch would be worth it just to get the PCID support.

Further testing would be required to check at which levels of concurrency, and with which kinds of workloads, the extra work per context switch matters more or less. It will be interesting to measure it with production traffic, too, as some of the above could be nullified once network latencies are added to the mix. Further patches can also change the way the mitigation works, and PCID support probably helps transparently on all modern hardware.

Have you detected a larger regression? Are you going to patch all your databases right away? Tell me at @jynus.

How to install MySQL Server on Debian Stretch

For the impatient:

# echo -e "deb http://repo.mysql.com/apt/debian/ stretch mysql-5.7\ndeb-src http://repo.mysql.com/apt/debian/ stretch mysql-5.7" > /etc/apt/sources.list.d/mysql.list
# wget -O /tmp/RPM-GPG-KEY-mysql https://repo.mysql.com/RPM-GPG-KEY-mysql
# apt-key add /tmp/RPM-GPG-KEY-mysql
# apt update
# apt install mysql-server

Debian Yin Yang

In the latest stable version of Debian, if you ask to install mysql-server, you now get MariaDB installed automatically, with no (evident) way of installing Oracle's MySQL. Any major version upgrade has to be done carefully (not only for MariaDB, but also for MySQL and Postgres), and I bet that a silent MySQL 5.5 to MariaDB 10.1 jump will cause huge confusion. Not only will it fail user expectations; I think this will cause large issues now that MariaDB has chosen to become a "hard" fork and grow incompatible with MySQL in many ways. And not only will the server upgrade cause users to struggle; the connector is probably going to cause pain too (as has already been noticed in some of my infrastructure).

In order to be helpful to those MySQL users that may be looking for help, here is how to install and set up MySQL Server on Debian 9 "Stretch", using MySQL's upstream repo. The commands shown above may not be the best course of action in all cases (be careful when downloading things from the Internet), and, as a disclaimer, you should read the official documentation on how to use Oracle's MySQL apt repository in advance. The following is a more verbose, step-by-step explanation, plus some extended explanations of why I chose to recommend this method in particular (and its disadvantages).

  1. Install Oracle’s APT Repositories

    The MySQL Release Engineering team provides ready-to-use apt repositories with most of its software, including the server and many of its utilities. To enable the server repository, edit your apt sources file (or better, create a new one) by executing:

    # nano /etc/apt/sources.list.d/mysql.list

    And add the following lines:

    deb http://repo.mysql.com/apt/debian/ stretch mysql-5.7
    deb-src http://repo.mysql.com/apt/debian/ stretch mysql-5.7
    

    Here, I have chosen to install MySQL 5.7, but 5.6 and 8.0 (not yet stable as of June 2017) are also available for Debian stable. You can also choose to enable mysql-tools (normally the connectors, mysql-router and the mysql-utilities are added there, but right now it is empty) and mysql-apt-config, the repo where the automatic apt configurator package lives. While the mysql-apt-config utility is nice to have, I like to have manual control over my repositories, even if that means having to do this again in the future.

    Before starting to use the repo, you have to add the repo's public key as trusted; for that, I suggest:

    # wget -O /tmp/RPM-GPG-KEY-mysql https://repo.mysql.com/RPM-GPG-KEY-mysql
    # apt-key add /tmp/RPM-GPG-KEY-mysql
    

    The RPM and APT repos seem to share the GPG key, which is at https://repo.mysql.com/RPM-GPG-KEY-mysql (it is the same one that the mysql-apt-config utility adds automatically on install). Not doing this will not make the rest of the install fail, but apt will warn you due to the lack of a signed key.

    If you run apt-key list you will see a new trusted key for apt:

    # apt-key list 
    /etc/apt/trusted.gpg
    --------------------
    pub   dsa1024 2003-02-03 [SCA] [expires: 2019-02-17]
          A4A9 4068 76FC BD3C 4567  70C8 8C71 8D3B 5072 E1F5
    uid           [ unknown] MySQL Release Engineering 
    [...]
    

    The advantage of using a repository rather than a one-time install is that you will get automatic updates for minor versions, which may correct bugs and security issues (with minimal feature changes).

  2. Refresh your available package list

    Just run:

    # apt update
    Ign:1 http://softlibre.unizar.es/debian stretch InRelease
    Hit:2 http://softlibre.unizar.es/debian stretch-updates InRelease
    Hit:3 http://security.debian.org/debian-security stretch/updates InRelease
    Hit:4 http://softlibre.unizar.es/debian stretch Release                  
    Get:5 http://repo.mysql.com/apt/debian stretch InRelease [14.2 kB]       
    Get:7 http://repo.mysql.com/apt/debian stretch/mysql-5.7 Sources [870 B]
    Get:8 http://repo.mysql.com/apt/debian stretch/mysql-5.7 amd64 Packages [5,643 B]
    Fetched 20.7 kB in 6s (3,320 B/s)                                           
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    All packages are up to date.
    

    You should see a reference to repo.mysql.com with no errors. You probably have other main repos or pending updates; that is ok.

  3. Install MySQL Server:

    # apt install mysql-server
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following additional packages will be installed:
      libaio1 libatomic1 libmecab2 libnuma1 mysql-client mysql-community-client
      mysql-community-server
    The following NEW packages will be installed:
      libaio1 libatomic1 libmecab2 libnuma1 mysql-client mysql-community-client
      mysql-community-server mysql-server
    0 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.
    Need to get 36.7 MB of archives.
    After this operation, 253 MB of additional disk space will be used.
    Do you want to continue? [Y/n] Y
    Get:1 http://repo.mysql.com/apt/debian stretch/mysql-5.7 amd64 mysql-community-client amd64 5.7.18-1debian9 [8,859 kB]
    Get:2 http://softlibre.unizar.es/debian stretch/main amd64 libaio1 amd64 0.3.110-3 [9,412 B]
    Get:3 http://softlibre.unizar.es/debian stretch/main amd64 libatomic1 amd64 6.3.0-18 [8,920 B]
    Get:4 http://softlibre.unizar.es/debian stretch/main amd64 libnuma1 amd64 2.0.11-2.1 [33.3 kB]
    Get:5 http://softlibre.unizar.es/debian stretch/main amd64 libmecab2 amd64 0.996-3.1 [256 kB]
    Get:6 http://repo.mysql.com/apt/debian stretch/mysql-5.7 amd64 mysql-client amd64 5.7.18-1debian9 [12.6 kB]
    Get:7 http://repo.mysql.com/apt/debian stretch/mysql-5.7 amd64 mysql-community-server amd64 5.7.18-1debian9 [27.5 MB]
    Get:8 http://repo.mysql.com/apt/debian stretch/mysql-5.7 amd64 mysql-server amd64 5.7.18-1debian9 [12.6 kB]
    Fetched 36.7 MB in 4s (8,495 kB/s)                 
    Preconfiguring packages ...
    Selecting previously unselected package libaio1:amd64.
    (Reading database ... 27516 files and directories currently installed.)
    Preparing to unpack .../0-libaio1_0.3.110-3_amd64.deb ...
    Unpacking libaio1:amd64 (0.3.110-3) ...
    Selecting previously unselected package libatomic1:amd64.
    Preparing to unpack .../1-libatomic1_6.3.0-18_amd64.deb ...
    Unpacking libatomic1:amd64 (6.3.0-18) ...
    Selecting previously unselected package libnuma1:amd64.
    Preparing to unpack .../2-libnuma1_2.0.11-2.1_amd64.deb ...
    Unpacking libnuma1:amd64 (2.0.11-2.1) ...
    Selecting previously unselected package mysql-community-client.
    Preparing to unpack .../3-mysql-community-client_5.7.18-1debian9_amd64.deb ...
    Unpacking mysql-community-client (5.7.18-1debian9) ...
    Selecting previously unselected package mysql-client.
    Preparing to unpack .../4-mysql-client_5.7.18-1debian9_amd64.deb ...
    Unpacking mysql-client (5.7.18-1debian9) ...
    Selecting previously unselected package libmecab2:amd64.
    Preparing to unpack .../5-libmecab2_0.996-3.1_amd64.deb ...
    Unpacking libmecab2:amd64 (0.996-3.1) ...
    Selecting previously unselected package mysql-community-server.
    Preparing to unpack .../6-mysql-community-server_5.7.18-1debian9_amd64.deb ...
    Unpacking mysql-community-server (5.7.18-1debian9) ...
    Selecting previously unselected package mysql-server.
    Preparing to unpack .../7-mysql-server_5.7.18-1debian9_amd64.deb ...
    Unpacking mysql-server (5.7.18-1debian9) ...
    Setting up libatomic1:amd64 (6.3.0-18) ...
    Setting up libnuma1:amd64 (2.0.11-2.1) ...
    Setting up libmecab2:amd64 (0.996-3.1) ...
    Processing triggers for libc-bin (2.24-11+deb9u1) ...
    Setting up libaio1:amd64 (0.3.110-3) ...
    Processing triggers for systemd (232-25) ...
    Processing triggers for man-db (2.7.6.1-2) ...
    Setting up mysql-community-client (5.7.18-1debian9) ...
    Setting up mysql-client (5.7.18-1debian9) ...
    Setting up mysql-community-server (5.7.18-1debian9) ...
    Setting up mysql-server (5.7.18-1debian9) ...
    Processing triggers for libc-bin (2.24-11+deb9u1) ...
    

    If everything has gone according to plan, after asking for permission to install it and its dependencies, you should have installed NOT mariadb-server, but the real MySQL server instead. If at some stage you got an error, check your spelling or your permissions- all of the above steps should have been run with superuser (root) permissions, not as a normal user. That means using sudo before each command, su -c CMD, or logging in as root in advance.

    If you do not have network access on the server where you are trying to install MySQL, you may want to download the .deb bundle for Debian 9 from the download page.

  4. Start, stop and manage your server

    These MySQL packages initialize and start MySQL server by default:

    # systemctl status mysql
    ● mysql.service - MySQL Community Server
       Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset:
       Active: active (running) since Sun 2017-06-25 12:16:50 PDT; 31min ago
      Process: 4399 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/my
      Process: 4362 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=e
     Main PID: 4401 (mysqld)
        Tasks: 27 (limit: 4915)
       CGroup: /system.slice/mysql.service
               └─4401 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mys
    
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.342805Z 0 [Note] IPv
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.345596Z 0 [Note]   -
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.348320Z 0 [Note] Ser
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.421804Z 0 [Note] Eve
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.426584Z 0 [Note] /us
    Jun 25 12:16:50 debian mysqld[4399]: Version: '5.7.18'  socket: /var/run/mysq
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.430667Z 0 [Note] Exe
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.430739Z 0 [Note] Beg
    Jun 25 12:16:50 debian mysqld[4399]: 2017-06-25T19:16:50.531384Z 0 [Note] End
    Jun 25 12:16:50 debian systemd[1]: Started MySQL Community Server
    

    You can manage the service with the service wrapper or with the native systemctl command. Remember that earlier OS versions may have had a package that still used backwards init.d compatibility- that is no longer the case. However, execution of /etc/init.d/mysql will be captured and converted transparently into an equivalent systemctl command. Get accustomed to saying goodbye to the horrible mysqld_safe script and its options (but also to its vulnerabilities).

    # systemctl restart mysql
    # systemctl reload mysql
    Failed to reload mysql.service: Job type reload is not applicable for unit mysql.service.
    See system logs and 'systemctl status mysql.service' for details.
    # systemctl status mysql
    ● mysql.service - MySQL Community Server
       Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset:
       Active: active (running) since Sun 2017-06-25 12:55:43 PDT; 18s ago
      Process: 4522 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/my
      Process: 4485 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=e
     Main PID: 4524 (mysqld)
        Tasks: 27 (limit: 4915)
       CGroup: /system.slice/mysql.service
               └─4524 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mys
    
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.670560Z 0 [Note] IPv
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.673504Z 0 [Note]   -
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.674863Z 0 [Note] Ser
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.750799Z 0 [Note] Eve
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755752Z 0 [Note] /us
    Jun 25 12:55:43 debian mysqld[4522]: Version: '5.7.18'  socket: /var/run/mysq
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755892Z 0 [Note] Exe
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755929Z 0 [Note] Beg
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.863316Z 0 [Note] End
    Jun 25 12:55:43 debian systemd[1]: Started MySQL Community Server.
    
    # journalctl -u mysql
    Jun 25 12:55:41 debian systemd[1]: Stopped MySQL Community Server.
    Jun 25 12:55:41 debian systemd[1]: Starting MySQL Community Server...
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.028273Z 0 [Warning] 
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.038562Z 0 [Note] /us
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.076552Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.080224Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.080428Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.080508Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.080651Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.080739Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.094601Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.096806Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.116165Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.201413Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.215069Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.241133Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.325375Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.329688Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.484928Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.496504Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.499790Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.506852Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.563491Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.570659Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.573444Z 0 [Note] Plu
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.603637Z 0 [Note] Inn
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.637102Z 0 [Note] Fou
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.646241Z 0 [Warning] 
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.666576Z 0 [Note] Ser
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.670560Z 0 [Note] IPv
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.673504Z 0 [Note]   -
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.674863Z 0 [Note] Ser
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.750799Z 0 [Note] Eve
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755752Z 0 [Note] /us
    Jun 25 12:55:43 debian mysqld[4522]: Version: '5.7.18'  socket: /var/run/mysq
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755892Z 0 [Note] Exe
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.755929Z 0 [Note] Beg
    Jun 25 12:55:43 debian mysqld[4522]: 2017-06-25T19:55:43.863316Z 0 [Note] End
    Jun 25 12:55:43 debian systemd[1]: Started MySQL Community Server.
    
  5. Use your server

    This package sets up socket authentication automatically for the user root, so there is no need to set up a password- connecting as the root unix user to the mysql root account is granted directly:

    # mysql -u root
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 3
    Server version: 5.7.18 MySQL Community Server (GPL)
    
    Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> SELECT current_user();
    +----------------+
    | current_user() |
    +----------------+
    | root@localhost |
    +----------------+
    1 row in set (0.00 sec)
    

    The default configuration binds the server to 127.0.0.1; you may want to set up your users and bind it to a public address before putting it into production. The config can be found at /etc/mysql/mysql.conf.d/mysqld.cnf.
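    For example, this is a minimal sketch of creating a remote application account (the user, host range and schema name here are all made up; pick your own):

    mysql> CREATE USER 'app'@'192.168.1.%' IDENTIFIED BY 'use_a_strong_password';
    mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'app'@'192.168.1.%';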

Please use @jynus to tell me if this was useful to you, or to tell me where I am wrong.

Testing (again) LOAD DATA on MySQL 5.6, 5.7, 8.0 (non-GA) and MariaDB 10.0, 10.1 and 10.2 (non-GA)

I’ve lately been compiling and generating .deb packages for several recent MySQL and MariaDB versions, and I wanted to try them more in depth- especially MySQL 8.0 and MariaDB 10.2, both of which are still in development.

Several people have already given their first impressions (or will do soon), and testing early is the best way to catch bugs and regressions, and get them fixed before the official release. In fact, as I will comment later, I ran into breaking bugs on both MySQL 8.0 and MariaDB 10.2, which I immediately reported or joined others on the bug report.

Last time I did a similar test, I found an important 30% regression in single-thread performance between 5.6 and 5.7, and I wasn’t the only one reporting single-thread performance concerns. This is why I included 3 generations of major database versions: the latest minor release- as of December 2016- of the old but stable 5.6/10.0; the latest GA versions 5.7/10.1; and the next, in-development versions 8.0/10.2 (exact versions shown below).

Very recently, there was a call for better MySQL benchmarks, and I cannot agree more with the general idea. However, I have to make some disclaimers: the following are "test results", closer to regression testing than to benchmarking. Benchmarking is hard, and I do not have the time or the resources to do it properly and extensively. What I wanted to do is a very specific test of a very specific operation (LOAD DATA in a single thread) under very specific circumstances/configuration [in some cases, bad configuration practices] to see if a previously-occurring problem was still there. This is not a mere thought experiment: it will help me tune the import/backup recovery process better, and it allowed me to get familiar with the newest versions' particular idiosyncrasies. Clearly, as you will discover, I do not yet know how to tune the latest unreleased versions (who does at this point?), so join me on the discovery process.

Secondly, the test I did (LOAD DATA) did not include secondary indexes or JOINs. Again, this tests import times, but not other more common operations like point selects, updates and range selects. Thirdly, I am comparing both Generally Available (stable) versions and versions still in development. The latter can change a lot between now and the release date. I repeat again: this is in no way representative of the overall performance of MySQL or MariaDB versions. If you walk away with the idea that "X is better than Y" based on a chart shown here, you will probably be very wrong. You have been warned.

I hope, however, that some of these results will be helpful to fellow DBAs and MySQL/MariaDB users and developers, and that is why I am sharing them early. You may also help me explain some of the results I am getting, which I may not understand 100% yet.

The setup

Server versions:

  • Oracle MySQL 5.6.34
  • Oracle MySQL 5.7.16
  • Oracle MySQL 8.0.0-dmr (not “generally available”)
  • MariaDB Server 10.0.28
  • MariaDB Server 10.1.19
  • MariaDB Server 10.2.2 (non “generally available”)

All versions were compiled from the source code downloaded from their official websites with the recommended options (-DBUILD_CONFIG=mysql_release).

Hardware (desktop grade- no Xeon or a proper RAID setup):

  • Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz (x86_64 Quad-core with hyperthreading)
  • 16 GB of RAM
  • Single, desktop-grade, Samsung SSD 850 PRO 512GB

OS and configuration:

  • Debian GNU/Linux 8.6 “Jessie”
  • datadir formatted as xfs, mounted with noatime option, all on top of LVM

Several server configurations were used, commented on in each individual test.

Dataset:

  • I used again the nodes.csv I used last time, a 3,700,635,579-byte text (tab-separated) file containing 46,741,126 lines (rows) with an old OSM node data dump for Spain:
171773  38.6048402      -0.0489871      4       2012-08-25 00:37:46     12850816        472193  rubensd
171774  38.6061981      -0.0496867      2       2008-01-19 10:23:21     666916  9250    j3m
171775  38.6067166      -0.0498342      2       2008-01-19 10:23:21     666916  9250    j3m
171776  38.6028122      -0.0497136      5       2012-08-25 00:26:40     12850816        472193  rubensd
171933  40.4200658      -3.7016652      6       2011-11-29 05:37:42     9984625 33103   sergionaranja

Those rows will be loaded using the load_data_04.py script I used last time, which basically creates a table like this:

CREATE TABLE `nodes` (
  `id` bigint PRIMARY KEY,
  `lat` decimal(9,7),
  `lon` decimal(10,7),
  `version` int,
  `timestamp` timestamp,
  `changeset` bigint,
  `uid` bigint,
  `user` varchar(255)
);

And then executes LOAD DATA, then commits:

# finishing
if (options.engine == 'MyISAM'):
    cursor.execute('FLUSH TABLES')
else:
    db.commit()
 
cursor.close()
db.close()
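For reference, the core statement the script sends looks roughly like this (the exact file path is an assumption; no FIELDS clause is needed because tab-separated values are LOAD DATA's default):

-- single-transaction, single-threaded import of the OSM nodes dump
LOAD DATA INFILE '/tmp/nodes.csv' INTO TABLE nodes;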

General impressions and breaking bugs found

Bad bug :-(
Setting up the server for the first time now differs between MySQL and MariaDB. While MariaDB continues using the mysql_install_db script, MySQL, since 5.7, uses the mysqld server binary directly. Also, while I have to thank Oracle for focusing on security, if you are going to set up a server just for pure testing, the default account creation options can be a bit annoying. I say thanks because there is an additional --initialize-insecure option which simplifies one-time setups like this benchmark or a jenkins test, and which you probably should not be using on a proper production server. Maybe it could be helpful for automatic deployments, where accounts are taken care of by a script.

For MySQL, I found a problem in 8.0.0 where mysqld --initialize did not work with binary collations (nor did it start if I did the initialization with the default collation and then started the server with binary as the default collation). The error I got was:

mysqld: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘binary’ at line 1
[ERROR] Data Dictionary initialization failed.

This was clearly caused by the latest feature, the new InnoDB data dictionary. I reported this as Bug #83706 and apparently it will be fixed for 8.0.1.

Binary collation is very important for the Wikipedias, because some time ago (and even recently, with the latest extensions to the standard) full Unicode support was difficult to get- and we need to support 300 languages. Also, we have certain needs like custom collation support (e.g. several collations on the same table, numeric sorting). As a consequence, most text data is now stored as raw binary. For these tests, I had to disable the binary charset on some of the 8.0.0 runs, which I have noted appropriately below.

For MariaDB, I ran into an issue that had already been reported (MDEV-10540), in which mysql_install_db hung if the log_bin option was enabled. While this bug can be worked around by disabling the option just for the installation, the option is so common and important that I think the issue should be solved as soon as possible.

I also ran into MDEV-1124, MariaDB Server promoting an external business, a mistake which hopefully will be fixed by the release of 10.1.20.

In order to perform the tests, I had to destroy and recreate the datadir, starting from scratch on each run (rm + --initialize). This is also a testament to the fact that you no longer have binary compatibility between server vendors, due to the differences in the mysql schema, the performance_schema tables, the new data dictionary, etc.

Another thing to note is that, because of the latest security patches, you have to explicitly set secure_file_priv in order to use the LOAD DATA query. You probably do not want to do that on a production server; more details about the issue in CVE-2016-6662 and the other recently related issues.
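You can check what a given server currently allows with:

-- an empty value means any directory is allowed;
-- NULL means server-side LOAD DATA INFILE is disabled
SHOW GLOBAL VARIABLES LIKE 'secure_file_priv';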

Default config testing

While testing with the default config is the #1 sin one could commit while benchmarking (because you are testing the vendor defaults, not doing a real performance comparison), my test is silly and simple enough for this to be interesting and worth spending some time on. Long gone are the times in which MySQL's defaults were horrible, and Oracle seems very focused lately on setting up good defaults that are useful for the majority of people. But let's also be honest: these options (together with what packagers set up) are also the ones that 90% of the mysql users out there will use, because they are not running dedicated services that require a lot of tuning, nor do they have a dedicated DBA to do it.

So I went and set up mysql with just a bunch of administrative options. In particular, I would like to note that by default (there is a quick check after the list):

  • No binary log enabled in all versions
  • Default buffer pool size is 134217728 for all versions
  • innodb_log_file_size defaults to 50MB
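A quick way to verify those defaults on each build (buffer pool and log file sizes are reported in bytes):

-- compare the effective defaults between builds
SELECT @@innodb_buffer_pool_size, @@innodb_log_file_size, @@log_bin;

This is the administrative-only configuration used: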
[client]
port   = 3306
socket = /tmp/mysql.sock

[mysqld]

# administrative options
user       = mysql
socket     = /tmp/mysql.sock
port       = 3306
datadir    = /srv/sqldata
basedir    = /opt/mysql # mysql is a symbolic link to each basedir
tmpdir     = /srv/tmp
# prevent imports/exports to arbitrary dirs
secure_file_priv               = /tmp

Some interesting config differences between MySQL 5.6 and MariaDB 10.0 to take into account: performance_schema is disabled on MariaDB, and many buffers and caches have different sizes:

< # This is MariaDB 10.0
> # This is MySQL 5.6
25,26c5,7
< back_log	150
---
> back_log	80
61a43
> default_tmp_storage_engine	InnoDB
67a51,53
> end_markers_in_json	OFF
> enforce_gtid_consistency	OFF
> eq_range_index_dive_limit	10
101,103c82
< host_cache_size	128
---
> host_cache_size	279
208c170
< innodb_open_files	400
---
> innodb_open_files	2000
259,261c210
< join_buffer_size	131072
< join_buffer_space_limit	2097152
---
> join_buffer_size	262144
263c212
< key_buffer_size	134217728
---
> key_buffer_size	8388608
292a243
> master_info_repository	FILE
294c245
< max_allowed_packet	1048576
---
> max_allowed_packet	4194304
319d268
< mrr_buffer_size	262144
341,353c289,304
< optimizer_selectivity_sampling_limit	100
< optimizer_switch	index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
< optimizer_use_condition_selectivity	1
< performance_schema	OFF
---
> optimizer_switch	index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on
> performance_schema	ON
388,390c337,338
< query_cache_size	0
< query_cache_strip_comments	OFF
< query_cache_type	ON
---
> query_cache_size	1048576
> query_cache_type	OFF
399a349
> relay_log_info_repository	FILE
415,417c357,358
< secure_auth	OFF
---
> secure_auth	ON
423a371,373
> slave_allow_batching	OFF
> slave_checkpoint_group	512
> slave_checkpoint_period	300
425,426d374
< slave_ddl_exec_mode	IDEMPOTENT
< slave_domain_parallel_threads	0
431,432c379,381
< slave_parallel_max_queued	131072
< slave_parallel_threads	0
---
> slave_parallel_workers	0
> slave_pending_jobs_size_max	16777216
441c390
< sort_buffer_size	2097152
---
> sort_buffer_size	262144
447c396
< sql_mode	
---
> sql_mode	STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION
465,467c414,416
< sync_master_info	0
< sync_relay_log	0
< sync_relay_log_info	0
---
> sync_master_info	10000
> sync_relay_log	10000
> sync_relay_log_info	10000
469,471c418,421
< table_definition_cache	400
< table_open_cache	400
< thread_cache_size	0
---
> table_definition_cache	1400
> table_open_cache	2000
> table_open_cache_instances	1
> thread_cache_size	9
474,479c424
< thread_pool_size	8
< thread_pool_stall_limit	500
< thread_stack	294912
---
> thread_stack	262144

Some interesting config differences between MySQL 5.7 and MySQL 8.0: InnoDB removed innodb_file_format and innodb_large_prefix (deprecated in 5.7), only allowing the Antelope format for backwards compatibility (thanks Morgo and Jörg Brühe for the correction). Also, some extra variables related to the new data dictionary appear.

< # This is MySQL 5.7
> # This is MySQL 8.0
93c93
< ignore_db_dirs	
---
> information_schema_stats	CACHED
122d121
< innodb_checksums	ON
136,138d134
< innodb_file_format	Barracuda
< innodb_file_format_check	ON
< innodb_file_format_max	Barracuda
143c139
< innodb_flush_method	
---
> innodb_flush_method	fsync
163d158
< innodb_large_prefix	ON
165d159
< innodb_locks_unsafe_for_binlog	OFF
208d201
< innodb_stats_sample_pages	8
213d205
< innodb_support_xa	ON
331a324
> performance_schema_error_size	1043
348c341
< performance_schema_max_memory_classes	320
---
> performance_schema_max_memory_classes	350
354c347
< performance_schema_max_rwlock_classes	40
---
> performance_schema_max_rwlock_classes	50
360c353
< performance_schema_max_statement_classes	193
---
> performance_schema_max_statement_classes	201
370a364
> persisted_globals_load	ON
404a399
> schema_definition_cache	256
465a461
> stored_program_definition_cache	256
468d463
< sync_frm	ON
475a471
> tablespace_definition_cache	256

Some interesting config differences between MariaDB 10.1 and MariaDB 10.2: some extra features integrated from, or made compatible with, the XtraDB engine and more recent versions of MySQL; the buffer pool dump enabled by default; Barracuda set as the default file format; SQL strict mode; new checksum algorithms; deprecated variables deleted; options for CTEs; and server_id now defaulting to 1.

< # This is MariaDB 10.1
> # This is MariaDB 10.2
16c16
< aria_recover	NORMAL
---
> aria_recover_options	NORMAL
31c31
< binlog_checksum	NONE
---
> binlog_checksum	CRC32
47c47,48
---
> check_constraint_checks	ON
121c122
< innodb_adaptive_hash_index_partitions	1
---
> innodb_adaptive_hash_index_parts	8
123d123
< innodb_additional_mem_pool_size	8388608
136c136,137
< innodb_buffer_pool_dump_at_shutdown	OFF
---
> innodb_buffer_pool_chunk_size	134217728
> innodb_buffer_pool_dump_at_shutdown	ON
138c139
< innodb_buffer_pool_dump_pct	100
---
> innodb_buffer_pool_dump_pct	25
140c141
< innodb_buffer_pool_instances	8
---
> innodb_buffer_pool_instances	1
142c143
< innodb_buffer_pool_load_at_startup	OFF
---
> innodb_buffer_pool_load_at_startup	ON
144d144
< innodb_buffer_pool_populate	OFF
148c148
< innodb_checksum_algorithm	INNODB
---
> innodb_checksum_algorithm	crc32
150d149
< innodb_cleaner_lsn_age_factor	HIGH_CHECKPOINT
158d156
< innodb_corrupt_table_action	assert
161a160
> innodb_default_row_format	dynamic
171d169
< innodb_empty_free_list_algorithm	BACKOFF
177d174
< innodb_fake_changes	OFF
180c177
< innodb_file_format	Antelope
---
> innodb_file_format	Barracuda
182c179
< innodb_file_format_max	Antelope
---
> innodb_file_format_max	Barracuda
183a181
> innodb_fill_factor	100
187a186
> innodb_flush_sync	ON
192d190
< innodb_foreground_preflush	EXPONENTIAL_BACKOFF
210,212c208
< innodb_kill_idle_transaction	0
< innodb_large_prefix	OFF
< innodb_lock_schedule_algorithm	fcfs
---
> innodb_large_prefix	ON
214d209
< innodb_locking_fake_changes	ON
221c212
< innodb_log_checksum_algorithm	INNODB
---
> innodb_log_checksums	ON
225a217
> innodb_log_write_ahead_size	8192
227,228d218
< innodb_max_bitmap_file_size	104857600
< innodb_max_changed_pages	1000000
230c220
< innodb_max_dirty_pages_pct_lwm	0.001000
---
> innodb_max_dirty_pages_pct_lwm	0.000000
233c223
< innodb_mirrored_log_groups	1
---
> innodb_max_undo_log_size	1073741824
243a234
> innodb_page_cleaners	1
248c239,240
< innodb_purge_threads	1
---
> innodb_purge_rseg_truncate_frequency	128
> innodb_purge_threads	4
256d247
< innodb_sched_priority_cleaner	19
259,261d249
< innodb_show_locks_held	10
< innodb_show_verbose_locks	0
< innodb_simulate_comp_failures	0
275c263
< innodb_strict_mode	OFF
---
> innodb_strict_mode	ON
279a268
> innodb_temp_data_file_path	ibtmp1:12M:autoextend
283,284c272,273
< innodb_track_changed_pages	OFF
< innodb_undo_directory	.
---
> innodb_undo_directory	./
> innodb_undo_log_truncate	OFF
289d277
< innodb_use_global_flush_log_at_trx_commit	ON
292,293d279
< innodb_use_stacktrace	OFF
< innodb_use_sys_malloc	ON
352a339
> max_recursive_iterations	4294967295
388c375
< optimizer_switch	index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
---
> optimizer_switch	index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
413c400
< performance_schema_max_statement_classes	178
---
> performance_schema_max_statement_classes	184
465c452,456
< server_id	0
---
> server_id	1
479a471
> slave_parallel_workers	0
508a501
> standards_compliant_cte	ON
520c513,514
< thread_cache_size	0
---
> table_open_cache_instances	8
> thread_cache_size	151
525a520,521
> thread_pool_prio_kickup_timer	1000
> thread_pool_priority	auto
528c524
< thread_stack	295936
---
> thread_stack	297984

These are the results I got:

                                    5.6.34       5.7.16       8.0.0-dmr    10.0.28      10.1.18      10.2.2
avg run (seconds)                   193.378      215.2826667  186.081      194.032      195.0173333  198.0726667
median (seconds)                    194.017      215.425      187.855      197.746      199.459      197.354
rows inserted/s                     241708.6018  217115.1385  251186.9885  240893.9041  239676.7774  235979.6876
insert throughput compared to 5.6   100.00%      89.83%       103.92%      99.66%       99.16%       97.63%

If you prefer a graph:
Throughput result comparison, in rows inserted/second, for the LOAD data test with default configuration

Nothing surprising here: for MySQL, the 5.7 regression still shows, now only around 10%; I am not sure if that is because of the work already done to mitigate it on BUG#75981, or just the different hardware I used compared to my last test. The good news is that the regression seems solved for 8.0, getting even better results than 5.6. We will see if that holds when we normalize the configuration options.

Regarding MariaDB, 10.0 shows a similar performance to 5.6 (probably within the range of potential measurement errors). What is noticeable is a very small degradation for 10.1 and 10.2. Nothing that worries me: normally more features create a small overhead (the same has happened in MySQL in the past), which is in most cases worth paying, and many times not even noticeable under normal load. There is also still time for 10.2 to optimize potential problems.

WMF-like config

Let’s get a bit more serious: let’s test a configuration used in real life, and let’s uniformize configurations. Wikimedia core production databases use this configuration. There are several things, however, I have to comment on that will affect this test:

  • The tests here were not done on the same production machines; some of the configuration has been heavily tuned for large amounts of memory and a hardware RAID controller, neither of which was available for the test
  • Some of the configuration options are MariaDB 10-dependent, with our build. They do not work on MySQL, or other versions of MariaDB- in those cases those options were disabled. Some of them work and were kept, but they may have a negative impact in this context.
  • It is obviously not optimized for the specific LOAD DATA use case
  • WMF configuration is MariaDB 10-focused, so I expect MySQL server to not be properly tuned for it.
  • SSL was not used, probably not interesting for a single connection
  • Binary collation is used; as I commented before, it is the chosen charset/collation for mediawiki on WMF servers. As 8.0.0-dmr had the bug I mentioned, not allowing that collation, I had to use the default collation there. Beware, that could skew its results

This is the configuration used for this test:

[client]
port   = 3306
socket = /tmp/mysql.sock

[mysqld]

# administrative options
user       = mysql
socket     = /tmp/mysql.sock
port       = 3306
# MariaDB only
#extra-port = 3307
datadir    = /srv/sqldata
basedir    = /opt/mysql
tmpdir     = /srv/tmp
server_id  = 1

skip-external-locking
skip-name-resolve
temp-pool

# Always start in read_only mode.
# Master selection will be handled by orchestration.
read_only = 1

# prevent imports/exports to arbitrary dirs
secure_file_priv               = /tmp
# thread and connection handling
# Mariadb only
#thread_handling                = pool-of-threads
# Mariadb only
#thread_pool_stall_limit        = 100
# Mariadb only
#thread_pool_size               = 32
# Mariadb only
#thread_pool_max_threads        = 2000
max_connections                = 10000
back_log                       = 500
# MariaDB only
# extra_max_connections          = 10
max_connect_errors             = 1000000000
max_allowed_packet             = 32M
connect_timeout                = 3
query_cache_size               = 0
query_cache_type               = 0
event_scheduler                = 1
log-warnings                   = 0
thread_stack                   = 192K
thread_cache_size              = 300
interactive_timeout            = 28800
wait_timeout                   = 3600
transaction-isolation          = REPEATABLE-READ

# binary log an replication options
log-bin
log-slave-updates
sync-binlog                    = 1
binlog-cache-size              = 1M
max-binlog_size                = 1000M
binlog-format                  = STATEMENT
expire-logs-days               = 7
slave-transaction-retries      = 4294967295

# table limits
tmp-table-size                 = 64M
max-heap-table-size            = 64M
table-open-cache               = 50000
table-definition-cache         = 40000
open-files-limit               = 200000

# optimizer options
# MariaDB only
#sql-mode                       = IGNORE_BAD_TABLE_OPTIONS
optimizer_switch               = 'mrr=on,mrr_cost_based=on'
# MariaDB only
# ,mrr_sort_keys=on,optimize_join_buffer_size=on'
# MariaDB only
#use-stat-tables                = preferably

# charsets and collations
character_set_server           = binary
character_set_filesystem       = binary
collation_server               = binary

# InnoDB options
default-storage-engine         = InnoDB
innodb_file_per_table          = 1
innodb_buffer_pool_size        = 12G
innodb_log_file_size           = 2G
innodb_flush_log_at_trx_commit = 1
innodb_flush_method            = O_DIRECT
innodb_thread_concurrency      = 0
innodb_io_capacity             = 1000
# MariaDB only
# innodb_stats_sample_pages      = 16
innodb_stats_method            = nulls_unequal
# MariaDB only
#aria_pagecache_buffer_size     = 1G
# MariaDB only
#join_cache_level               = 8

# dump and load innodb buffer at start and stop
innodb_buffer_pool_load_at_startup  = 1
innodb_buffer_pool_dump_at_shutdown = 1

# profiling and metrics
# Enabling performance_schema (disabled by default in MariaDB10)
performance_schema                                     = 1
# downsizing performance schema memory usage: T99485
performance_schema_max_thread_instances                = 500
performance_schema_max_cond_instances                  = 1000
performance_schema_accounts_size                       = 300
performance_schema_hosts_size                          = 300
performance_schema_events_statements_history_size      = 10
performance_schema_events_statements_history_long_size = 1000
performance_schema_events_waits_history_size           = 10
performance_schema_events_waits_history_long_size      = 1000
performance_schema_events_stages_history_size          = 10
performance_schema_events_stages_history_long_size     = 1000
performance_schema_max_mutex_instances                 = 5000
performance_schema_max_rwlock_instances                = 2000
performance_schema_max_socket_instances                = 500
performance_schema_max_table_instances                 = 1000

[mysqldump]

quick
max_allowed_packet = 32M

As you can see, the biggest changes were:

  • Buffer pool increased to 12GB, so that the whole dataset can fit into memory (a quick way to check this is sketched after this list)
  • Log file size increased to 2GB, so that the whole load can also fit into the transaction log
  • Performance schema enabled by default; this is very important for our query monitoring
  • No pool of threads (not available on MySQL)
  • Very large open_files_limit and max_connections, which requires downsizing the P_S parameters. (Update: I am being told that this is not necessary on 5.7 and 8.0, as it auto-resizes. I still tune it, as I have to do it for MariaDB 10.0)
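To verify that claim about the buffer pool, you can compare the table size against innodb_buffer_pool_size; a minimal sketch, assuming the imported table is test.nodes (the schema name here is my assumption):

mysql> SELECT table_name,
    ->        ROUND((data_length + index_length) / POW(1024, 3), 2) AS size_gb
    ->   FROM information_schema.tables
    ->  WHERE table_schema = 'test' AND table_name = 'nodes';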

These are the results:

                                    5.6.34       5.7.16       8.0.0-dmr        10.0.28      10.1.18      10.2.2
avg (seconds)                       212.501      234.3543333  220.1906667 (*)  185.5396667  187.598      210.6946667
median (seconds)                    212.48       234.608      220.123 (*)      188.16       191.393      209.287
rows inserted/s                     219957.2049  199446.391   212275.6914 (*)  251919.8554  249155.7799  221842.9481
insert throughput compared to 5.6   100.00%      90.68%       96.51% (*)       114.53%      113.27%      100.86%

(*) Using the UTF-8 collation because of an 8.0.0 bug, which may impact the results

Again, if you prefer some graph bars:
Throughput comparison for the LOAD DATA test on a WMF-like configuration

The main conclusion I can draw from here is that my current performance configuration is not ready for MySQL or for MariaDB 10.2. While in the case of 10.0 and 10.1 the performance has gotten around 13-14% better, for MariaDB 10.2 and MySQL it has gotten worse, not better.

One thing that is not in the above summary is that MariaDB’s results, unlike MySQL’s, even if better on average, show a higher amount of execution time variability. All tests done on the 3 MySQL versions ended within a second of each other; MariaDB results end up spread over a 10-second range. Throughput is not the only variable to take into account when doing performance testing: stability of timings can be important in many cases to provide a stable response time.

The other thing to comment on: do these results show a more significant regression for 10.2? Let’s do another test to confirm.

Insecure config

Let’s try to test a configuration that optimizes for LOAD DATA speed, without regard for safety, disabling features that you normally would not want to disable in production: the doublewrite buffer, InnoDB checksums, etc.

This was the configuration used:

[client]
port   = 3306
socket = /tmp/mysql.sock

[mysqld]

# administrative options
user       = mysql
socket     = /tmp/mysql.sock
port       = 3306
datadir    = /srv/sqldata
basedir    = /opt/mysql
tmpdir     = /srv/tmp

#binlog disabled

# prevent imports/exports to arbitrary dirs
secure_file_priv               = /tmp
max_connections                = 500

query_cache_size               = 0
query_cache_type               = 0

# optimizer options
sql-mode                       = ''

# charsets and collations
#character_set_server           = binary
#character_set_filesystem       = binary
#collation_server               = binary

# InnoDB options
default-storage-engine         = InnoDB
innodb_file_per_table          = 1
innodb_buffer_pool_size        = 12G
innodb_log_file_size           = 2G
innodb_flush_log_at_trx_commit = 1

# profiling and metrics
performance_schema             = 0
skip-innodb-doublewrite
innodb_flush_neighbors         = 0
innodb_checksum_algorithm      = none

And again, the results:

                                    5.6.34      5.7.16       8.0.0-dmr    10.0.28      10.1.18      10.2.2
avg (seconds)                       146.038     167.482      172.653      147.721      152.1256667  155.683
median (seconds)                    146.016     167.054      171.364      147.889      151.937      155.414
rows inserted/s                     320061.395  279081.4894  270722.9298  316414.9038  307253.3848  300232.6908
insert throughput compared to 5.6   100.00%     87.20%       84.58%       98.86%       96.00%       93.80%

Graphically:
Throughput results in rows inserted per second for LOAD DATA test with the insecure configuration

The trend is similar to the previous results, with a 20-25% improvement in performance compared to the last test case, except for one version: relative to 5.6, 8.0 performs worse here than it did with the previous configuration. This does not make sense to me, but the results, like in the previous case, are consistent and repeatable. Is there a regression hidden here?

Regarding MariaDB, this configuration seems more stable (load-wise) for it, with less variability between runs. There is still a downwards trend for the newer versions, though.

One more result -SELECT count(*)

While checking that all the records had been inserted correctly, and that no rows were lost in the process, I ran into a curious case of differing count/full-scan speeds. When doing:

SELECT count(*) FROM nodes;

I got the following timings on console:

                    5.6.34    5.7.16    8.0.0-dmr  10.0.28          10.1.18         10.2.2
run time (seconds)  8.31 sec  4.81 sec  5.02 sec   1 min 47.88 sec  2 min 5.08 sec  6.61 sec

For those still obsessed with getting a nice bar chart every time:

Execution time, in seconds, of SELECT count(*) FROM nodes;

I knew there were some optimizations to SELECT count(*) that made it faster for InnoDB (not instant, like MyISAM), but I did not know they were enabled as early as 5.6. It seems MariaDB only merged that feature (or a similar one doing the same) starting with 10.2.
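A quick way to peek at what the server is doing for this query is a plain EXPLAIN; a minimal sketch (the comments state the usual expectation, assuming nodes has at least one secondary index):

mysql> EXPLAIN SELECT count(*) FROM nodes\G
-- On InnoDB, expect "type: index" over the smallest available secondary index;
-- on MyISAM you would see "Select tables optimized away" instead.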

Conclusions

Providing data without an explanation is a bad practice; however, I believe this will help as a starting point to debug (by me, or by anyone else who wants to reproduce the results- I think I have given enough information to do that) and see which code points are blocking this memory-only, single-thread-concurrency-specific test. Again, remember this is a very specific test and not a proper benchmark to base decisions on. The intended goals were to: 1) identify regressions, 2) identify bugs, 3) get a first quick look at the options changed, and 4) start working with the newest versions; all of them were accomplished.

As a followup, I would like to explore in more depth those potentially observed regressions on 8.0 and 10.2, and see whether it is just a case of adapting the configuration or there is an underlying cause slowing them down. If they happen to be bugs, there should be plenty of time to fix them before a stable release from either vendor. The other thing I would like to analyze next is this very same load type, but with InnoDB compression, comparing both the performance and the resulting file sizes.

I hope this has been helpful for someone else other than me, too. I have probably made some mistakes along the way, please tell me so on a comment or @jynus.

Personal Summary of the Percona Live Amsterdam 2015 Conference

Last week, 21-23 September, the European MySQL Conference took place, or the “Data Performance Conference”, as this year’s subtitle was “MySQL. NoSQL. Data in the cloud.”. This year it changed its location from London to Amsterdam and, as most people I talked to agreed, the change was for the better. As every year, Percona was the company organizing it, but it had the participation of all the major players in the open source MySQL/MongoDB/Cloud data world. Special mention goes to Booking.com, which had more talks than usual (despite being one of the largest MySQL users out there), and which hosted the community dinner (probably also one of the main reasons for moving to Amsterdam, as their main HQ is in that city).

While MySQL was still the king in terms of interest, I saw a growing interest in MongoDB, both from Percona (now owning TokuDB, and releasing its own Percona Server version of Mongo) and from Facebook, promoting its RocksDB project (its much-improved fork of LevelDB, with both Mongo and MySQL frontends). Both are interesting offerings for write-heavy workloads, with better compression ratios than other engines, but both still lack some features to be ready for everybody’s production usage. However, I have high expectations for both projects.

There were also many talks about MySQL high availability models, with different points of view. While some are pushing for the upstream GTID-based setup, others opt for alternative GTID implementations; some advocate the usage of dedicated binlog servers, while others take a hybrid approach.

Obviously, the MySQL folks from Oracle were present and gave many first-party sessions about the soon-to-be-released 5.7 and its new features regarding query optimization, performance_schema, internals, etc. But the best part of the conference was, without doubt, being able to interact with the developers and suggest features for 5.8, face-to-face. I think many people loved that, and it should be a regular thing from now on.

In a more personal way, for me it was great to devirtualize some people for the first time, plus catch up with others I already knew, both professionally and personally. I had the chance to talk to people from all the previously mentioned companies (Percona, Facebook/RocksDB, Booking, Oracle) and also representatives, developers and DBAs from Google, Vividcortex, Solarwind, MariaDB, VMWare, Galera, Pythian, Github, freelancers and more.

I was a speaker in one tutorial and one session: “Query optimization: From 0 to 10 (and up to 5.7)” and “MySQL at Wikimedia: How we do relational data at the Wikimedia Foundation”. I had a full room for the tutorial, and very good feedback on the session, with many people wanting to either contribute or apply for a job with us.

I’ve seen that many of the other speakers have not yet uploaded their decks of slides, and I want to encourage them to do it (sadly, there were so many talks at the same time that I could not attend all of them). You can see and download mine at Slideshare, too: Query Optimization Tutorial / MySQL at Wikipedia. Both are released under Creative Commons Attribution-ShareAlike 4.0 licenses, and you can even get the source file of the 238-slide tutorial, developed over almost 2 years while I was a freelance DBA, and improve it, remix it or modify it as you want, as long as you keep it equally free.

Additionally, a reminder that if someone wants to contribute to our (Wikimedia’s) infrastructure, we are glad to accept patches from the community on our puppet repository and also have several remote positions open, including a Database Administrator profile.

See you next year!

DBAHire.com sponsors the “IX Concurso Universitario de Software Libre”

CUSL group photo
Photo: Ana Rey CC-BY-SA-2.0

The “Concurso Universitario de Software Libre” (CUSL, Free Software University Contest) is an initiative similar to the Google Summer of Code, but specifically aimed at Spanish university and high school students, and organized by a group of Free Software University Offices.

As part of the effort to encourage the growth of the free software ecosystem, and to introduce young talent to the development of free applications and technologies, DBAHire.com has just become a silver sponsor of the competition, providing resources for the prizes, and travel and accommodation for the students.

The final phase of the competition will take place on 7-8 May in Zaragoza, and our MySQL consultant Jaime Crespo will deliver a short speech in Spanish that Friday, titled “Free Software: Is it profitable?”.

Stop using FLUSH PRIVILEGES

Mermaids have the same probability of fixing your permission problems, but people continue believing in the FLUSH PRIVILEGES myth.

I see the usage of FLUSH PRIVILEGES suggested every time someone writes a tutorial or a solution to a problem regarding creating a new account or providing different privileges. For example, the top post on /r/mysql as of the writing of these lines, “MySQL: The user specified as a definer does not exist (error 1449) - Solutions”, has multiple guilty cases of this (Update: the user has corrected those lines after I posted this article).

It is not my intention to bash that post, but I have seen that mistake committed many, many times. Even if you go to the reference manual for the GRANT command, you will see a comment at the bottom -from a third-party user- using GRANT and then FLUSH PRIVILEGES.

Why should I bother? Is executing FLUSH PRIVILEGES an issue? Why is everybody doing it? The reason that command exists is that -in order to improve performance- MySQL maintains an in-memory copy of the GRANT tables, so it does not need to read them from disk on every connection, every default database change and every query sent to the server. The mentioned command forces a reload of this cache by reading the tables directly from disk (or the filesystem cache), as the MySQL reference manual itself clearly indicates (it even has its own section: When Privilege Changes Take Effect). However, its execution is unnecessary in most practical cases because:

If you modify the grant tables indirectly using account-management statements such as GRANT, REVOKE, SET PASSWORD, or RENAME USER, the server notices these changes and loads the grant tables into memory again immediately.

The only reason to perform that reload operation manually is when:

you modify the grant tables directly using statements such as INSERT, UPDATE, or DELETE

For most operations, like creating a user, changing its privileges or changing its password, you will want to use the high-level statements. Not only are they easier to use and compatible with a larger number of MySQL versions, but they will also prevent you from making mistakes (of course, remember to set up the NO_AUTO_CREATE_USER sql mode). They even usually work nicely in a MyISAM-hostile environment like a Galera cluster. There are certainly reasons to edit the tables manually: as an administrator, you may want to tweak the privileges in a special way, or import the mysql.* tables from elsewhere, and in those cases running FLUSH PRIVILEGES is mandatory. Please note that, as the manual page says, in most cases (e.g. global privileges) changing a user’s grants will only affect new connections, and certainly never ongoing queries, as privileges are checked at the beginning of query processing- read the manual page for details.
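A minimal sketch contrasting both paths (the account, password and schema names here are made up for illustration):

-- Account-management statements take effect immediately; no reload needed:
CREATE USER 'app'@'localhost' IDENTIFIED BY 'a_strong_password';
GRANT SELECT, INSERT ON appdb.* TO 'app'@'localhost';

-- Only direct edits of the grant tables require a manual reload:
UPDATE mysql.user SET Host = '10.0.0.%' WHERE User = 'app' AND Host = 'localhost';
FLUSH PRIVILEGES;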

So, again, why my crusade against the overuse of FLUSH PRIVILEGES? After all, worst case scenario, the same privileges will just be loaded again! It is not a question of performance issues, although in an extreme case it certainly can be one. Check, for example, the following script, which executes 10 000 CREATE USER statements (this can only be done in a single thread, as the grant tables are still in MyISAM format, even in 5.7.6):

# Simplified from the full test script; TIMES and DEFAULT_PASSWORD are
# constants assumed to be defined elsewhere in it.
import mysql.connector

TIMES = 10000
DEFAULT_PASSWORD = 'msandbox'

def execute_test(port, drop, flush, thread):
   db = mysql.connector.connect(host="localhost", port=port,
                                user="msandbox", passwd="msandbox",
                                database="test")
   for i in range(0, TIMES):
      cursor = db.cursor()
      if drop:
         # cleanup mode: drop the users created by a previous run
         sql = "DROP USER 'test-" + str(thread) + "-" + str(i) + "'@'localhost'"
      else:
         sql = ("CREATE USER 'test-" + str(thread) + "-" + str(i) +
                "'@'localhost' IDENTIFIED BY '" + DEFAULT_PASSWORD + "'")
      cursor.execute(sql)
      cursor.close()
      db.commit()
      if flush:
         # the extra, unnecessary round trip being measured
         cursor = db.cursor()
         cursor.execute("FLUSH PRIVILEGES")
         cursor.close()
         db.commit()
   db.close()

The timings for both types of execution are:

$ time ./test_flush.py
Not flushing
Executing the command 10000 times

real    0m15.508s
user    0m0.827s
sys     0m0.323s

$ ./test_flush.py -d
Not flushing
Dropping users
Executing the command 10000 times

$ time ./test_flush.py -f
Flushing after each create
Executing the command 10000 times

real    2m7.041s
user    0m2.482s
sys     0m0.771s

$ ./test_flush.py -d
Not flushing
Dropping users
Executing the command 10000 times

We can see that using FLUSH PRIVILEGES is 8x slower than not using it. Again, I want to stress that performance is not the main issue here, as most people would execute it only once, at the end of each block of commands, so it would not be a huge overhead. Even if there is some extra read IO load, we must assume that every round trip to the database, and every commit, takes some server resources- and that can be extrapolated to any command. Additionally, concurrency is not a typical problem for MySQL account creation, as the mysql.user table is not usually (or should not be) very dynamic.

The main issue I have with the overuse of FLUSH PRIVILEGES is that people execute it without really understanding why they do it and what the command actually does. Every time a person has a problem with the MySQL privilege system, the first piece of advice given is to execute this command “just in case”. Check, for example, answers on dba.stackexchange like this, this and this (which I have selected among many others), where the original user was not manually altering the mysql.* tables. The issue is that in most cases this command does nothing, and the real problem lies in a poor understanding of MySQL’s permission model. As the saying goes, when you have a hammer, every problem looks like a nail. People read that this is a proper way to solve permission-related problems, and they pass the “knowledge” on, creating basically the MySQL equivalent of an urban myth.

So, the next time you encounter a problem with a user not being able to log in, or with privileges not applying to a user, remember that there are many other sources of issues, such as: using old_passwords; using a different authentication method than the native passwords; not having the actual privileges or the WITH GRANT OPTION property required to apply them; the server not identifying you with the same user or host as the one you think you are connecting with; using skip-name-resolve, so DNS entries are ignored; having to wait for a new connection for the changes to take effect; … and many other issues that come with authorization and authentication. MySQL’s grant system is not precisely obvious and perfect (hello, granting permissions on databases that do not exist?), but taking 5 minutes to read the extensive manual on privileges can save you many headaches in the future. TL;DR: RTFM.
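For instance, a cheap first check that often explains “phantom” permission problems is comparing who you claim to be with the account the server actually matched you to:

mysql> SELECT USER(), CURRENT_USER();  -- client-supplied identity vs. matched account
mysql> SHOW GRANTS FOR CURRENT_USER(); -- the privileges that actually apply to this session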

For those of you who already know when to use FLUSH PRIVILEGES and when not to: please, next time you find someone overusing it, educate that user on best practices, so people no longer rely on magic and urban myths to solve problems; go to reddit/stackoverflow/your favorite social network/etc. and upvote good practices and comment on bad practices. Today it is FLUSH PRIVILEGES; tomorrow it could be “add OPTIMIZE TABLE in a cron job every 5 minutes for your InnoDB tables” (and yes, that last one was actually found in the wild).

EXPLAIN minor wishlist

While we always want better performance and more and larger features for MySQL, those cannot just “magically appear” from one version to another; they require deep architecture changes and lots of lines of code. However, there are sometimes smaller features and fixes that could be implemented by an intern or an external contributor, mainly at the SQL layer, and that could make the MySQL ecosystem friendlier to newbies and non-experts. Making a piece of software easier to use is sometimes overlooked, but it is incredibly important- not everybody using MySQL is a DBA, and the more people adopting it, the more people will be able to make a living from it, both upstream and as third-party providers.

Here is my own personal list of fixes for EXPLAIN messages. If you are an experienced MySQL user, you are probably aware of their meaning, but that does not solve the problem for beginners. The reason I am writing a blog post is to gather opinions: whether they seem important to you or not, and whether my way of solving them seems reasonable, so that we can submit them as feature requests.

EXPLAIN messages

As a MySQL instructor, I see the following case happen a lot with new students. You start with a command like this:

mysql> EXPLAIN SELECT b, c FROM test\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: index
possible_keys: NULL
          key: b_c
      key_len: 10
          ref: NULL
         rows: 4
        Extra: Using index
1 row in set (0.00 sec)

So, “Using index” means that an index is being used, right? No: in this case, type: index is telling us that an index is being used for scanning or accessing the rows (because it is not type: ALL- although we could also get a full row scan that uses an index for ordering or grouping). The Extra: Using index indicates that the index is also used for retrieving the data, without actually needing to read the whole row. This is, as far as I know, commonly referred to as a covering index. And that is exactly what I would like to see:

mysql> EXPLAIN SELECT b, c FROM test\G -- edited output
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: index
possible_keys: NULL
          key: b_c
      key_len: 10
          ref: NULL
         rows: 4
        Extra: Using covering index
1 row in set (0.00 sec)

or maybe:

        Extra: Covering index

Another common misunderstanding: Using filesort:

mysql> EXPLAIN SELECT * FROM test ORDER BY length(b)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra: Using filesort
1 row in set (0.00 sec)

At this level, I do not care that filesort is being used as the algorithm; and, if I am correct, since 5.6 a priority queue can also be used as the sorting algorithm when the number of items is small. Additionally, the “file” in the word filesort can lead to the confusion that this requires a temporary table on disk. I do not have a perfect alternative (please provide feedback), but maybe something like the following would be clearer:

mysql> EXPLAIN SELECT * FROM test ORDER BY length(b)\G  -- edited output
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra: Additional sort phase
1 row in set (0.00 sec)

or maybe:

        Extra: Not using index for order-by
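As an aside, you can check which sort algorithm was actually chosen (including the priority queue variant used with small LIMITs) through the optimizer trace; a minimal sketch for 5.6 and later:

mysql> SET optimizer_trace = 'enabled=on';
mysql> SELECT * FROM test ORDER BY length(b) LIMIT 3;
mysql> SELECT trace FROM information_schema.optimizer_trace\G
-- Look for the "filesort_priority_queue_optimization" section in the JSON output.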

Another example would be:

mysql> EXPLAIN SELECT a FROM test WHERE b > 3 and c = 3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: range
possible_keys: b_c
          key: b_c
      key_len: 5
          ref: NULL
         rows: 1
        Extra: Using index condition
1 row in set (0.00 sec)

I understand that the developers did not want to confuse us with NDB’s pushed condition, but this output is quite misleading, too. It literally reads as “the index condition is being used”, instead of “index condition pushdown (ICP) is being used”. What about:

mysql> EXPLAIN SELECT a FROM test WHERE b > 3 and c = 3\G -- edited output
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test
         type: range
possible_keys: b_c
          key: b_c
      key_len: 5
          ref: NULL
         rows: 1
        Extra: Using index condition pushdown
1 row in set (0.00 sec)
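Incidentally, a way to convince yourself that this Extra note really refers to ICP is to disable the optimization and re-run the EXPLAIN; a small sketch:

mysql> SET optimizer_switch = 'index_condition_pushdown=off';
mysql> EXPLAIN SELECT a FROM test WHERE b > 3 and c = 3\G
-- "Using index condition" should disappear from Extra once ICP is disabled.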

There are many other expressions, but those are the most annoying to me in terms of students’ confusion.

Would you agree with me? Would these changes break applications that parse EXPLAIN output? What other small things would you change in MySQL output or error messages? I would especially like to hear from MySQL beginners and people coming from other databases, as the more we have used it, the more accustomed we get to MySQLisms.

Regarding MySQL 5.6 temporary tables format

The default_tmp_storage_engine variable was introduced in 5.6.3, allowing the configuration of the default engine for temporary tables. This seems to be a step, as I commented before, towards making MyISAM an optional engine. In 5.7, a separate tablespace is being created to hold those tables in order to reduce their performance penalty (those tables do not need to be redone if the server crashes, so extra writes are avoided).

However, I have seen many people assume that, because default_tmp_storage_engine has the value “InnoDB”, all temporary tables are created in InnoDB format in 5.6. This is not true: first, because implicit temporary tables are still created in memory using the MEMORY engine (sometimes called the HEAP engine), while MyISAM is used for the on-disk ones. If you do not trust the reference manual on this, here is a quick test to check it:

mysql> SELECT version();
+------------+
| version()  |
+------------+
| 5.6.23-log |
+------------+
1 row in set (0.00 sec)

mysql> SHOW GLOBAL VARIABLES like 'default%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| default_storage_engine     | InnoDB |
| default_tmp_storage_engine | InnoDB |
| default_week_format        | 0      |
+----------------------------+--------+
3 rows in set (0.00 sec)

mysql> SHOW GLOBAL VARIABLES like 'tmpdir';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| tmpdir        | /tmp  |
+---------------+-------+
1 row in set (0.00 sec)

mysql> CREATE TABLE test (id serial, a text);
Query OK, 0 rows affected (0.10 sec)

mysql> insert into test (a) values ('a');
Query OK, 1 row affected (0.06 sec)

mysql> insert into test (a) values ('aa');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test (a) values ('aaa');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT *, sleep(10) FROM test ORDER BY rand();
...

[ec2-user@jynus_com tmp]$ ls -la
total 24
drwxrwxrwt   5 root     root     4096 Feb 24 11:55 .
dr-xr-xr-x  23 root     root     4096 Jan 28 14:09 ..
drwxrwxrwt   2 root     root     4096 Jan 28 14:09 .ICE-unix
-rw-rw----   1 mysql    mysql       0 Feb 24 11:55 #sql_7bbd_0.MYD
-rw-rw----   1 mysql    mysql    1024 Feb 24 11:55 #sql_7bbd_0.MYI
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:41 ssh-5ZGoXWFwtQ
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:43 ssh-w9IkW0SvYo

...
+----+------+-----------+
| id | a    | sleep(10) |
+----+------+-----------+
|  1 | a    |         0 |
|  2 | aa   |         0 |
|  3 | aaa  |         0 |
+----+------+-----------+
3 rows in set (30.00 sec)

The only thing I have done above is force the creation of the temporary table on disk, by adding a TEXT field (incompatible with the MEMORY engine, so the table has to be created on disk) and using sleep() so that we have enough time to check the filesystem. You can see in the output of ls the .MYD and .MYI files particular to the MyISAM engine. That last step would be unnecessary if we just used PERFORMANCE_SCHEMA to check the file waits/io, as sketched below.
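A minimal sketch of that PERFORMANCE_SCHEMA alternative (assuming P_S is enabled and this is run while the temporary table still exists):

mysql> SELECT file_name, event_name
    ->   FROM performance_schema.file_instances
    ->  WHERE file_name LIKE '%#sql%';
-- MyISAM temporary files show up with wait/io/file/myisam/* instrument names.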

A second, and more obvious, reason why it is wrong to think that all temporary tables are created in InnoDB format is that explicit temporary tables can still be created with a different engine using the ENGINE keyword:

mysql> CREATE TEMPORARY TABLE test (i serial) ENGINE=MyISAM;
Query OK, 0 rows affected (0.00 sec)

[ec2-user@jynus_com tmp]$ ls -la
total 36
drwxrwxrwt   5 root     root     4096 Feb 24 12:16 .
dr-xr-xr-x  23 root     root     4096 Jan 28 14:09 ..
drwxrwxrwt   2 root     root     4096 Jan 28 14:09 .ICE-unix
-rw-rw----   1 mysql    mysql    8554 Feb 24 12:12 #sql7bbd_36a3_0.frm
-rw-rw----   1 mysql    mysql       0 Feb 24 12:12 #sql7bbd_36a3_0.MYD
-rw-rw----   1 mysql    mysql    1024 Feb 24 12:12 #sql7bbd_36a3_0.MYI
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:41 ssh-5ZGoXWFwtQ
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:43 ssh-w9IkW0SvYo

mysql> DROP TEMPORARY TABLE test;
Query OK, 0 rows affected (0.00 sec)

[ec2-user@jynus_com tmp]$ ls -la
total 20
drwxrwxrwt   5 root     root     4096 Feb 24 12:17 .
dr-xr-xr-x  23 root     root     4096 Jan 28 14:09 ..
drwxrwxrwt   2 root     root     4096 Jan 28 14:09 .ICE-unix
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:41 ssh-5ZGoXWFwtQ
drwx------   2 ec2-user ec2-user 4096 Feb 24 11:43 ssh-w9IkW0SvYo

Will this change in the future? 5.7.5 continues to have the same behavior as 5.6. However, as Stewart pointed out some time ago, the performance optimizations in 5.7 make some uses of MEMORY and MyISAM obsolete, so I will not be surprised if that dependency, together with the MyISAM grant tables, is removed in the future.

Update: I have been told by email by Morgan that the yet-to-be-released (at the time of this writing) 5.7.6 will finally change the default behavior to full InnoDB for implicit temporary tables, too, as seen in the release notes:

InnoDB: The default setting for the internal_tmp_disk_storage_engine option, which defines the storage engine the server uses for on-disk internal temporary tables (see How MySQL Uses Internal Temporary Tables), is now INNODB. With this change, the Optimizer uses the InnoDB storage engine instead of MyISAM for internal temporary tables.

internal_tmp_disk_storage_engine was introduced in 5.7.5, but its default value then was MYISAM.
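In other words, on 5.7.5 you would still see something like this (illustrative output, per the default documented above; on 5.7.6 and later the value should be InnoDB):

mysql> SHOW GLOBAL VARIABLES LIKE 'internal_tmp_disk_storage_engine';
+----------------------------------+--------+
| Variable_name                    | Value  |
+----------------------------------+--------+
| internal_tmp_disk_storage_engine | MYISAM |
+----------------------------------+--------+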

This is in order to take advantage of the in-memory performance of InnoDB for variable-length fields, which I am personally 100% for. Thank you, Morgan, for the extra information!