7 keys to better MySQL performanceOn October 25, 2017 by Zander
One of the ways we measure applications is through performance. One of the metrics for application performance is user experience, which generally translates to “did the user need to wait longer than a reasonable amount of time to get what they wanted.”
This metric can mean different things in different scenarios. For a mobile shopping app, response times can’t be more than a couple of seconds. For an employee’s HR page, responses might be allowed to take a few seconds longer.
We have a lot of research on how performance impacts user behavior:
Whatever the standard, it is essential to maintain good performance for applications. Otherwise, users will complain (or worse, go to a different application). One of the factors that affects application performance is database performance. The interaction between applications, websites, and databases is critical in establishing the level of application performance.
A central component of this interaction is how applications query the database and how the database responds to requests. By any measure, MySQL is one of the most popular database management systems. More enterprises are shifting to MySQL (and other open source databases) as a database solution in their production environments.
There are many methods of configuring MySQL that can help ensure that your database responds to queries quickly, and with a minimum amount of application performance degradation.
The following are some essential tips to help you optimize the performance of your MySQL database.
MySQL optimization key #1: Learn how to use
The two most important decisions you make with any database are designing how relationships between application entities are mapped to tables (the database schema) and designing how applications get the data they need in the format they need it (queries).
Complicated applications can have complicated schemas and queries. If you’re going to get the performance and scale your applications require, you can’t just count on intuition to understand how queries will be executed.
Instead of guessing and hoping, you should learn how to use the
EXPLAIN command. This command shows you how a query will be executed, and gives you insight into both what performance you can expect and how the query will scale with changing data size.
There are a number of tools—such as MySQL Workbench—that can visualize the
EXPLAIN output for you, but you still need to understand the basics to make sense of it.
There are two different formats in which the
EXPLAIN command provides output: the old fashioned table format, and a more modern, structured JSON document that provides significantly more detail (shown below):
mysql> explain format=json select avg(k) from sbtest1 where id between 1000 and 2000 \G
*************************** 1. row ***************************
“attached_condition”: “(`sbtest`.`sbtest1`.`id` between 1000 and 2000)”
One component you should look at is “query cost.” Query cost refers to how expensive MySQL considers this particular query in terms of the overall cost of query execution, and is based on many different factors.
Simple queries generally have query cost of less than 1,000. Queries with a cost between 1,000 and 100,000 are considered medium-cost queries, and generally are fast if you’re only running hundreds of such queries per second (not tens of thousands).
Queries with a cost of more than 100,000 are expensive queries. Often these queries will still run fast when you’re a single user on the system, but you should think carefully about how frequently you’re using such queries in your interactive applications (especially as the number of users grows).
Of course these are ballpark performance numbers, but they demonstrate the general principle. Your system might handle query workloads better or worse, depending on its architecture and configuration.
Chief among the factors that determine query cost is whether the query is using indexes correctly. The
EXPLAIN command can tell you if a query isn’t using indexes (usually because of how the indexes are created in the database, or how the query itself is engineered). This is why it is so important to learn to use
MySQL optimization key #2: Create the right indexes
An index improves query performance by reducing the amount of data in the database that queries must scan. Indexes in MySQL are used to speed up access in the database and help enforce database constraints (such as
Database indexes are much like book indexes. They are kept in their own location, and they contain information already in the main database. They are a reference method or a map to where data is located. Indexes do not change any of the data in a database. They simply point to the location of the data.
There are no indexes that are always right for any workload. You should always look at indexes in the context of the queries that the system is running.
Well-indexed databases not only run faster, but even a single missing index can slow a database to a crawl. Use
EXPLAIN (as recommended previously) to find missing indexes and add them. But be careful: Don’t add indexes you don’t need! Unnecessary indexes slow databases down (check out my presentation on MySQL indexing best practices).
MySQL optimization key #3: No defaults!
Like any software, MySQL has many configurable settings that can be used to modify behavior (and ultimately, performance). And like any software, many of these configurable settings are ignored by administrators and end up being used in their default mode.
To get the best performance from MySQL, it is important to understand the configurable MySQL settings and—more important—set them to work best for your database environment.
By default, MySQL comes tuned for a small-scale development installation, not for production scale. You typically want to configure MySQL to use all of the memory resources available, as well as to allow the number of connections that your application requires.
Here are three MySQL performance tuning settings that you should always examine closely:
innodb_buffer_pool_size: The buffer pool is where data and indexes are cached. This is the main reason for using a system with a large amount of RAM as your database server. If you’re only running the InnoDB storage engine, you typically allocate around 80 percent of your memory for the buffer pool. If you are running very complicated queries, or you have very large numbers of concurrent database connections, or you have very large numbers of tables, you might need to take this value down a notch to allocate more memory for other purposes.
As you set the InnoDB buffer pool size, you need to make sure you don’t set it too large or it will cause swapping. This absolutely kills your database performance. An easy way to check is to look at Swapping Activity in the System Overview graph in Percona Monitoring and Management:
As this graph shows, some swapping is fine every so often. If, however, you see sustained swapping activity of 1MB per second or more, you will need to reduce your buffer pool size (or other memory uses).
If you don’t get the value for
innodb_buffer_pool_size correctly on the first go, don’t worry. Starting with MySQL 5.7 you can change the size of the InnoDB buffer pool dynamically, without restarting the database server.
innodb_log_file_size: This is the size of a single InnoDB log file. By default, InnoDB uses two values so that you can double this number to get the size of the circular redo log space InnoDB uses to make sure your transactions are durable. This also optimizes applying changes to the database. Setting
innodb_log_file_size is a question of trade-offs. The larger the redo space you allocate, the better the performance you will achieve for a write-intensive workload, but the longer the time for crash recovery if your system suffers power loss or other problems.
How do you know if your MySQL performance is limited by your current InnoDB log file size? You can tell by looking at how much of the usable redo log space is actually used. The easiest way is to look at the Percona Monitoring and Management InnoDB Metrics dashboard. In the graph below, the InnoDB log file size is not large enough, as the space used pushes very close to how much usable redo log space is available (indicated by the red line). Your log file size should be at least 20 percent greater than the amount of space used to keep your system performing optimally.
max_connections: Large-scale applications often require much more than the default number of connections. Unlike other variables, if you don’t don’t set this correctly you won’t have performance issues (per se). Instead, if the number of connections is not sufficient for your application needs, your application simply won’t be able to connect to the database (which looks like downtime to your users). Getting this variable right is important.
It can be difficult to know how many connections you need for complex applications with many components running on multiple servers. Fortunately, MySQL makes it very easy to see how many connections are used at peak operation. Typically you want to ensure there is at least a 30 percent gap between the maximum number of connections your application uses and the maximum number of connections available. An easy way to view these numbers is to use the MySQL Connections Graph in the MySQL Overview dashboard in Percona Monitoring and Management. The graph below shows a healthy system, where there is a good number of additional connections available.
One thing to keep in mind is that if your database runs slowly, applications often create an excessive number of connections. In such cases, you should work on the database performance problem rather than simply allowing more connections. More connections can make the underlying performance problem worse.
(Note: When you set the
max_connections variable significantly higher than the default value, you often need to consider increasing other parameters such as the size of the table cache and the number of open files MySQL allows. This goes beyond the scope of this article, however.)
MySQL optimization key #4: Keep the database in memory
We have seen a transition to solid state drives (SSDs) in recent years. Even though SSDs are much faster than spinning hard drives, they are still no match for having data available in RAM. This difference comes not just from the storage performance itself, but also from the additional work the database must do when it retrieves the data from disk or SSD storage.
With recent hardware improvements, it is increasingly possible to get your database in memory—whether you’re running in the cloud or managing your own hardware.
The even better news is that you do not need to fit all of your database into memory to get a majority of the in-memory performance benefits. You just need to fit the working set of data into memory—the data that is accessed most frequently.
You may have seen some articles providing some specific numbers about what portion of the database you should keep in memory, ranging from 10 percent to 33 percent. In fact, there is no “one size fits all” number. The amount of data to fit into memory for the best performance advantage is workload related. Rather than looking for a specific “magic” number, you should check how much I/O the database is running at its steady state (typically a few hours after it is started). Look at reads, because reads can be completely eliminated if your database is in memory. Writes will always need to happen, whatever amount of memory you have available.
Below you can see I/O happening in the InnoDB I/O Graph in the InnoDB Metrics dashboard of Percona Monitoring and Management.
In the graph above, you see spikes as high as 2,000 I/O operations per second, which shows that (at least for some portions of the workload) the database working set does not fit well into memory.
MySQL optimization key #5: Use SSD storage
If your database doesn’t fit in memory (and even if it does) you still need fast storage to handle writes and to avoid performance issues as the database warms up (right after restart). These days fast storage means SSDs.