Performance impact of MySQL query cache on modern hardware

Recently, Morgan has been writing on deprecating some MySQL features and inspired by that while working on MySQL on POWER, I wondered “What is the impact of the MySQL query cache on modern hardware?”

We’ve known for over six years (since before we started Drizzle) that the query cache hurt performance. It was for that reason that the query cache was one of the early things to be removed from Drizzle, it just didn’t scale on multi core systems that  we were targeting.

So what about modern hardware? While working on MySQL 5.6 on POWER8, I enabled the query cache and ran a benchmark. Enabling the query cache reduced performance by an order of magnitude. I suspect the performance impact is even higher on MySQL 5.7.

My vote? Deprecate it. Deprecate it now, print a giant warning if it’s enabled and at some point just remove it. A single global mutex just doesn’t scale to 4 cores, let alone 24 cores at 8 threads per core.

19 thoughts on “Performance impact of MySQL query cache on modern hardware

  1. The current architecture is inadequate for the high end, or even just four or so concurrent threads usually running.

    But a huge portion of our user base isn’t high end or highly loaded and can still benefit, depending on the workload. Think of something like a blog that gets slashdotted and stays up because the QC is so fast at delivering the hot content. or a departmental server that even on busy days doesn’t have more than a few thousand queries per second to run, so has average concurrency below 1 unless it happens to be running a report. Neither of these is glamorous but they are the sort of thing that we do a lot of.

    In some ways there’s a trap for us to watch out for here, to have too much focus on the high end and not pay enough attention to the more common cases where this still has value.

    Looking beyond that, think of high end workloads and the concurrency at the core of the server and storage engines. Today we’re making great progress on dealing with those bottlenecks but that has its limits. Once we hit those we enter an area where a QC – not the current design – may show value by keeping workload out of the bottlenecked core of the server.

    James Day, MySQL Senior Principal Support Engineer, Oracle

  2. Personally, I’ve preferred to compile it out completely since around 2007 when I couldn’t get it to perform sanely on my quad-core PowerMac G5 machine.
    In production deploys in companies I have worked at, I have preferred to have it compiled out so that there is no chance of accidentally enabling it.

  3. Although at some point, with CMSs coming with caching solutions (in MySQL tables or memcache) and with them having simpler queries to generate a page (for performance), there’s a point where the query cache isn’t going to help there either – and it’d be better to just run the query.

    I think we’re already there for key lookups – but I doubt too many CMSs just do a simple key lookup to generate a page (unfortunately).

  4. Are we delivering enough extra raw performance at the lower concurrency end to make it a net gain even if the QC went away? Could we if we simplified the code by removing the current QC? For PHP applications with lots of connect/disconnect activity the improved connection speeds may well make that answer yes. For others? That’s a tougher question to say yes to but I think there’s at least some prospect of it from raw row processing speed improvements and optimiser improvements even at relatively low concurrency.

    The CMS case is a good one but I’m not sure how widespread caching CMS usage is at the bespoke code on a departmental server or 10,000 to 100,000 user corporate application end of the spectrum that can often be served by one MySQL server plus at least one replication slave for protection.

    What we tend to see is something of a split. The really big places can end up with a very heavy write load due to massive caching at the application or middleware layer; there’s a mid-range with balanced read/write and maybe some caching and then a massive tail that’s far more common than both of those two combined with many more reads than writes and often a lot of ad hoc or not very well optimised queries.

    I’m quite relaxed about deprecating or removing the current QC if there’s some coding advantage or some performance gain, though. Like much of the old code that hasn’t been refactored yet it’s not a clean design, or anything close to it and I don’t think that it will be a big loss.

    For POWER it’s easy enough: we want to help you to scale and please keep on telling us where you hit bottlenecks so we can do something about them.

    timgws, you might want to try using InnoDB’s memcached interface instead of native memcached. It’s durable against power outages and even ten years ago there were cases where the place I was at found that InnoDB could be faster sometimes than memcached, without the benefit of today’s InnoDB memcached API.

  5. Pingback: Optimizing the MySQL Query Cache | Die wunderbare Welt von Isotopp

  6. The query cache made sense for OLAP style systems because you would load data once a day, thus the cache wouldn’t be invalidated that often. And the few queries usually took a long time so having a few of them in the cache could make a real difference.

    But even OLAP systems now are more and more fed with realtime data. So now the cache will get invalidated quite often.

    And with the InnoDB buffer pool (instead of key cache and filesystem cache) and faster disks I believe queries generally take less time to complete so there is less advantage for cached queries.

    And even OLAP systems are getting more and more concurency. (esp. with parallel query tools)

    I think the real solution for OLAP systems is to use summary tables etc.

    There are also some solutions which do caching in the connector which also reduces the number of network roundtrips.

    The query cache could have some bennefits for foreign data wrappers (like the CONNECT engine), foreign tables and slow UDFs. But many of those will probably need to be modified to work together with the query cache (e.g. specify some kind of ttl).

  7. The question is what kind of queries are you running? Are you running more or less the same queries over and over again? Or are you just running mostly different queries.

    It’s really weird to have a feature in MySQL that does the complete opposite of what it is expected to.

  8. I was doing sysbench point select queries. So not only are they repetitive, the results for all of them would fit in the query cache.

  9. Having read the C, we bind all tables in a DB (we have hundreds of identical db’s in our app) to their own key cache (one per db) – which we then make less than the minimum viable size ( 10K, reduced down to 1K ) – this in effect disables it. We *quadrupled* throughput through our system.

Leave a Reply

Your email address will not be published. Required fields are marked *