Continuing on from my previous post, MySQL code size over releases, I wanted to look at the different branches/patch sets of MySQL out there and work out how far they deviate from upstream. I’m just going to compare each one against whatever upstream version its most easily accessible release is based on (be it 5.0.x, 5.1.x or whatever).
For MariaDB versions, I removed innodb_plugin and replaced it with xtradb for stats purposes as the MariaDB innodb_plugin is essentially the same as upstream and I don’t want to artificially inflate the diff size.
The first three major versions of MariaDB were all based on MySQL 5.1. I used sloccount and only counted C and C++ code.
So, let’s look at some of the MySQL patch sets/branches that are around. Firstly, let’s look at MariaDB:
Codebase | LoC (C, C++) | +/- from upstream MySQL | +/- from previous major version |
MariaDB 5.1 | 1,210,168 | +157,532 | 0 |
MariaDB 5.2 | 1,227,434 | +174,798 | +17,266 (since MariaDB 5.1) |
MariaDB 5.3 | 1,264,995 | +212,359 | +37,561 (since MariaDB 5.2) |
MariaDB 5.5 | 1,377,405 | +187,658 (from MySQL 5.5) | +112,410 (since MariaDB 5.3) |
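The two delta columns are plain subtraction, which can be checked with a minimal sketch. The upstream MySQL line counts are not listed in this post, so the sketch derives them as MariaDB LoC minus the reported delta; the names `mariadb_loc`, `upstream` and `deltas` are illustrative only.

```python
# LoC figures (C and C++ only) copied from the table above.
mariadb_loc = {
    "5.1": 1_210_168,
    "5.2": 1_227_434,
    "5.3": 1_264_995,
    "5.5": 1_377_405,
}

# Assumption: upstream baselines are back-derived from the reported deltas,
# since the table does not list the MySQL line counts directly.
mysql_51_loc = mariadb_loc["5.1"] - 157_532
mysql_55_loc = mariadb_loc["5.5"] - 187_658
upstream = {
    "5.1": mysql_51_loc,
    "5.2": mysql_51_loc,
    "5.3": mysql_51_loc,
    "5.5": mysql_55_loc,
}


def deltas(loc, base):
    """Return (version, delta from upstream, delta from previous release)."""
    rows = []
    prev = None
    for version, count in loc.items():
        from_upstream = count - base[version]
        from_prev = 0 if prev is None else count - prev
        rows.append((version, from_upstream, from_prev))
        prev = count
    return rows


for version, from_upstream, from_prev in deltas(mariadb_loc, upstream):
    print(f"MariaDB {version}: {from_upstream:+,} from MySQL, "
          f"{from_prev:+,} from previous release")
```

Because the 5.1 and 5.5 baselines are derived from the table itself, the sketch only checks that the 5.2 and 5.3 rows and the last column are internally consistent; it is not an independent recount.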
From my previous post on lines of code in MySQL versions, we learned that with MySQL 5.6 we saw a 354kLOC increase over MySQL 5.5. What is quite surprising is how close some of the MariaDB differences are to this. With MariaDB 5.5, we’re looking at a 187kLOC difference, which is roughly half that of MySQL 5.6. What’s also interesting is that each incremental MariaDB release has not added nearly as much code as the MySQL 5.1 to 5.5 and 5.5 to 5.6 jumps did.
The MariaDB code size has also been increasing. If we look at the graph above, you can really see the jump in code size over the past few releases.
If we look at the delta between MariaDB and MySQL, the first MariaDB release (MariaDB 5.1) was certainly a large jump. Each incremental MariaDB release (5.2 and 5.3) has added a smaller delta than the initial one. With MariaDB 5.5 the delta from MySQL actually decreases, which is interesting to look at.
If we were doing a straight port of MariaDB 5.3 to be based on MySQL 5.5, we’d expect MariaDB to grow by around 137kLOC (the growth from MySQL 5.1 to 5.5), but it doesn’t. The difference between MariaDB 5.3 and MariaDB 5.5 is only ~112kLOC, and on the whole the delta from upstream decreases.
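The straight-port comparison can be spelled out as a small calculation. This is only a sketch: the MySQL 5.1 and 5.5 line counts are assumed to be the MariaDB counts minus the deltas from the table, and the variable names are made up for illustration.

```python
# Figures from the MariaDB table (C and C++ LoC).
mariadb_53, mariadb_55 = 1_264_995, 1_377_405
delta_53, delta_55 = 212_359, 187_658

# Assumption: upstream sizes are back-derived from the reported deltas.
mysql_51 = mariadb_53 - delta_53
mysql_55 = mariadb_55 - delta_55

# A straight port would carry the whole MariaDB 5.3 delta forward, so
# MariaDB would grow by exactly as much as upstream did.
upstream_growth = mysql_55 - mysql_51
actual_growth = mariadb_55 - mariadb_53

# The gap between the two is how much the delta from upstream shrank.
shortfall = upstream_growth - actual_growth

print(f"MySQL 5.1 -> 5.5 growth:   {upstream_growth:,}")
print(f"MariaDB 5.3 -> 5.5 growth: {actual_growth:,}")
print(f"Delta shrank by:           {shortfall:,}")
```

The shortfall is the same number as the drop in the "+/- from MySQL" column between MariaDB 5.3 and 5.5, which is why the delta decreases even though the code base grew.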
But what makes up this big initial jump for MariaDB? Let’s look at some of the MariaDB 5.1 only modules and what’s left:
MariaDB 5.1 component | LoC (MariaDB 5.1) |
PBXT | 45,107 |
FederatedX | 3,076 |
IBM DB2i | 13,486 |
Total | 61,669 |
Other | 95,863 |
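As a quick check of the breakdown, the "Other" row is simply the MariaDB 5.1 delta minus the three bundled modules. The sketch below redoes that subtraction and also works out each part's share of the delta; the names are illustrative, and the figures come from the tables above.

```python
# MariaDB 5.1-only modules and their LoC, from the table above.
modules = {
    "PBXT": 45_107,
    "FederatedX": 3_076,
    "IBM DB2i": 13_486,
}

# Delta between MariaDB 5.1 and MySQL 5.1, from the first table.
mariadb_51_delta = 157_532

modules_total = sum(modules.values())
other = mariadb_51_delta - modules_total

# Fraction of the delta explained by bundled modules vs. everything else.
modules_share = modules_total / mariadb_51_delta
other_share = other / mariadb_51_delta

print(f"Bundled modules: {modules_total:,} ({modules_share:.1%})")
print(f"Other:           {other:,} ({other_share:.1%})")
```

In other words, the bundled modules account for well under half of the initial delta.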
So the MariaDB delta has not increased just because they included some existing modules; there’s more code in there, about as much as any major MySQL version bump.
Tomorrow we look at other MySQL branches, and we see that the MariaDB delta truly is significantly larger than any other MySQL branch.
Nice info. How large are all the code parts? Is one part growing more than others?
That would require the code to be modular enough to do that in any fine grained way… We could do some rough calculations though
I am looking forward to the next post as I recall Monty saying MariaDB has 1,000,000 lines changed compared to MySQL. If so, the ~1,288,000 potentially overlapping lines of code would contain an amazing amount of 75% new code. Quite a stunning ratio for a drop-in replacement.
The practical application of this delta to a live production environment or application is?
Practical application to production? Well… each extra line of code incurs a maintenance overhead as well as increasing the amount of code that somebody has to understand to fully know how the system works. This has implications for the quality of the code base and for how easy or hard it is for new developers to come in and make good contributions.