Stored Procedures/Functions for Drizzle

Previously, in “Thoughts on Thoughts on Drizzle” I theorized that one of the major reasons why we did not see lots of people jumping at stored procedures in MySQL was that it wasn’t in their native language (for lack of a better term). We’ve seen External Language Stored Procedures for MySQL that let you write stored procedures in some other languages…. but I felt something was missing.

Firstly, I wanted a language I was really familiar with and comfortable writing complex things in.

Secondly, it should be compiled so that it runs as fast as possible.

Thirdly, it shouldn’t just be linking to a pre-compiled library (drizzle function plugins do that already)

So… the obvious choice was C.

I have a really, really, really early prototype:

drizzle> SELECT LIBTCC("int foo(char* s) { s[0]='4'; s[1]='2'; s[2]=0; return 2; }") AS RESULT;

+--------+
| RESULT |
+--------+
| 42     |
+--------+
1 row in set (0 sec)

or… a bit more sophisticated:

drizzle> SELECT LIBTCC("#include <string.h>\nint foo(char* s) { strcpy(s,\"Hello World!\");; return strlen(s); }") AS RESULT;

+--------------+
| RESULT       |
+--------------+
| Hello World! |
+--------------+
1 row in set (0 sec)

I’m using a function as a bit of a cheat… but the string is passed to libtcc (modified so it’s a shared library so I can load it into drizzle), where it is compiled into native object code (in my case x86-64) and then run.

With the right bits of foo… I could allow calling of all sorts of server functions…. such as those to execute SQL inside the current transaction context.

There are a number of reasons why this is Pure Evil(TM):

  • It executes inside the address space of your database server
    one null pointer dereference and your database server is all gone.
  • It’s arbitrary code injection by design
    Exactly how insane are you? Security–;
  • While great for me and my C hacking friends, possibly not for web app developers, who likely aren’t writing their apps in C every day.
  • See the first reason. Is that not reason enough? Memory protection is a good thing yo.

Anyway, you can see the code up on launchpad in the drizzle-libtcc-function branch. You’ll need to modify your tcc source so that the Makefile snippet for libtcc.o looks like this:

# libtcc generation and test
libtcc.o: $(NATIVE_FILES)
        $(CC) -fPIC -o $@ -c libtcc.c $(NATIVE_TARGET) $(CFLAGS)

libtcc.a: libtcc.o
        $(AR) rcs $@ $^

libtcc.so: libtcc.o
        $(CC) -shared -Wl,-soname,libtcc.so.1 -o $@ libtcc.o

14 thoughts on “Stored Procedures/Functions for Drizzle

  1. Pingback: A MD5 stored procedure for Drizzle… in C | Ramblings

  2. Beautiful, but violates good fences make good neighbors policy. Lets use Lua.

  3. …but I wanted a language I use all the time and are familiar with… and Lua would mean I’d have to learn Lua. I already knew C :)

  4. I almost feel the same way about CMake. I already don’t know automake/autoconf and now I have to not know CMake. I am sure it is better, but I don’t always want to adapt.

  5. Stored procedure language used in MySQL is weird, but at least it is consistently weird, the same FORTRAN/COBOL-like syntax as SQL itself.

    Having an external language for it is fine, as long as it is not C:)
    But then a program should look like normal client program, except there is no need to connect to the database. This is where I feel MySQL’s WL#820 was lacking, SP acts like real client program, it would open a connection and talk to a currently running process (mysqld) using socket ;)

  6. You’re a modern Prometheus.

    I had a similar idea circa 1997, to link a UDF to libperl and allow it to execute arbitrary Perl expressions in the DBMS (InterBase in my case). But I had the sense to quickly discard the idea without coding it.

    There is literally no task you could do with that mechanism that justifies the extreme risk of having it enabled in your server.

  7. It’s a nice hack, but I don’t think this can fly. Your post title mentions *stored*. This is not stored – it’s more like a user-defined C expression, something like that. Useful perhaps, but little more than a UDF, and a messy one at that :)

    I don’t mind so much the idea using C, C++, Java, etc as a db programming language. But unless you can integrate it with the really db side of things it’s not that useful IMO – you’d be better of with a UDF.

    What I mean with “integrate with the db side of things” is:
    – execute a query without instantiating an entire client from scratch inside the code
    – interact with resultsets
    – search and enumerate database metadata

    In addition, the programs should be “stored” – by that I mean, code and binary should be stored in the db and play nice with replication and backup.

  8. Stored would come later… not too hard (just whack it in a table somewhere).. and “user-defined C expression” (while accurate) just isn’t as sexy for something so evil :)

    as for executing queries… well.. you probably could with this right now, but with a couple of helpful wrappers you certainly could.

  9. Stewart, fair enough…after re-reading, my comment seems more negative than I intended.

    Well, it’d be interesting to see how it develops. Godspeed :)

  10. Pingback: Log Buffer #183, a Carnival of the Vanities for DBAs | The Pythian Blog

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.