Monday, March 16, 2009

My Mysql / PostgreSQL experience

GEEK CONTENT: High

A few months ago I discovered the Portland Postgres User Group (PDXPUG). This group isn't huge, but is a very active, extremely friendly and helpful group of database hackers and general technophiles. I can't even count the number of times I've tweeted a question about PostgreSQL and gotten a faster response than the IRS on Jimmy Hoffa's trail. If you are a twitter user, simply include the Hash Tag #PDXPUG and the Djinni appears with your answer.

Anyway enough of my plug of the group.

Last month it was my turn to present something to the user group. Not being as knowledgeable of database internals as Selena, nor as experienced hacking tools around PostgreSQL as Mark or Gabrielle I was a little intimidated. It turns out, though, that one of my work projects provided a great talking catalyst.

I am a programmer for NVIDIA. No, I am not one of the amazingly brilliant people that can actually visualize the circuitry inside a GPU. I am not even important enough to be in the know on all the best Top Secret Insider stuff, but I love working around smart people and playing with the toys they come up with.

I like to describe my position as the grease that helps important gears turn faster. Whatever software or analysis that one of the NVIDIA engineers needs that we can't buy a solution for off the shelf my team writes. This might be something to help managers plan their hiring practices or something to help lots of engineers share Gigabytes of test results. It might be something the automatically notifies people when one of our proprietary tools appears to be going awry. I enjoy it because every new project is fresh and challenging.

Recently I was given the task to optimize a database system that logs errors with any of our internal tools, including our large farm of servers. The data stored there helps find systematic problems (bad network nodes, slow disk drives, etc.) The database that stores this information runs MySQL and was written about the same time I started working at the company. It has an awful lot of data and when something goes wrong we need to get answers out of it quickly.

My first attempt at gathering some troubleshooting metrics took me four days. This was not an acceptable turn around, so I undertook a rather in depth effort to redesign the database to better meet the demands placed on it. What I initially imagined as a 4-6 week project turned into a five month saga. I learned an awful lot about the tools that use this and even more about MySQL's inner workings and pitfalls.

As I worked on this I keep a running log of design possibilities for porting this to PostgreSQL. It would be a lot of work, but between my IT contact's preference for PostgreSQL and the activity level of PDXPUG it might well pay off.

So when it came time for me to present something to the group I decided this project would provide good fodder. It went much better than I could have imagined. Not only were my challenges shared by others in the group but I got some really useful suggestions both in PostgreSQL and MySQL for working around them. I look forward to presenting a follow-up after I have been able to incorporate them and some of my other plans.

Here is a basic checklist of my current thought regarding the two database implementations based on my current work. I fully expect this list to change over the next few weeks as I do some more broad comparisons between these solutions:

MySQL
CON:
  • Centralized Lock table shared by all databases across the server instance. One misbehaving database can bring all databases on the server to a halt by filling the lock table.
  • Backups are less sophisticated (no incremental backups)
  • Permissions granularity on our version either requires super-user access (against IT policy) or prevents many necessary operations (such as trigger creation)
  • No Local User Groups in PDX
  • Requires deterministic defaults for database fields (no functions such as now()). This means all other defaults must be done with triggers.
  • Significant breaking changes between versions 5.0.x and 5.1.x. This makes IT really reluctant to upgrade, thus making all of the advantages of 5.1.x unavailable.
  • Lots of confusion about which is the "official" current version. Many conflicting development branches.

PRO:
  • Free*
  • Very responsive support from Sun if you pay for the enterprise package.
  • Documentation. the online documentation (with its accompanying comments) is very thorough. It includes a lot of sample code, as well. There are a lot of good books for configuring and maintaining MySQL servers. There are also some great bloggers out there
  • LAMP (Linux Apache MySQL PHP) has a huge following. PHP appears to be the heart and soul behind MySQL's popularity.

PostgreSQL
CON:
  • No single point of contact for Enterprise support. Many managers like knowing whose throat they can wring when something goes horribly wrong, so a lack thereof can make this solution a harder sell.
  • No Unsigned integers (though I hear this is soon to be).

PRO:
  • Free*
  • Documentation
  • Nearly any language can be used to implement stored functions / procedures. This means you can leverage Perl for string parsing, for example, which it does extremely more efficiently than databases.
  • PDXPUG is an excellent help resource
  • Very active development community. I first saw @MagnusHagander quote this from a Percona discussion:
MySQL has a community. PostgreSQL is a community.

Another interesting point that came up after my presentation was Oracle's announcement that it would purchase MySQL. Is this a Pro or a Con for MySQL? Time will tell.
The final result of my presentation can be seen here (warning lots of geek-oriented material inside):



*Free is a relative term. My first car was a 1974 Ford Pinto station wagon. My step-dad gave it to me when I turned 16. At the time it had no compression in one cylinder and partial in another. I spent the greater part of a summer rebuilding that engine under the careful guidance of my boss at the gas station where I worked. Was the car free? Yes. Was it without cost? No. It cost me a lot of hours, bloody knuckles, and money for tooling and parts. It would have cost a heck of a lot more had I not had a generous boss who let me use his tools, draw on his expertise, and commandeer an empty bay when he could afford it.

2 comments:

Krista Lou Cook said...

Although your work at times fascinates me. I have to admit that much of it goes way up and over my head! :)

Chris said...

It's ok. I am just glad my project proved interesting to those with even more experience than I have.