IST Database Administration Site Navigation:
Breadcrumb Trail:
IST Database Administration > Services >

Local Navigation:

Selecting Your Database Platform

If you are thinking about selecting a database, the information below may help in your decision. The factors below are not all equal in importance in making this decision. You should decide how to weight these concerns for your own particular situation.

General Considerations

When deploying a vendor-purchased or open source application

If you are deploying a vendor or open source application, selecting the database that the application best supports will greatly reduce both initial setup time and ongoing support costs.

With a little research, determining the best supported database for a given open source application is usually easy. Rich sources for this imformation include Google, the open source project web page, popular support forums or the mailing list for the software package.

For vendor packages, the sales team for the vendor will often claim they support all databases equally well. In our experience, this is almost never true. Application vendors usually start by supporting one database platform, and have a longer track record shaking out bugs and increasing performance on this database. Adding additional supported databases later usually means support for the new database platforms is not as mature, at least initially. Of course there are exceptions.

You will want to press the vendor for answers to questions such as:

Validate the answers given by the vendor independently. Ask to have access to the vendor's online support forums. Use search engines to find information. Solicit customer references, and speak directly with these customers to see what issues they have encountered.

It is best to do this research before buying the product. You never know what you might find out.

When building an application in-house

If you are doing in-house development, the database choice is more in your own hands.

Some development platforms, such as Ruby on Rails, try to be database agnostic, and treat each database as a simple persistent store. Database features are usually reinvented in framework or developer code. For these tech stacks, the cost of maintaining the database may be the deciding factor in selecting your database.

Other development tech stacks allow you to fully leverage the power of your RDBMS, and as such, knowing and understanding the underlying database is more important. In this case, what database your IT staff already knows might influence your decision more than the cost of the database service alone. Training and ramp-up time, in addition to existing tools, can be considerable.

Though there is an ANSI SQL standard, databases -- when their power is fully utilized rather than reinventing this in application code -- are no more interchangeable than programming languages. Besides the obvious divergence from and extensions to the SQL standard, the concurrency models can dramatically differ, various implementation decisions have vastly different performance implications, security models differ, and so forth. The less trivial the application, the more "database independence" is either a myth, or else paid for with considerable additional development and testing efforts.

Against what your staff knows or prefers, you should balance marketplace trends. Which databases are popular and why? If you google for help on a database, which can you find information on easily? If you walk into a bookstore, which has books available? If you search for job postings to gauge demand, which has the greatest demand? Presumably higher demand creates market forces to produce people more skilled in the high demand areas, and so finding future employees or contractors may be easier.

When integrating with other systems

If you plan on integrating with other applications on campus, the choice of database may matter.

If the other applications provide an API to access their data, then integration concerns are less important. This is the current industry direction, which you may hear referred to as a service-orientied architecture (SOA) or web services. Direct database-to-database integration tends to produce brittle, intertwined systems that cost more to maintain over time. The efforts in the past to eliminate "spaghetti code" has now moved up a level in an attempt to eliminate "spaghetti systems". Further, in a large distributed environment such as ours, it is unrealistic to design against the inevitability of hetergeneity of systems. An SOA approach to system integration produces cleaner, orthogonal interfaces between systems, and keeps systems more loosely coupled than direct db-to-db integration.

However, if direct integration with the other application's database is your only option, you will find this job easier if you match database types. In this case, you will be able to make use of native database bulk loading tools, avoid data type translation issues, and allow your staff to focus on knowing only one type of RDBMS.

If you plan on integrating directly with UC Berkeley's enterprise systems, please talk to us in advance.

When comparing database features

We list this last for a reason. Unless you are doing something very specialized, this is almost always the least important factor.

There can be some areas where this will matter, however. If you are not a Windows shop, doing development against SQL Server is only possible if you are using Java or making exclusive use of service endpoints. Otherwise, it's a very awkward fit. Oracle's support of platforms is vast, but if you run a *BSD operating system, you will have problems. If you plan on having very large tables, table partitioning is useful or sometimes even essential. Oracle and SQL Server have terrific support for this, but require the Enterprise license. PostgreSQL support for partitioning works, but is very quirky compared to the commercial competition.

There are plenty of websites out there which list features and have checkboxes for whether a given database supports a given feature or not. Be wary of these comparisons. Often the writers of such articles know one database much better than the rest, and do not understand how other databases can accomplish the same "feature" in a different way. If you have any questions you'd like to discuss about feature comparisons, we are happy to talk those over with you.

Specific Databases

MySQL

Originally designed as a very lightweight database-like engine, MySQL did not begin with the intent to be a general purpose RDBMS. As such, in versions 3.x and 4.x, it lacked many features considered essential for a relational database. Even so, it was useful as a persistent data engine for purposes that didn't require standard RDBMS features. Not supporting core RDBMS features did mean MySQL was simpler to setup and run.

In recent years, MySQL has changed directions. MySQL, Inc. has chosen to turn MySQL into a full-featured RDBMS. While MySQL is increasingly popular, it is not as mature a relational database as the rest. It will likely take many years before MySQL approaches the sophistication of an engine like Oracle or SQL Server. Whether this matters to you is another question.

Because of MySQL's origins, it also suffers from what can be described as "conceptual integrity problems". It is not uncommon for MySQL to reverse itself on design decisions, even in a minor release. On the other hand, this problem seems to be slowly improving.

Probably the biggest testament to the potential effectiveness of MySQL is that most of the top 10 busiest websites in the world utilize MySQL as their backend.

Oracle

Once described as "the aircraft carrier of databases", Oracle has the richest feature set and the largest marketshare of any of our databases. The list of what Oracle can't do that another database can is a very short list. Oracle supports a huge array of operating system types, for clients as well as servers. Practically every programming language has bindings for Oracle, and Oracle provides a free client for development and deployment. Server-side development programming is rich. Oracle performance is first rate. Most vendor packages support Oracle, and when open source packages support a commercial database engine, often it is Oracle.

The downside of Oracle is cost and management complexity. If you are using our services, management complexity is our problem, not yours. Oracle has been making great strides in simplifying this complexity while retaining the power of their engine over their last several major releases. Lastly, we are actively looking at ways to reduce the cost of our Oracle service.

PostgreSQL

As an open-source "rival" to MySQL, PostgreSQL has the reputation as being complex. This is likely a legacy left over from earlier days of MySQL, before their aspirations to become a full featured RDBMS. Since then, MySQL has of necessity become more complicated, and PostgreSQL has worked at streamlining management of PostgreSQL clusters.

The PostgreSQL slogan is "the world's most advanced open source database". The slogan is well-applied. This engine does not suffer at all from MySQL's "conceptual integrity problems". The core RDBMS functions are very mature, and the optimizer is far more intelligent than MySQL's. PostgreSQL runs on a very wide array of platforms, and has bindings for many popular languages. Server-side programming is dramatically richer than MySQL, and performance scales better. In many ways, PostgreSQL is Oracle-lite.

The major disadvantage for PostgreSQL is market share. In the last few years, MySQL has made some missteps that has won PostgreSQL more mindshare in the open source world. However, it is nowhere near as popular as any of the other three database types, and has a comparatively tiny userbase. While open source package support is rapidly increasing for PostgreSQL, it is rather uncommon in vended packages.

SQL Server

Originally, SQL Server started life as Microsoft's port of Sybase to the Windows platform.

Since this initial partnership with Sybase, Microsoft has continued to advance the capabilities of SQL Server, with very good results. Release 9 (aka SQL 2005) of the product has produced an engine that is very sophisticated, approaching that of Oracle in features and depth. The cost is comparatively modest. Performance is excellent, and the optimizer is advanced.

A considerable disadvantage for SQL Server is the platform limitation. Unless you are developing in Java, a Microsoft server-side tech stack (web and/or app server, programming language, etc) is practically a requirement.