Migrating databases with DBUnit

Posted by    |       Hibernate ORM

In the last few weeks I had to migrate a MySQL database and it turned out to be more difficult than I thought. In the past I've used the tools that ship with MySQL, such as mysqldump and its various options. For the recent migrations that was surprisingly... impossible.

The first migration was from a latin MySQL database to a UTF encoded database. By default MySQL and the JDBC driver all use latin encoding (or they derive it from the system character set), so you better make sure that your database is using UTF8 if you want, for example, Chinese users to be able to store their data. I recommend doing this on a per-table basis, which is easy if you export the schema with Hibernate - just add an extension to your dialect. Also make sure that you set characterEncoding=UTF-8 on your JDBC connection string to initialize the SQL session properly. Note that the useUnicode=true switch is not necessary for MySQL 5.x.

The problem I had was the seamframework.org production database, which was latin encoded when it was created a year ago. I've been pushing migration back because we never had any issue with it and the manual migration with mysqldump and recode turned out not to work for me (some instructions for this if you want to try).

The second migration I was looking at was a migration from MySQL to PostgreSQL, for development and testing purposes. Now, many people use mysqldump for this, then fiddle about with its many command line options and switches (make it ANSI compatible SQL damnyou!) and then close their eyes and pray when they import the dump into Postgres. Well, that didn't work in my case because mysqldump exports bit typed columns as raw binary. You can't make it to export something like true or false or anything that you can import into a Postgres boolean type. The problem here is actually that MySQL (just like the mighty Oracle) doesn't support a true boolean datatype and that Hibernate defaults to creating a bit column for a java.lang.Boolean mapping. In retrospect, Hibernate should probably not do this on MySQL for portability reasons and use a tinyint(1) mapping - on the other hand it is fine if you always stay on MySQL.

So mysqldump didn't work in both cases, I had to find another solution. I solved it with DBUnit and a simple 20 line class:

import org.dbunit.database.IDatabaseConnection;
import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.QueryDataSet;
import org.dbunit.operation.DatabaseOperation;

import java.sql.Connection;
import java.sql.DriverManager;

public class Migration {

    public static final String[] TABLES = new String[]{ "FOO", "BAR", "BAZ" };

    public static void main(String[] args) throws Exception {
        System.out.println("Running Migration...");

        Class.forName("com.mysql.jdbc.Driver");
        //Class.forName("org.postgresql.Driver");

        Connection exportConnection = DriverManager.getConnection(
            "jdbc:mysql://localhost/mydb", 
            "johndoe", 
            "secret"
        );
        IDatabaseConnection exportDatabaseConnection = new DatabaseConnection(exportConnection);

        Connection importConnection = DriverManager.getConnection(
            "jdbc:mysql://localhost/mytarget?characterEncoding=UTF-8&sessionVariables=FOREIGN_KEY_CHECKS=0", 
            "johndoe", 
            "secret"
        );
        IDatabaseConnection importDatabaseConnection = new DatabaseConnection(importConnection);

        for (String table : TABLES) {
        System.out.println("Migrating table: " + table);
            QueryDataSet exportDataSet = new QueryDataSet(exportDatabaseConnection);
            exportDataSet.addTable(table, "SELECT * FROM " + table);
            DatabaseOperation.INSERT.execute(importDatabaseConnection, exportDataSet);
        }

        exportDatabaseConnection.close();
        importDatabaseConnection.close();
        System.out.println("Migration complete");

    }
}

This is the code I used to migrate from MySQL latin to MySQL UTF encoding. For the PostgreSQL migration, uncomment the driver and use a different import JDBC URL. Make sure that you disable foreign key checks for the importing SQL session as you don't know or control in which order tables and rows will be exported and imported.


Back to top