Java – Fastest way to iterate through large table using JDBC


I'm trying to create a Java program to clean up and merge rows in my table. The table is large, about 500k rows, and my current solution is running very slowly. The first thing I want to do is simply get an in-memory array of objects representing all the rows of my table. Here is what I'm doing:

  • pick an increment of say 1000 rows at a time
  • use JDBC to fetch a resultset on the following SQL query
  • add the resulting data to an in-memory array
  • continue querying all the way up to 500,000 in increments of 1000, each time adding results.

This is taking way too long. In fact, it's not even getting past the second increment, from 1000 to 2000. The query takes forever to finish (although when I run the same thing directly through a MySQL browser, it's decently fast). It's been a while since I've used JDBC directly. Is there a faster alternative?

Best Solution

First of all, are you sure you need the whole table in memory? If possible, consider selecting only the rows that you want to update, merge, etc. If you really have to have the whole table, you could consider using a scrollable ResultSet. You can create it like this:

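// make sure autocommit is off (postgres)
con.setAutoCommit(false);

Statement stmt = con.createStatement(
                   ResultSet.TYPE_SCROLL_INSENSITIVE, // or ResultSet.TYPE_FORWARD_ONLY
                   ResultSet.CONCUR_READ_ONLY);       // read-only: rows are not updated through the ResultSet
ResultSet srs = stmt.executeQuery("select * from ...");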

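A scrollable ResultSet lets you move to any row you want with the 'absolute' and 'relative' methods. (These need one of the TYPE_SCROLL_* types; a TYPE_FORWARD_ONLY ResultSet can only be read with 'next'.)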
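
As a rough sketch of how that could look in practice, the helper below copies rows out of a ResultSet into memory and jumps to a given row with 'absolute'. The class name ScrollableRows and its methods readAll and rowAt are made up for illustration and are not part of JDBC; only next, absolute, getObject and getMetaData are real ResultSet calls. It assumes the ResultSet was opened as scrollable, as in the snippet above.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
final class ScrollableRows {

    private ScrollableRows() {
    }

    // Copies every row after the current cursor position into memory,
    // one Object[] of column values per row.
    static List<Object[]> readAll(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        List<Object[]> rows = new ArrayList<>();
        while (rs.next()) {
            rows.add(currentRow(rs, columns));
        }
        return rows;
    }

    // Jumps to a 1-based row number with absolute() and returns its values,
    // or null when that row does not exist. Needs a TYPE_SCROLL_* ResultSet;
    // a forward-only one throws SQLException here.
    static Object[] rowAt(ResultSet rs, int rowNumber) throws SQLException {
        if (!rs.absolute(rowNumber)) {
            return null;
        }
        return currentRow(rs, rs.getMetaData().getColumnCount());
    }

    // Reads all columns of the row the cursor is currently on.
    private static Object[] currentRow(ResultSet rs, int columns) throws SQLException {
        Object[] row = new Object[columns];
        for (int i = 0; i < columns; i++) {
            row[i] = rs.getObject(i + 1); // JDBC column indexes are 1-based
        }
        return row;
    }
}
```

With the srs from the snippet above, ScrollableRows.rowAt(srs, 1000) would jump straight to row 1000, and srs.relative(-10) would then step back ten rows. Whether a scrollable ResultSet is buffered entirely on the client is driver-specific, so it is worth checking memory use against the actual database before loading all 500k rows this way.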
