File-based JDBC driver (not in memory) for CSV files


Is there a open source file based (NOT in-memory based) JDBC driver for CSV files? My CSV are dynamically generated from the UI according to the user selections and each user will have a different CSV file. I'm doing this to reduce database hits, since the information is contained in the CSV file. I only need to perform SELECT operations.

HSQLDB allows for indexed searches if we specify an index, but I won't be able to provide an unique column that can be used as an index, hence it does SQL operations in memory.


I've tried CSVJDBC but that doesn't support simple operations like order by and group by. It is still unclear whether it reads from file or loads into memory.

I've tried xlSQL, but that again relies on HSQLDB and only works with Excel and not CSV. Plus its not in development or support anymore.

H2, but that only reads CSV. Doesn't support SQL.

You can solve this problem using the H2 database.

The following groovy script demonstrates:

  1. Loading data into the database
  2. Running a "GROUP BY" and "ORDER BY" sql query

Note: H2 supports in-memory databases, so you have the choice of persisting the data or not.

// Create the database
def sql = Sql.newInstance("jdbc:h2:db/csv", "user", "pass", "org.h2.Driver")

// Load CSV file
sql.execute("CREATE TABLE data (id INT PRIMARY KEY, message VARCHAR(255), score INT) AS SELECT * FROM CSVREAD('data.csv')")

// Print results
def result = sql.firstRow("SELECT message, score, count(*) FROM data GROUP BY message, score ORDER BY score")

assert result[0] == "hello world"
assert result[1] == 0
assert result[2] == 5

// Cleanup

Sample CSV data:

0,hello world,0
1,hello world,1
2,hello world,0
3,hello world,1
4,hello world,0
5,hello world,1
6,hello world,0
7,hello world,1
8,hello world,0
9,hello world,1
10,hello world,0