Is there a open source file based (NOT in-memory based) JDBC driver for CSV files? My CSV are dynamically generated from the UI according to the user selections and each user will have a different CSV file. I'm doing this to reduce database hits, since the information is contained in the CSV file. I only need to perform
HSQLDB allows for indexed searches if we specify an index, but I won't be able to provide an unique column that can be used as an index, hence it does SQL operations in memory.
I've tried CSVJDBC but that doesn't support simple operations like
order by and
group by. It is still unclear whether it reads from file or loads into memory.
I've tried xlSQL, but that again relies on HSQLDB and only works with Excel and not CSV. Plus its not in development or support anymore.
H2, but that only reads CSV. Doesn't support SQL.
You can solve this problem using the H2 database.
The following groovy script demonstrates:
- Loading data into the database
- Running a "GROUP BY" and "ORDER BY" sql query
Note: H2 supports in-memory databases, so you have the choice of persisting the data or not.
// Create the database def sql = Sql.newInstance("jdbc:h2:db/csv", "user", "pass", "org.h2.Driver") // Load CSV file sql.execute("CREATE TABLE data (id INT PRIMARY KEY, message VARCHAR(255), score INT) AS SELECT * FROM CSVREAD('data.csv')") // Print results def result = sql.firstRow("SELECT message, score, count(*) FROM data GROUP BY message, score ORDER BY score") assert result == "hello world" assert result == 0 assert result == 5 // Cleanup sql.close()
Sample CSV data:
0,hello world,0 1,hello world,1 2,hello world,0 3,hello world,1 4,hello world,0 5,hello world,1 6,hello world,0 7,hello world,1 8,hello world,0 9,hello world,1 10,hello world,0