JVM HotSpot options for calculating a large graph measure: garbage collection


As part of my code I need to calculate a centrality measure for a graph with 70k vertices and 700k edges. For this purpose I used arrays and hash maps. Unfortunately, I ran out of memory in the middle of the program. What would be the best HotSpot JVM parameters to handle this situation? Here is the exception I got:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.createEntry(Unknown Source)
    at java.util.HashMap.addEntry(Unknown Source)
    at java.util.HashMap.put(Unknown Source)

So I changed the heap size with -Xmx6g, but this parameter did not solve the problem. I still run out of heap space.

In my program I want to calculate a measure for each node. Unfortunately, the JVM seems to keep the information for all nodes in memory while it calculates the measure node by node. Is there any way to configure the JVM so that it removes unneeded information from memory? For example, my code crashes after calculating the measure for 1,000 of the 70,000 nodes. Is there any way to remove the information related to those 1,000 nodes from memory after the calculation, so that the memory can be reused for the other nodes? Is this related to the garbage collector? Here is my code (which uses the JUNG library):

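import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collection;

import edu.uci.ics.jung.algorithms.scoring.DistanceCentralityScorer;
import edu.uci.ics.jung.graph.DirectedSparseGraph;

// Customer, Transaction and DatabaseManager are the asker's own classes (not shown).
public class FindMostCentralNodes {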
    private DirectedSparseGraph<Customer, Transaction> network = new DirectedSparseGraph<Customer, Transaction>();
    static String dbName="SNfinal";
    private int numberofNodes=0;
    public static void main(String[] args) throws NumberFormatException, SQLException {
        FindMostCentralNodes f=new FindMostCentralNodes();
        int counter=1;
        DirectedSparseGraph<Customer, Transaction> tsn=f.getTSN();
        DistanceCentralityScorer<Customer,Transaction> scorer=new DistanceCentralityScorer<Customer,Transaction>(tsn,false,true,true);// un-weighted
        Collection<Customer> subscribers=tsn.getVertices();

        for(Customer node:subscribers){
            String sql="update Node set dist_centrality='"+scorer.getVertexScore(node)+"' where subscriber='"+node.getName()+"'";
            DatabaseManager.executeUpdate(sql,dbName);
            System.out.println("Update node centrality measures successfully!: "+counter++);
            node=null;
        }
    }
    public DirectedSparseGraph<Customer,Transaction> getTSN() throws NumberFormatException, SQLException{
        network= new DirectedSparseGraph<Customer,Transaction>();
        String count="select count(*) as counter from Node";
        ResultSet rscount=DatabaseManager.executeQuery(count, dbName);
        if(rscount.next()) {
            numberofNodes=rscount.getInt("counter");
        }
        Customer [] subscribers=new Customer[numberofNodes];
        String sql="select * from Node";
        ResultSet rs=DatabaseManager.executeQuery(sql, dbName);
        while(rs.next()){
            Customer sub=new Customer();
            sub.setName(rs.getString("subscriber"));
            network.addVertex(sub);
            subscribers[rs.getInt("nodeID")-1]=sub;
            sub=null;
        }
        String sql2="select * from TSN";
        ResultSet rs2=DatabaseManager.executeQuery(sql2, dbName);
        while(rs2.next()){
            Transaction transaction=new Transaction(Double.parseDouble(rs2.getString("weight")));
            network.addEdge( transaction, subscribers[rs2.getInt("callerNID")-1], subscribers[rs2.getInt("calleeNID")-1] );
            transaction=null;

        }
        //garbage
        rscount=null;
        rs=null;
        rs2=null;
        subscribers=null;
        return network;
    }

}


The garbage collector will remove any objects which are no longer reachable from live variables in your program. It will remove any such objects before giving up and throwing an OutOfMemoryError. If you think too many objects are being retained in memory, then the first course of action is to let go of any objects you don't need, so that they are no longer reachable. Since you haven't shown us any code, we can't suggest any specific changes you could make.
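To illustrate the reachability point, here is a minimal sketch in plain Java. It does not use JUNG; the class `ScopedScoring`, the method `sumOfDistances`, and the tiny adjacency map are all hypothetical stand-ins. The per-node working data (`dist` and `queue`) lives only in local variables, so it becomes unreachable, and therefore collectible, as soon as the method returns. Only the single `long` result survives each iteration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ScopedScoring {
    // Hypothetical helper: sum of BFS hop distances from 'source'
    // to every vertex reachable from it.
    static long sumOfDistances(Map<Integer, List<Integer>> adjacency, int source) {
        // Working data is local to this call; nothing outside refers to it.
        Map<Integer, Integer> dist = new HashMap<>();
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        dist.put(source, 0);
        queue.add(source);
        long total = 0;
        while (!queue.isEmpty()) {
            int v = queue.poll();
            int d = dist.get(v);
            total += d;
            for (int w : adjacency.getOrDefault(v, List.of())) {
                if (!dist.containsKey(w)) {
                    dist.put(w, d + 1);
                    queue.add(w);
                }
            }
        }
        // After returning, 'dist' and 'queue' are unreachable and can be collected.
        return total;
    }

    public static void main(String[] args) {
        // Made-up example graph: 1 -> 2, 1 -> 3, 2 -> 3.
        Map<Integer, List<Integer>> adjacency = Map.of(
            1, List.of(2, 3),
            2, List.of(3),
            3, List.of());
        for (int v : adjacency.keySet()) {
            System.out.println(v + " -> " + sumOfDistances(adjacency, v));
        }
    }
}
```

By contrast, if each `dist` map were stored in a field or in a long-lived cache, all of them would stay reachable and no collector setting could reclaim them. Assigning `null` to a local variable just before it goes out of scope, as in the question's loops, does not change what is reachable.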

If you trim the unnecessary objects, but still don't have enough memory, you could investigate the use of more compact ways to store data. A key technique is the use of off-heap storage; this is more work than simply using objects, but can be more efficient in terms of both space and CPU if it is done correctly. See:
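The following is not taken from the answer's references; it is only a rough sketch of what compact, off-heap storage could look like for an edge list. The class `OffHeapEdgeList` and its methods are hypothetical names, and it assumes vertices have already been mapped to `int` ids. Each edge takes two ints (8 bytes) in a direct `ByteBuffer`, which is allocated outside the Java heap, instead of one object per edge plus hash map entries.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

class OffHeapEdgeList {
    // Int view over direct (off-heap) memory: [src0, dst0, src1, dst1, ...]
    private final IntBuffer edges;
    private int size = 0;

    OffHeapEdgeList(int capacity) {
        edges = ByteBuffer.allocateDirect(capacity * 2 * Integer.BYTES)
                          .order(ByteOrder.nativeOrder())
                          .asIntBuffer();
    }

    // Appends one directed edge; throws IndexOutOfBoundsException when full.
    void add(int source, int target) {
        edges.put(2 * size, source);
        edges.put(2 * size + 1, target);
        size++;
    }

    int source(int i) { return edges.get(2 * i); }

    int target(int i) { return edges.get(2 * i + 1); }

    int size() { return size; }

    // Linear scan counting the edges that leave 'vertex'.
    int outDegree(int vertex) {
        int count = 0;
        for (int i = 0; i < size; i++) {
            if (source(i) == vertex) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        OffHeapEdgeList list = new OffHeapEdgeList(3);
        list.add(0, 1);
        list.add(0, 2);
        list.add(1, 2);
        System.out.println("edges: " + list.size()
            + ", out-degree of 0: " + list.outDegree(0));
    }
}
```

For 700k edges this layout needs roughly 5.6 MB of direct memory. Direct buffers do not count against `-Xmx`; their total size is capped separately, and on HotSpot that cap can be set with `-XX:MaxDirectMemorySize`. The trade-off is that lookups such as `outDegree` have to be written by hand, which is the extra work the answer mentions.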