PYTHON: Reading in a text file does not work with the delimiter

I have a text file output from gem5 (i.e., I have no control over its format).

It looks like this:

    ---------- Begin Simulation Statistics ----------
    sim_seconds                                  9.553482                       # Number of seconds simulated
    sim_ticks                                9553481748000                       # Number of ticks simulated
    final_tick                               9553481748000                       # Number of ticks from beginning of simulation (restored from checkpoints and never reset)
    sim_freq                                 1000000000000                       # Frequency of simulated ticks
    host_inst_rate                                 911680                       # Simulator instruction rate (inst/s)
    host_op_rate                                  1823361                       # Simulator op (including micro ops) rate (op/s)
    host_tick_rate                             1669871119                       # Simulator tick rate (ticks/s)
    host_mem_usage                                 662856                       # Number of bytes of host memory used
    host_seconds                                  5721.09                       # Real time elapsed on the host
    sim_insts                                  5215804132                       # Number of instructions simulated
    sim_ops                                   10431608523                       # Number of ops (including micro ops) simulated

Using the csv module, I have problems with the whitespace-delimited rows. If I use a space as the delimiter, each of the repeated spaces is read in as an empty field; if I use \t, the rows are not split at all.

How can I easily deal with these spaces? I just want to read in the left column and the value associated with it.

Is csv import still suitable or is there something more powerful?


csv.reader can still work for your use case. Look at the skipinitialspace parameter of csv.reader:

    csv.reader(csvfile, delimiter=' ', skipinitialspace=True)

This splits each line on a space and ignores any additional spaces that immediately follow a delimiter, so runs of spaces do not produce empty fields.

    import csv

    with open('stats.txt', newline='') as csvfile:  # 'stats.txt' is a placeholder path
        r = csv.reader(csvfile, delimiter=' ', skipinitialspace=True)
        for row in r:
            print(row)

    ['sim_seconds', '9.553482', '#', 'Number', 'of', 'seconds', 'simulated']
    ['sim_ticks', '9553481748000', '#', 'Number', 'of', 'ticks', 'simulated']
    ['final_tick', '9553481748000', '#', 'Number', 'of', 'ticks', 'from', 'beginning', 'of', 'simulation', '(restored', 'from', 'checkpoints', 'and', 'never', 'reset)']
    ['sim_freq', '1000000000000', '#', 'Frequency', 'of', 'simulated', 'ticks']
    ['host_inst_rate', '911680', '#', 'Simulator', 'instruction', 'rate', '(inst/s)']
    ['host_op_rate', '1823361', '#', 'Simulator', 'op', '(including', 'micro', 'ops)', 'rate', '(op/s)']
    ['host_tick_rate', '1669871119', '#', 'Simulator', 'tick', 'rate', '(ticks/s)']
    ['host_mem_usage', '662856', '#', 'Number', 'of', 'bytes', 'of', 'host', 'memory', 'used']
    ['host_seconds', '5721.09', '#', 'Real', 'time', 'elapsed', 'on', 'the', 'host']
    ['sim_insts', '5215804132', '#', 'Number', 'of', 'instructions', 'simulated']
    ['sim_ops', '10431608523', '#', 'Number', '...']

You can then use only the first two values of each row.
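
As a rough sketch of that last step, the first two fields of each row can be collected into a dictionary. The helper name read_stats and the in-memory SAMPLE text are only illustrative assumptions (they are not part of gem5 or the csv module); with a real file you would pass the open file object instead. Values are kept as strings here, so convert them with int() or float() as needed.

```python
import csv
import io

# A few lines copied from the stats output in the question (illustrative sample).
SAMPLE = """\
---------- Begin Simulation Statistics ----------
sim_seconds                                  9.553482                       # Number of seconds simulated
sim_ticks                                9553481748000                       # Number of ticks simulated
host_seconds                                  5721.09                       # Real time elapsed on the host

"""


def read_stats(lines):
    """Hypothetical helper: map each stat name to its value (as a string)."""
    stats = {}
    reader = csv.reader(lines, delimiter=' ', skipinitialspace=True)
    for row in reader:
        # Skip blank lines (empty rows) and the '---------- ...' banner line.
        if len(row) < 2 or row[0].startswith('-'):
            continue
        # Only the name and the value matter; the rest is the '#' comment.
        stats[row[0]] = row[1]
    return stats


stats = read_stats(io.StringIO(SAMPLE))
print(stats['sim_seconds'])  # 9.553482
```

Skipping rows that start with '-' is just one simple way to drop the banner line; adjust the filter if your file contains other non-data lines.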