How to merge two tuples from different lists if their first elements match?


I've got two lists of tuples of the form:

playerinfo = [('ansonca01', 4, 1871, 1, 'RC1'), ('forceda01', 44, 1871, 1, 'WS3'), ('mathebo01', 68, 1871, 1, 'FW1')]

idmatch = [('ansonca01', 'Anson', 'Cap', '05/06/1871'), ('aaroh101', 'Aaron', 'Hank', '04/13/1954'), ('aarot101', 'Aaron', 'Tommie', '04/10/1962')]

How can I iterate through both lists and, whenever the first element of a tuple in "playerinfo" matches the first element of a tuple in "idmatch", merge the two matching tuples into one, yielding a new list of tuples of the form:

merged_data = [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'), (...), (...), etc.]

The new list of tuples would have the ID number matched to the first and last names of the correct player.

Background info: I'm trying to merge two CSV documents of baseball statistics, but the one with all of the relevant stats doesn't contain player names, only a reference number e.g. 'ansoc101', while the second document contains the reference number in one column and the first and last names of the corresponding player in the other.

The CSV is too large to do this manually (about 20,000 players), so I'm trying to automate the process.

You could first create a dictionary to enable fast ID number look-ups, and then merge the data from the two lists together very efficiently with a list comprehension:

import operator

playerinfo = [('ansonca01', 4, 1871, 1, 'RC1'),
              ('forceda01', 44, 1871, 1, 'WS3'),
              ('mathebo01', 68, 1871, 1, 'FW1')]

idmatch = [('ansonca01', 'Anson', 'Cap', '05/06/1871'),
           ('aaroh101', 'Aaron', 'Hank', '04/13/1954'),
           ('aarot101', 'Aaron', 'Tommie', '04/10/1962')]

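get_id = operator.itemgetter(0)  # Returns the ID field (first element) of a record.

# Map each ID to the rest of its idmatch record for fast look-ups.
idinfo = {get_id(rec): rec[1:] for rec in idmatch}

# Append the name fields to every playerinfo record whose ID is in idmatch;
# records without a matching ID are skipped.
merged = [info + idinfo[get_id(info)] for info in playerinfo if get_id(info) in idinfo]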

print(merged) # -> [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')]
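
Since the data actually comes from two CSV files, the same dictionary look-up could be applied directly to rows read with the standard csv module. The following is only a minimal sketch under some assumptions: the function name merge_player_rows and the in-memory io.StringIO objects are hypothetical stand-ins (the real files would be opened with open(..., newline='')), the ID is assumed to be in the first column of both files, and no header row is assumed. Note that csv.reader yields lists of strings, so the numeric fields stay as text here.

```python
import csv
import io


def merge_player_rows(stats_rows, name_rows):
    """Append the name fields to each stats row whose ID appears in name_rows."""
    # Map each player ID (first column) to the remaining fields of its name row.
    names_by_id = {row[0]: row[1:] for row in name_rows if row}
    # Keep only stats rows with a known ID, like the list comprehension above.
    return [row + names_by_id[row[0]]
            for row in stats_rows
            if row and row[0] in names_by_id]


# Hypothetical in-memory stand-ins for the two CSV files (assumed layout).
stats_csv = io.StringIO("ansonca01,4,1871,1,RC1\nforceda01,44,1871,1,WS3\n")
names_csv = io.StringIO("ansonca01,Anson,Cap,05/06/1871\naaroh101,Aaron,Hank,04/13/1954\n")

merged_rows = merge_player_rows(list(csv.reader(stats_csv)),
                                list(csv.reader(names_csv)))
print(merged_rows)
```

The merged rows could then be written back out with csv.writer. If players without a matching name should be kept rather than dropped, names_by_id.get(row[0], []) could replace the membership test.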