I'm trying to take a list and compare it to a text file, removing the elements from the list that appear in the text file.
The code I have is:
baselist = open("testfile.txt", 'r')
twolist = ["one","two","three","four","five"]
for y in baselist:
    for x in range(0,len(twolist)):
        print("Working %s vs %s") % (twolist[x], y)
        if twolist[x] == y:
            print("Match!")
            remove.twolist[x]
baselist.close()
When I run this, I can see in the output that it is comparing 'one' to 'one', etc., so obviously the problem lies in the line if twolist[x] == y:
but for the life of me I can't get it to work. I've read and read and googled and googled, but obviously I'm missing something. Can someone point me in the right direction?
- Opening files is usually better done using with.
- When reading from a file, the newline character is not removed; so, for example, 'two\n' != 'two' and your comparison test fails. Use .strip() or .rstrip() to remove whitespace, including the trailing newline.
- for index in range(len(mylist)) is usually a bad sign; it is better to operate on a list as for value in mylist and to filter it as [value for value in mylist if test(value)].
- Your first print statement is indented wrongly.
- Your remove syntax is wrong; it should be twolist.remove(x), where x is the value to remove rather than its index, and be aware that this only removes the first occurrence of x.
- Your algorithm is O(mn), where m is the number of lines in baselist and n is the number of items in twolist; with a bit of care, it could be O(m+n) instead.
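To make the newline and filtering points concrete, here is a minimal sketch. The question does not show what testfile.txt contains, so io.StringIO stands in for the file and its contents are an assumption made for illustration.

```python
import io

# Stand-in for testfile.txt; the contents are assumed, not taken from the question.
fake_file = io.StringIO("two\nfour\n")

twolist = ["one", "two", "three", "four", "five"]

# Lines read from a file keep their trailing newline.
lines = list(fake_file)
raw_match = "two" in lines  # False, because the line is actually 'two\n'

# Strip each line first, then filter with a comprehension
# instead of removing items by index while looping.
seen = [line.strip() for line in lines]
remaining = [value for value in twolist if value not in seen]

print(raw_match)   # False
print(remaining)   # ['one', 'three', 'five']
```

This version is still O(mn) because seen is a list; it only illustrates why the original comparison never matched.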
If the original order is important,
with open('testfile.txt') as inf:
    twoset = set(twolist).difference(line.strip() for line in inf)
twolist = [item for item in twolist if item in twoset]
Otherwise,
with open('testfile.txt') as inf:
    twolist = list(set(twolist).difference(line.strip() for line in inf))
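For anyone who wants to try the order-preserving approach end to end, the following is a rough sketch rather than a definitive implementation. The helper name remove_listed and the temporary file contents are hypothetical; they are not part of the question. It builds a set from the file lines so that each membership test is O(1) on average, which gives roughly O(m+n) overall.

```python
import os
import tempfile


def remove_listed(items, path):
    """Return the items that do not appear as a line in the file, keeping order."""
    with open(path) as inf:
        # Strip the trailing newline from each line and collect into a set.
        banned = {line.strip() for line in inf}
    return [item for item in items if item not in banned]


# Write a small temporary file with assumed contents to test against.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("two\nfour\n")
    tmp_path = tmp.name

try:
    result = remove_listed(["one", "two", "three", "four", "five"], tmp_path)
finally:
    os.remove(tmp_path)  # clean up the temporary file

print(result)  # ['one', 'three', 'five']
```

Building the set from the file rather than from the list means duplicates in the list are kept or dropped consistently, and the list's original order is preserved.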