Deleting items from a list that appears in an external text file

I'm trying to take a list and compare it to a text file, removing the elements from the list that appear in the text file.

The code I have is:

baselist = open("testfile.txt", 'r')
twolist = ["one","two","three","four","five"]
for y in baselist:
    for x in range(0,len(twolist)):
        print("Working %s vs %s") % (twolist[x], y)
        if twolist[x] == y:
            print("Match!")
            remove.twolist[x]
baselist.close()

When I run this, I can see in the output that it is comparing 'one' to 'one', etc., so the problem obviously lies in if twolist[x] == y:, but for the life of me I can't get it to work. I've read and read and googled and googled, but I'm clearly missing something. Can someone point me in the right direction?


  • opening files is usually better done with a with statement, which closes the file for you even if an error occurs

  • when reading from a file, the newline char is not removed; so, for example, 'two\n' != 'two' and your comparison-test fails. Use .strip() or .rstrip() to remove whitespace including the trailing newline

  • for index in range(len(mylist)) is usually a bad sign; better to operate on a list as for value in mylist and filter it as [value for value in mylist if test(value)]

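  • your first print call has the % formatting outside the parentheses; that works as a Python 2 print statement, but in Python 3 it raises a TypeError and should be print("Working %s vs %s" % (twolist[x], y))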

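  • your remove syntax is wrong; it should be twolist.remove(twolist[x]) (remove takes a value, not an index) or del twolist[x], and be aware that remove only deletes the first occurrence of the value; deleting from a list while looping over its indices also shifts the remaining items, so some get skipped or the index runs past the end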

  • your algorithm is O(mn) where m is the number of lines in baselist and n is the number of items in twolist; by using a set for the membership tests, it could be O(m+n) instead.
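
To make the newline and filtering points above concrete, here is a minimal sketch. It uses io.StringIO as a stand-in for testfile.txt so it runs without a file on disk; the file contents ("two" and "four") are assumed purely for illustration.

```python
import io

# Stand-in for open("testfile.txt"); the contents are an assumption.
fake_file = io.StringIO("two\nfour\n")

twolist = ["one", "two", "three", "four", "five"]

lines = list(fake_file)

# Lines read from a file keep their trailing newline.
print(lines[0] == "two")          # False: the line is 'two\n'
print(lines[0].strip() == "two")  # True once the newline is stripped

# Filter by value with a comprehension instead of indexing and removing.
to_drop = {line.strip() for line in lines}
remaining = [value for value in twolist if value not in to_drop]
print(remaining)
```

Building a new list this way avoids modifying twolist while looping over it, and the set keeps each membership test cheap.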

If the original order is important,

with open('testfile.txt') as inf:
    twoset = set(twolist).difference(line.strip() for line in inf)

twolist = [item for item in twolist if item in twoset]

Otherwise,

with open('testfile.txt') as inf:
    twolist = list(set(twolist).difference(line.strip() for line in inf))
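
For a version that can be run end to end, the sketch below wraps the order-preserving approach in a function. The name remove_listed and the temporary file contents are hypothetical, chosen only for this example. It is a slight variation on the snippets above: it builds a set from the file's lines and filters with not in, rather than calling set.difference.

```python
import os
import tempfile


def remove_listed(items, path):
    """Return the items that do not appear as a line in the file at path.

    The original order of items is kept. (Hypothetical helper name.)
    """
    with open(path) as inf:
        listed = {line.strip() for line in inf}
    return [item for item in items if item not in listed]


# Write a small throwaway file so the sketch is runnable on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("two\nfour\n")

try:
    result = remove_listed(["one", "two", "three", "four", "five"], tmp.name)
finally:
    os.remove(tmp.name)  # clean up the temporary file

print(result)
```

Reading the file into a set once and then making a single pass over the list is what gives the O(m+n) behaviour mentioned earlier.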