I have a variable vk_read, produced by Python's HTMLParser, which holds data like this: ['id168233095']
Now I'm trying to collect every value of vk_read into one list while the script runs. The result should look like: ['id168233095', 'id1682334534', 'id16823453', 'etc...']
    if vk_read:
        vk_ids = []
        for line in vk_read:
            if vk_read != '':
                vk_ids.append(vk_read)
                print(vk_ids)
This is the result:
['id168233095']
['id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095']
After some advice, I changed the code (the full version is at the end of this post):
    if vk_read not in vk_ids:
        vk_ids.append(vk_read)
        print(vk_ids)
But in this case the result is:
['id45849605']
['id91877071']
['id17422363']
['id119899405']
['id65045632']
['id168233095']
That means vk_read is added to the list up to 10 times before my script starts adding the next value.
I also tried list.insert() and got the same result.
How can I run this loop so that all the different results end up in one list, however many times the script finds data in the parsed file?
Nota bene: I've updated the code as advised, using list1.append(list0), but in my case this method still returns the same result as described above. I also renamed the list to avoid further confusion.
LAST UPDATE: Thanks for the help, guys, you really pushed me in the right direction (same question on Stack Overflow).
The problem appears to be that you are reinitializing the list to an empty list in each iteration:
    from html.parser import HTMLParser
    import re, sys, random, csv

    with open('test.html', 'r', encoding='utf-8') as content_file:
        read_data = content_file.read()

    vk_ids = []

    class MyHTMLParser(HTMLParser):
        def handle_starttag(self, tag, attrs):
            href = str(attrs)
            for line in href:
                # raw strings keep the regex backslashes intact
                id_tag = re.findall(r'/\S+$', href)
                id_raw = str(id_tag)
                if re.search(r"/\w+'\)\]", id_raw):
                    global vk_read
                    vk_read = id_raw
                else:
                    break
                for ch in ['/', ')', '[', ']', '"', "'"]:
                    if ch in vk_read:
                        vk_read = vk_read.replace(ch, "")
                # https://stackoverflow.com/questions/30328193/python-add-string-to-a-list-loop
                for vk_id in vk_read:
                    if vk_id not in vk_ids:
                        vk_ids.append(vk_read)
                        break
                print(vk_ids)
                break
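One detail worth checking in the code above, offered as a hedged aside: iterating over a string yields its characters, so `for vk_id in vk_read` compares single characters such as 'i' against the whole ids stored in the list, and that comparison can never find a match. A minimal sketch with illustrative sample values (not taken from the real file):

```python
vk_ids = ['id168233095']   # illustrative: one id already collected
vk_read = 'id168233095'    # illustrative: the same id seen again

# Iterating a string yields one character at a time.
chars = list(vk_read)[:3]

# A single character never equals a whole id, so this check always passes
# and the duplicate would be appended again.
char_missing = all(ch not in vk_ids for ch in vk_read)

# Testing the whole string gives the intended duplicate check.
already_seen = vk_read in vk_ids

print(chars, char_missing, already_seen)
```

Testing `vk_read not in vk_ids` directly, without the inner `for` loop, is probably what was intended.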
N.B. After the last changes:
print(type(vk_ids))
<class 'list'>
It appears that you are inside a loop and vk_read is a string that changes at each iteration, so the list should be initialized once, outside that loop:
    vk_ids = []  ## initialize list outside the main loop

    ## main loop
    for some_variable in some_kind_of_iterator:  ## this is just a placeholder, I don't know what your loop looks like
        ## get the value for vk_read
        vk_read = ...
        ## append to vk_ids
        if vk_read and vk_read not in vk_ids:
            vk_ids.append(vk_read)

    print(vk_ids)  ## Python 3 print, matching the html.parser import in the question
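Applied to an HTMLParser subclass, the same idea might look like the sketch below. It is only an illustration under assumptions: the class name `VkIdParser`, the sample markup, and the guess that each id is the last path segment of an `href` are all hypothetical, since the real test.html is not shown. The list lives on the parser instance, so it is created once rather than on every tag.

```python
from html.parser import HTMLParser
import re


class VkIdParser(HTMLParser):
    """Collects unique ids from href attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.vk_ids = []  # created once per parser, not once per tag

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        for name, value in attrs:
            if name != 'href' or value is None:
                continue
            # Assumption: the id is the last path segment, e.g. "/id168233095"
            match = re.search(r'/(\w+)$', value)
            if match and match.group(1) not in self.vk_ids:
                self.vk_ids.append(match.group(1))


# Made-up sample markup standing in for test.html
sample_html = (
    '<a href="/id168233095">a</a>'
    '<a href="/id45849605">b</a>'
    '<a href="/id168233095">duplicate</a>'
)

parser = VkIdParser()
parser.feed(sample_html)
print(parser.vk_ids)
```

Keeping the list as an instance attribute also avoids the `global` statement, and reading `attrs` as pairs avoids having to strip brackets and quotes from `str(attrs)`.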