I am trying to extract some data from a website. However, the site has a hierarchical structure. It has a dropdown menu at the top, whose option values are URLs. Thus, my approach is:
- find the dropdown box,
- select an option,
- extract some data, and
- repeat steps 2 and 3 for all available options.
Below is my code. I am able to extract data under the default selected option (the first one), but for the next option I get the error "Message: Element not found in the cache - perhaps the page has changed since it was looked up". It seems like my browser was not switched to the new page. I tried time.sleep() and driver.refresh(), but neither helped. Any suggestions are appreciated!
### html
<select class="form-control">
<option value="/en/url1">001 Key</option>
<option value="/en/url2">002 Key</option>
</select>
### python code
# select the dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
# get all options
options = select_box.options
for ele_index, element in enumerate(options):
    # select a url
    select_box.select_by_index(ele_index)
    time.sleep(5)
    print element.text
    # extract page data
    id_comp_html = driver.find_elements_by_class_name('HorDL')
    for q in id_comp_html:
        print q.get_attribute("innerHTML")
        print "============="
Update 1 (based on alecxe's solution):
# dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
options = select_box.options
for ele_index in range(len(options)):
    # select a url
    select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
    print select_box.options[ele_index].text
    select_box.select_by_index(ele_index)
    # print element.text
    # print "======"
    driver.implicitly_wait(5)
    id_comp_html = driver.find_elements_by_class_name('HorDL')
    for q in id_comp_html:
        print q.get_attribute("innerHTML")
        print "============="
Your select_box and element references went stale once the page changed. You have to "re-find" the select element on every iteration and work with option indexes inside the loop:
# select the dropdown menu
select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
# get all options
options = select_box.options
for ele_index in range(len(options)):
    # select a url
    select_box = Select(driver.find_element_by_xpath("//select[@class='form-control']"))
    select_box.select_by_index(ele_index)
    # ...
    element = select_box.options[ele_index]
You might also need to navigate back after selecting an option and extracting the desired data. This can be done via driver.back().
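Since the question says the option values are URLs, a different technique from the re-find loop above may also work: copy each option's label and URL out as plain strings first, then load every URL directly. Strings cannot go stale, so no element needs to be re-found and driver.back() is not needed. This is only a sketch. It assumes that selecting an option does nothing more than navigate to the URL in its value attribute, which the question does not confirm. The helper name scrape_all_options is hypothetical, and it uses the same older find_elements_by_* calls as the code above (in Selenium 4 these are written driver.find_elements(By.XPATH, ...)).

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2


def scrape_all_options(driver, select_xpath="//select[@class='form-control']"):
    """Visit every URL listed in the dropdown and collect the 'HorDL' blocks.

    Hypothetical helper, not part of Selenium. Assumes that choosing an
    option simply navigates to the URL stored in its value attribute.
    """
    base_url = driver.current_url

    # Read label and URL into plain strings before navigating anywhere:
    # strings stay valid after a page change, element references do not.
    targets = []
    for option in driver.find_elements_by_xpath(select_xpath + "/option"):
        url = urljoin(base_url, option.get_attribute("value"))
        targets.append((option.text, url))

    results = []
    for label, url in targets:
        driver.get(url)
        # Look the data elements up again on the freshly loaded page.
        blocks = driver.find_elements_by_class_name("HorDL")
        results.append((label, [b.get_attribute("innerHTML") for b in blocks]))
    return results
```

Calling scrape_all_options(driver) returns a list of (option label, list of innerHTML strings) pairs, one per dropdown option. If the site runs extra JavaScript when an option is selected, this shortcut would skip it, and the re-find loop is the safer choice.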