This is my first time posting, so please be gentle. I'm extracting data from TripAdvisor. Each review's rating is shown as a bubble figure, which is represented in the HTML like this:
<span class="ui_bubble_rating bubble_40"></span>
As you can see, there is a "40" at the end that represents 4 stars. The same goes for "20" (2 stars), and so on.
How can I obtain the "ui_bubble_rating bubble_40"? Thank you in advance.
I'm not sure if this is the most efficient way of doing that, but here's how I'd do it:
import re
tags = soup.find_all(class_=re.compile(r"bubble_\d\d"))
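The tags variable will then contain every tag in the page that has a class matching the regex
bubble_\d\d. Note that tags is a list of tags, not a string, and each tag's "class" attribute is itself a list of class names. So to extract the number, take one tag, pick the bubble class (the last one in the markup above), and split it, like so:
stars = int(tags[0]["class"][-1].split("_")[-1]) // 10  # "bubble_40" -> "40" -> 4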
If you want to be fancy, you can use a list comprehension to extract the number from every tag:
stars = [int(tag["class"][-1].split("_")[-1]) // 10 for tag in tags]
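If you'd rather not depend on the order of the class names, here is a rough sketch of a small helper that looks through a tag's class list for the bubble class. The names rating_from_classes and BUBBLE_CLASS are made up for this example, and it only uses the standard library, so you can try it without a page loaded. It divides by 10 instead of floor-dividing, on the assumption that a value like 45 might show up and should not be truncated to 4.

```python
import re

# Matches a single class name such as "bubble_40" and captures the two digits.
# (BUBBLE_CLASS and rating_from_classes are hypothetical names for this sketch.)
BUBBLE_CLASS = re.compile(r"^bubble_(\d\d)$")


def rating_from_classes(classes):
    """Return the rating encoded in a list of CSS class names, or None."""
    for name in classes:
        match = BUBBLE_CLASS.match(name)
        if match:
            # "40" -> 4.0; true division keeps a value like "45" as 4.5.
            return int(match.group(1)) / 10
    # No bubble_NN class was found on this tag.
    return None


# The class list BeautifulSoup would give for the span in the question.
print(rating_from_classes(["ui_bubble_rating", "bubble_40"]))  # 4.0
```

With the tags from find_all, this should work as [rating_from_classes(tag["class"]) for tag in tags], since BeautifulSoup normally returns the class attribute as a list of names.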