Parsing the class name of a figure with Beautiful Soup


This is my first time posting, so please be gentle. I'm extracting data from TripAdvisor. Each review's rating is shown as a figure that is represented like this:

<span class="ui_bubble_rating bubble_40"></span>

As you can see, there is a "40" at the end that represents 4 stars. The same happens with "20" (2 stars), etc.

How can I obtain the "ui_bubble_rating bubble_40"? Thank you in advance...


I'm not sure if this is the most efficient way of doing that, but here's how I'd do it:

import re
tags = soup.find_all(class_=re.compile(r"bubble_\d\d"))

The tags variable will then hold every tag on the page whose class matches the regex bubble_\d\d. Note that tag["class"] is a list of class names, not a single string. After that, you just need to extract the number from the bubble_NN entry, like so:

# tag["class"] is a list such as ["ui_bubble_rating", "bubble_40"]
stars = tags[0]["class"][-1].split("_")[1]
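
To see the pieces working together, here is a minimal end-to-end sketch. It assumes the bs4 package is installed and uses sample markup modelled on the span in the question; the second span and the surrounding div are made up for illustration. It also assumes the bubble_NN class comes last in the class list, as it does in the question's snippet.

```python
import re
from bs4 import BeautifulSoup

# Sample markup modelled on the question; the second span is invented.
html = """
<div>
  <span class="ui_bubble_rating bubble_40"></span>
  <span class="ui_bubble_rating bubble_20"></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ (trailing underscore) is used because `class` is a reserved word in Python.
tags = soup.find_all(class_=re.compile(r"bubble_\d\d"))

# "class" is a multi-valued attribute, so this is a list of class names.
first_classes = tags[0]["class"]

# Rejoin the list to get the attribute exactly as written in the HTML.
full_class = " ".join(first_classes)

# Assumes the bubble_NN class is the last one in the list.
stars = first_classes[-1].split("_")[1]

print(full_class)
print(stars)
```

Joining the list with a space gives back the full "ui_bubble_rating bubble_40" string the question asks for, while stars holds just the digits as a string.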

If you want to be fancy, you can use list comprehensions to extract the numbers from every tag:

stars = [tag["class"][-1].split("_")[1] for tag in tags]
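
If the position of the bubble_NN class within the list is not guaranteed, a slightly more defensive variant is to search the class names rather than index into them. The sketch below uses only the standard library; stars_from_classes is a hypothetical helper, not part of Beautiful Soup, and dividing by ten is an assumption based on the question's "40 means 4 stars" observation.

```python
import re

# Matches a class name that is exactly "bubble_" followed by two digits.
BUBBLE_RE = re.compile(r"^bubble_(\d\d)$")


def stars_from_classes(classes):
    """Return the rating as a float, or None if no bubble_NN class is present.

    `classes` is a list of class names, such as the value of tag["class"].
    Hypothetical helper; the divide-by-ten rule is an assumption.
    """
    for name in classes:
        match = BUBBLE_RE.match(name)
        if match:
            return int(match.group(1)) / 10
    return None


# Stand-ins for the tag["class"] lists of three tags.
class_lists = [
    ["ui_bubble_rating", "bubble_40"],
    ["bubble_20", "ui_bubble_rating"],
    ["unrelated"],
]

ratings = [stars_from_classes(classes) for classes in class_lists]
print(ratings)  # [4.0, 2.0, None]
```

With real tags, the same comprehension would read [stars_from_classes(tag["class"]) for tag in tags].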