 

How to scrape all customer reviews?

I am trying to scrape all reviews from this website: https://www.backmarket.com/en-us/r/l/airpods/345c3c05-8a7b-4d4d-ac21-518b12a0ec17. The website says there are 753 reviews, but when I try to scrape them, I only get 10. I am not sure how to scrape all 753 reviews from the page. Here is my code:

# importing modules 
import pandas as pd
from requests import get
from bs4 import BeautifulSoup

# Fetch the web page
url = 'https://www.backmarket.com/en-us/r/l/airpods/345c3c05-8a7b-4d4d-ac21-518b12a0ec17'
response = get(url) # link excludes posts with no pictures
page = response.text

# Parse the HTML content
soup = BeautifulSoup(page, 'html.parser')

# To see different information
## reviewer's name 

reviewers_name = soup.find_all('p', class_='body-1-bold') 
[x.text for x in reviewers_name]

name = []

for items in reviewers_name:
    name.append(items.text if items else None)

## Purchase Data 

purchase_date = soup.find_all('p', class_='text-static-default-low body-2')
[x.text for x in purchase_date]

date = []
for items in purchase_date:
    date.append(items.text if items else None)


## Country 

country_text = soup.find_all('p', class_='text-static-default-low body-2 mt-32')
[x.text for x in country_text]

country = []

for items in country_text:
    country.append(items.text if items else None)


## Reviewed Products 

products_text = soup.find_all('span', class_= 'rounded-xs inline-block max-w-full truncate body-2-bold px-4 py-0 bg-static-default-mid text-static-default-hi')
[x.text for x in products_text]

products = []

for items in products_text:
    products.append(items.text if items else None)

## Actual Reviews 

review_text = soup.find_all('p',class_='body-1 block whitespace-pre-line')
[x.text for x in review_text]

review = []

for items in review_text:
    review.append(items.text if items else None)


## Review Ratings 

review_ratings_value = soup.find_all('span',class_='ml-4 mt-1 md:mt-2 body-2-bold')
[x.text for x in review_ratings_value]

review_ratings = []

for items in review_ratings_value:
    review_ratings.append(items.text if items else None)



# Create the Data Frame 
pd.DataFrame({
    'reviewers_name': name,
    'purchase_date': date,
    'country': country,
    'products': products,
    'review': review,
    'review_ratings': review_ratings
})

My question is how I can scrape all reviews.

Asked Nov 18 '25 by Sharif

1 Answer

The page only embeds the first few reviews in its HTML; the full list is served by a paginated JSON endpoint, which you can request directly and page through with its cursor. You can try the following (note: the site has "Too many requests" protection, so when you get HTTP status code 429 you have to wait some time before continuing):

import time
import requests


url = "https://www.backmarket.com/reviews/product-landings/345c3c05-8a7b-4d4d-ac21-518b12a0ec17/products/reviews"

n, current_url = 1, url
while True:
    response = requests.get(current_url)

    # too many requests?
    if response.status_code == 429:
        print("Too many requests...")
        time.sleep(2)
        continue

    data = response.json()

    # print every review comment on the current page
    for r in data.get("results", []):
        print(n, "-" * 80)
        print(r["comment"])
        n += 1

    # follow the cursor pagination until the API stops returning nextCursor
    next_cursor = data.get("nextCursor")
    if not next_cursor:
        break

    current_url = f"{url}?cursor={next_cursor}"

Prints:


...
748 --------------------------------------------------------------------------------
Ordered space gray but received silver
749 --------------------------------------------------------------------------------
Lately, every item we’ve ordered has had to be returned. I’m pretty bummed.
750 --------------------------------------------------------------------------------
They broke.
751 --------------------------------------------------------------------------------
They’re weren’t noise cancelling like it says in the description so it wasn’t what I wanted but other than that they are great.
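
If you also want the reviewer name, purchase date, country, product and rating in a DataFrame, as in your original code, you can collect the raw results from every page and flatten them with pandas. This is a minimal sketch built on the same endpoint and the results/nextCursor/comment keys used above; the other field names inside each review object are not shown here, so inspect one result (or df.columns) to pick the columns you need:

import time

import pandas as pd
import requests


url = "https://www.backmarket.com/reviews/product-landings/345c3c05-8a7b-4d4d-ac21-518b12a0ec17/products/reviews"

all_results = []
current_url = url
while True:
    response = requests.get(current_url)

    # back off and retry when the rate limit kicks in
    if response.status_code == 429:
        time.sleep(2)
        continue

    data = response.json()
    all_results.extend(data.get("results", []))

    # follow the cursor pagination until the API stops returning nextCursor
    next_cursor = data.get("nextCursor")
    if not next_cursor:
        break
    current_url = f"{url}?cursor={next_cursor}"

# flatten the raw review objects into a DataFrame; the column names come
# straight from the API response, so check them before selecting
df = pd.json_normalize(all_results)
print(len(df))      # should be roughly the 753 reviews shown on the page
print(df.columns)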
Answered Nov 20 '25 by Andrej Kesely


