Goal

Build a valid JSON payload and use it to scrape the Music category. By the end, you should understand how url, itemsSelector, fields, pagination, and limits fit together.

Target payload

{
  "url": "https://books.toscrape.com/catalogue/category/books/music_14/index.html",
  "pagination": {
    "type": "nextLink",
    "selector": "li.next a",
    "attr": "href"
  },
  "itemsSelector": "article.product_pod",
  "fields": {
    "title": { "selector": "h3 a", "mode": "attr", "attr": "title" },
    "detailLink": { "selector": "h3 a", "mode": "attr", "attr": "href" },
    "price": { "selector": ".price_color", "mode": "text" },
    "stock": { "selector": ".instock.availability", "mode": "text" }
  },
  "limits": {
    "maxPages": 2,
    "maxItems": 30,
    "maxConcurrency": 4,
    "timeoutMs": 15000
  }
}
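
Before running the payload, its shape can be checked offline with a few lines of Python. This is only a minimal sketch: the check_payload helper is hypothetical, and the rules it applies (which keys are required, that mode "attr" needs an attr name) are assumptions inferred from the payload above, not the scraper's own validation.

```python
import json

# Assumption: these keys are treated as required because the lesson's payload
# always includes them. The real scraper may be stricter or more lenient.
REQUIRED_KEYS = ("url", "itemsSelector", "fields")


def check_payload(raw: str) -> list[str]:
    """Return a list of problems found in a payload string (empty if none)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(payload, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]

    fields = payload.get("fields", {})
    if not isinstance(fields, dict):
        return problems + ["fields must be a JSON object"]
    for name, spec in fields.items():
        if not isinstance(spec, dict) or "selector" not in spec:
            problems.append(f"field {name!r}: needs a 'selector'")
        elif spec.get("mode") == "attr" and "attr" not in spec:
            problems.append(f"field {name!r}: mode 'attr' needs an 'attr' name")
    return problems


# A shortened copy of the target payload, used only to exercise the helper.
SAMPLE = """
{
  "url": "https://books.toscrape.com/catalogue/category/books/music_14/index.html",
  "itemsSelector": "article.product_pod",
  "fields": {
    "title": { "selector": "h3 a", "mode": "attr", "attr": "title" },
    "price": { "selector": ".price_color", "mode": "text" }
  }
}
"""

print(check_payload(SAMPLE))
```

A stray trailing comma or a missing quote shows up as a "not valid JSON" entry, which is usually the first thing to rule out when a payload is rejected.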

Tasks

  • Run the payload and confirm you get multiple music items.
  • Add a ratingClass field that reads p.star-rating with mode: "attr" and attr: "class".
  • Set maxPages to 1 and note how the result count changes.
  • In one sentence: why must itemsSelector match each card, not the whole page?
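
The ratingClass and itemsSelector tasks can also be explored offline. The sketch below is an illustration under stated assumptions, not how the scraper is implemented: the markup is a hand-written stand-in for two product cards, Python's standard xml.etree.ElementTree replaces real CSS selector matching, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a listing page with two cards.
PAGE = """
<section>
  <article class="product_pod">
    <p class="star-rating Three"></p>
    <h3><a href="a/index.html" title="Album One">Album One</a></h3>
  </article>
  <article class="product_pod">
    <p class="star-rating Five"></p>
    <h3><a href="b/index.html" title="Album Two">Album Two</a></h3>
  </article>
</section>
""".strip()


def first_with_class(scope, tag, cls):
    """Return the first <tag> under scope whose class list contains cls."""
    for el in scope.iter(tag):
        if cls in el.get("class", "").split():
            return el
    return None


def read_fields(scope):
    """Read one row of fields from scope; only the first match of each is used."""
    link = scope.find(".//h3/a")
    rating = first_with_class(scope, "p", "star-rating")
    return {"title": link.get("title"), "ratingClass": rating.get("class")}


def scrape_per_card(markup):
    """Scope the fields to each card, as itemsSelector is meant to do."""
    root = ET.fromstring(markup)
    cards = [a for a in root.iter("article")
             if "product_pod" in a.get("class", "").split()]
    return [read_fields(card) for card in cards]


def scrape_whole_page(markup):
    """Treat the whole page as a single item: only one row comes back."""
    return [read_fields(ET.fromstring(markup))]


print(scrape_per_card(PAGE))
print(scrape_whole_page(PAGE))
```

With per-card scoping each card yields its own row, and ratingClass carries the full class string of that card's rating element. When the whole page is treated as one item, every field resolves to its first match on the page, so only one row is produced.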
