Scraper Lesson 1 — Music
Build a scraper JSON payload for a music category listing
Goal
Build a valid JSON payload and use it to scrape the music category. By the end, you should understand how url, itemsSelector, fields, pagination, and limits fit together.
Target payload
{
  "url": "https://books.toscrape.com/catalogue/category/books/music_14/index.html",
  "pagination": {
    "type": "nextLink",
    "selector": "li.next a",
    "attr": "href"
  },
  "itemsSelector": "article.product_pod",
  "fields": {
    "title": { "selector": "h3 a", "mode": "attr", "attr": "title" },
    "detailLink": { "selector": "h3 a", "mode": "attr", "attr": "href" },
    "price": { "selector": ".price_color", "mode": "text" },
    "stock": { "selector": ".instock.availability", "mode": "text" }
  },
  "limits": {
    "maxPages": 2,
    "maxItems": 30,
    "maxConcurrency": 4,
    "timeoutMs": 15000
  }
}
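To see how url, pagination, and limits in the payload above might interact, here is a minimal Python sketch. It is not the scraper's actual implementation: the crawl function, the fetch_page callback, and the three-page fake site (including the page-2.html and page-3.html addresses) are all assumptions made for illustration, and nothing is fetched over the network. It assumes the scraper resolves each relative next-link href against the current page URL and stops as soon as either maxPages or maxItems is reached.

```python
from urllib.parse import urljoin


def crawl(payload, fetch_page):
    """Follow nextLink-style pagination until a limit is hit.

    fetch_page(url) is a hypothetical callback that returns
    (items_on_page, next_href_or_None).
    """
    limits = payload["limits"]
    url = payload["url"]
    items, pages_visited = [], 0
    while url and pages_visited < limits["maxPages"] and len(items) < limits["maxItems"]:
        page_items, next_href = fetch_page(url)
        pages_visited += 1
        # Never collect more than maxItems in total.
        remaining = limits["maxItems"] - len(items)
        items.extend(page_items[:remaining])
        # A relative href such as "page-2.html" resolves against the current URL.
        url = urljoin(url, next_href) if next_href else None
    return items, pages_visited


# Made-up three-page listing with 20 items per page (illustration only).
BASE = "https://books.toscrape.com/catalogue/category/books/music_14/"
FAKE_SITE = {
    BASE + "index.html": (["item"] * 20, "page-2.html"),
    BASE + "page-2.html": (["item"] * 20, "page-3.html"),
    BASE + "page-3.html": (["item"] * 20, None),
}

payload = {"url": BASE + "index.html", "limits": {"maxPages": 2, "maxItems": 30}}
items, pages = crawl(payload, FAKE_SITE.__getitem__)
print(len(items), pages)  # 30 2
```

In this sketch maxItems cuts the second page short, while lowering maxPages to 1 would stop after the first page regardless of how many items remain.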
Tasks
- Run the payload and confirm you get multiple music items.
- Add ratingClass from p.star-rating with mode: "attr" and attr: "class".
- Set maxPages to 1 and note how the result count changes.
- In one sentence: why must itemsSelector match each card, not the whole page?
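The relationship between itemsSelector and fields in the tasks above can be illustrated with a small Python sketch. It swaps in the standard library's ElementTree and its limited XPath support in place of CSS selectors, and it runs against a trimmed, well-formed stand-in for two listing cards. The titles, prices, and links are made up, the extract helper is hypothetical, and the real scraper may work differently. The point is only that each field lookup is scoped to one card.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for two listing cards (made-up titles, prices, and links).
HTML = """
<section>
  <article class="product_pod">
    <p class="star-rating Three"></p>
    <h3><a href="../../../book-a_1/index.html" title="Book A">Book A</a></h3>
    <p class="price_color">£10.00</p>
  </article>
  <article class="product_pod">
    <p class="star-rating Five"></p>
    <h3><a href="../../../book-b_2/index.html" title="Book B">Book B</a></h3>
    <p class="price_color">£20.00</p>
  </article>
</section>
"""

root = ET.fromstring(HTML)


def extract(scope):
    """Read each field from the first match inside the given scope."""
    link = scope.find(".//h3/a")
    rating = next(
        p for p in scope.iter("p") if "star-rating" in p.get("class", "").split()
    )
    return {
        "title": link.get("title"),                               # mode: "attr"
        "price": scope.find(".//p[@class='price_color']").text,   # mode: "text"
        "ratingClass": rating.get("class"),                       # mode: "attr"
    }


# Scoped to each card: one row per product.
cards = root.findall(".//article[@class='product_pod']")
rows = [extract(card) for card in cards]

# Scoped to the whole page: every field resolves to its first match only.
whole_page = [extract(root)]

print(len(rows), len(whole_page))  # 2 1
```

Here ratingClass comes back as the raw class string (for example "star-rating Three"), so any conversion to a number would be a separate step after scraping.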