Scraper Lesson 2 — Vacation
Customize scraper JSON for a travel/vacation listing
Goal
Scrape the travel category and practice renaming field keys for a vacation-style dataset.
Target payload
{
  "url": "https://books.toscrape.com/catalogue/category/books/travel_2/index.html",
  "pagination": {
    "type": "nextLink",
    "selector": "li.next a",
    "attr": "href"
  },
  "itemsSelector": "article.product_pod",
  "fields": {
    "destinationTitle": { "selector": "h3 a", "mode": "attr", "attr": "title" },
    "listingLink": { "selector": "h3 a", "mode": "attr", "attr": "href" },
    "budgetPrice": { "selector": ".price_color", "mode": "text" },
    "status": { "selector": ".instock.availability", "mode": "text" }
  },
  "limits": {
    "maxPages": 2,
    "maxItems": 30,
    "maxConcurrency": 4,
    "timeoutMs": 15000
  }
}
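Both listingLink and the pagination selector read a raw href attribute, and such values are often relative paths. Whether this scraper resolves them for you is not stated here, so the Python sketch below only shows how a relative href resolves against the payload url with the standard library. The two sample hrefs and the resolve helper are illustrative assumptions, not values taken from a real response.

```python
from urllib.parse import urljoin

# The page URL from the target payload.
PAGE_URL = "https://books.toscrape.com/catalogue/category/books/travel_2/index.html"

# Hypothetical attribute values, used only for illustration.
sample_next_href = "page-2.html"
sample_listing_href = "../../../example-book_1/index.html"


def resolve(base_url: str, href: str) -> str:
    """Resolve a possibly relative href against the page it was found on."""
    return urljoin(base_url, href)


next_page_url = resolve(PAGE_URL, sample_next_href)
listing_url = resolve(PAGE_URL, sample_listing_href)

print(next_page_url)
print(listing_url)
```

If the response returns relative links, a step like this would be needed before following listingLink values in a later request.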
Tasks
- Run the starter payload and verify all four fields appear in the response.
- Rename budgetPrice to tripCost and run again.
- Add ratingClass from p.star-rating (mode: "attr", attr: "class").
- Break one selector on purpose, observe the error, then fix it for a clean final run.
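The rename and add-field tasks can be sketched in Python as edits to the fields object before the payload is sent. This is a minimal sketch: rename_field is a hypothetical helper written for this example, not part of the scraper, and editing the JSON by hand gives the same result.

```python
import json

# The "fields" object from the starter payload.
starter_fields = {
    "destinationTitle": {"selector": "h3 a", "mode": "attr", "attr": "title"},
    "listingLink": {"selector": "h3 a", "mode": "attr", "attr": "href"},
    "budgetPrice": {"selector": ".price_color", "mode": "text"},
    "status": {"selector": ".instock.availability", "mode": "text"},
}


def rename_field(fields: dict, old: str, new: str) -> dict:
    """Return a copy of fields with one key renamed, keeping key order."""
    if old not in fields:
        raise KeyError(f"no field named {old!r}")
    return {(new if key == old else key): spec for key, spec in fields.items()}


# Task 2: rename budgetPrice to tripCost (the selector is unchanged).
updated_fields = rename_field(starter_fields, "budgetPrice", "tripCost")

# Task 3: add ratingClass, read from the class attribute of p.star-rating.
updated_fields["ratingClass"] = {
    "selector": "p.star-rating",
    "mode": "attr",
    "attr": "class",
}

print(json.dumps(updated_fields, indent=2))
```

Only the key changes in the rename; the selector still points at .price_color, so the same value should come back under the new name.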
Reflection
Answer briefly: (1) hardest field to model, (2) why limits matter, (3) what you would change if there were no next-page link.