Python – Scrapy selection id with wildcards

I’m new to Scrapy and Python, and I’m trying to write a spider that scrapes prices from a Magento website. My spiders work fine on non-Magento sites, but on Magento I can get the product name and availability and not the price, because Magento gives the price element a different id for each product.

Here is the relevant HTML for the Magento website:

<span class="price" id="price-including-tax-1722">18,60 euros</span>

I tried [starts-with(@class, "price-including-tax-")] and many other options, but none of them seemed to work.

How can I use a wildcard to select any id of the form price-including-tax-****?

Solution

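You can use a CSS attribute selector. The *= operator matches any id that contains the given text; ^= would match only ids that start with it: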

span[id*='price-including-tax']

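or the equivalent XPath. Note that it tests @id, not @class as in your attempt, because the changing value is in the id attribute: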

//span[starts-with(@id,"price-including-tax-")]

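Keep in mind that an XPath step needs a node test in front of the predicate: either a tag name such as span, or * to match any element. A bare [starts-with(...)] with no tag is not a valid expression. * works, but it is better avoided when you know the tag, since it tests every element and can match ones you did not intend.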
