The target site is classcentral.com. I scraped the whole site for the following data points:
- course's subject.
- course's title.
- course's URL.
I got 16173 records. classcentral.com lists free and partially free online courses offered by different universities and institutes across a range of topics and subjects, and ranks them by popularity and content.
- Install Scrapy:
pip install scrapy
- Open a CLI, go to the project directory, and run this command:
scrapy crawl subject
- If you want data for a specific subject only:
scrapy crawl subject -a subject="programming"
- To save the scraped data as CSV, XML, or JSON, use the -o option with the matching file extension:
scrapy crawl subject -o data.csv
scrapy crawl subject -o data.json
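
Once the data is exported, a quick sanity check is to count how many courses were collected per subject. The following is a minimal sketch using only the Python standard library. It assumes the CSV header contains a column named `subject` (the actual column names depend on the item fields defined in the spider), and the helper name `count_courses_by_subject` is hypothetical.

```python
import csv
from collections import Counter


def count_courses_by_subject(csv_path):
    """Return a Counter mapping each subject to its number of course rows."""
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # "subject" is an assumed column name; adjust it to match
            # the header of the exported file.
            counts[row["subject"]] += 1
    return counts


if __name__ == "__main__":
    # Print the ten subjects with the most scraped courses.
    for subject, total in count_courses_by_subject("data.csv").most_common(10):
        print(f"{total:6d}  {subject}")
```

If the totals summed over all subjects differ from the number of records the crawl reported, the export was probably interrupted or written to a file that already contained older rows.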