ScrapeIt

Scrapy bots that scrape data from popular websites. The bots are built with Scrapy, a popular Python package.

How to start

First, create a Scrapy project.

scrapy startproject projname

Then change into the project directory and generate a spider for the website to be scraped.

cd projname
scrapy genspider spiderclassname domain

Note: The domain name should not include https://
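If you are starting from a full URL, the domain argument can be derived by dropping the scheme and path. The following is a minimal sketch using Python's standard urllib.parse module; the helper name domain_for_genspider is hypothetical (it is not part of Scrapy), and quotes.toscrape.com is only an example site.

```python
from urllib.parse import urlparse


def domain_for_genspider(url: str) -> str:
    """Return only the host part of a URL, suitable for `scrapy genspider`.

    Hypothetical helper for illustration; it is not part of Scrapy.
    """
    # urlparse only fills in netloc when "//" is present, so add it
    # for bare inputs such as "quotes.toscrape.com/page/1/".
    if "//" not in url:
        url = "//" + url
    return urlparse(url).netloc


# Example URL is an assumption, used only for illustration.
print(domain_for_genspider("https://quotes.toscrape.com/page/1/"))
```

The printed value is what you would pass as the domain argument to scrapy genspider.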

Locate the generated spider file and start designing the bot.

How to test

The Scrapy shell lets us test commands and see the results instantly.

scrapy shell "URL"

Note: The URL should be inside quotes.

You can then check the fetched page by typing the command

view(response)

If it returns True, the page will be displayed in your default browser.

Data can be scraped using two techniques:

XPath (Recommended).
CSS selectors.

How to run

You can clone the repo, navigate to any of the project folders, and type,

cd projname
scrapy crawl spiderclassname

The scraped items will be printed in the console log. You can also save the output to a .json or .csv file by typing,

scrapy crawl spiderclassname -o result.json
scrapy crawl spiderclassname -o result.csv
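Once the output is saved, it can be loaded back for further processing. The following is a minimal sketch using only Python's standard library; it assumes the result.json file produced by the command above contains a JSON array of items, and the helper name load_items is hypothetical.

```python
import json
from pathlib import Path


def load_items(path):
    """Load the list of scraped items from a JSON output file.

    Hypothetical helper for illustration.
    """
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


# Only runs if a result.json file is present in the current folder.
feed = Path("result.json")
if feed.exists():
    items = load_items(feed)
    print(f"{len(items)} items scraped")
```

The .csv output can be read in a similar way with the standard csv module.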

If you are unable to run the Scrapy project, you can still view the output of the spider in the result.csv file located in the folder.

Happy Scraping :)

niteshkumar2000/ScrapeIt