linamy85/arxiv-crawler

Arxiv crawler

Usage

Crawl multiple types

  1. Create a file similar to arxiv.cat that lists every category you are targeting.
  2. Execute
./run.sh < your_arxiv.cat

Each category will be stored in <category>.json.
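The batch workflow above can be sketched in Python. This is a minimal sketch and not the contents of run.sh: it assumes the category file lists one category per line (the format of arxiv.cat is not shown here), and the helper names `read_categories` and `build_commands` are hypothetical. It only builds the per-category command from "Crawl single type"; how run.sh hands the category to the spider is not covered.

```python
import shlex
from typing import Iterable, List


def read_categories(lines: Iterable[str]) -> List[str]:
    """Return stripped, non-empty lines (assumption: one category per line)."""
    categories = []
    for line in lines:
        name = line.strip()
        if name:
            categories.append(name)
    return categories


def build_commands(categories: Iterable[str]) -> List[List[str]]:
    """Build one `scrapy crawl arxiv -o <category>.json` command per category."""
    return [["scrapy", "crawl", "arxiv", "-o", f"{cat}.json"] for cat in categories]


def format_commands(commands: Iterable[List[str]]) -> List[str]:
    """Render each command as a shell-safe string for printing or logging."""
    return [shlex.join(cmd) for cmd in commands]
```

Reading the category file with `open("your_arxiv.cat")` and passing the file object to `read_categories` would yield the list of targets; the commands are only built here, not executed.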

Crawl single type

scrapy crawl arxiv -o $type.json

When prompted, enter the category and the index range you want to crawl (default: 0 - 20000).
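To inspect a finished crawl, the output file can be loaded back in Python. This is a minimal sketch, assuming the crawl completed and wrote to a fresh .json file, in which case Scrapy's JSON feed export produces a single JSON array of items. The fields inside each item depend on the spider and are not assumed here; `load_items` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_items(path: str) -> List[Dict[str, Any]]:
    """Load the items written by `scrapy crawl arxiv -o <path>`."""
    with Path(path).open(encoding="utf-8") as fh:
        items = json.load(fh)
    # A fresh JSON feed export is one array; anything else is unexpected.
    if not isinstance(items, list):
        raise ValueError(f"{path}: expected a JSON array of items")
    return items
```

For example, `len(load_items("cs.CL.json"))` gives the number of crawled entries for that category. Note that `-o` appends to an existing file, so re-running a crawl into the same .json file can leave it unparseable; removing the old file first avoids this.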

Tool

  • scrapy (python3)

Good Reference

  1. scrapy Selector

About

Arxiv crawler (only abstract)
