DigitalPebble Ltd
Repositories
Showing 10 of 29 repositories
- carbonara Public archive
Enrichment pipeline for CUR / FOCUS reports that adds energy and carbon data, so you can report on and reduce the impact of your cloud usage.
- benchmark Public
StormCrawler topology to evaluate the performance of different backends and configurations
- crawlurlfrontier Public archive
Crawl config used to test URL Frontier on a large scale and produce WARCs for CommonCrawl.
- tika-detector-stormcrawler Public
Wraps the charset detection logic from StormCrawler as a Tika module
- tika Public (forked from apache/tika)
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).