Hi,
I’ve recently started playing around with this library and wanted to use a customizable pipeline to update several symbols. However, during the implementation, I ran into a few issues and unexpected behaviors with the following configuration:
log_level: TRACE
parallel_downloads: 4
# Storage backend
storage:
  strategy: hive
  path: ./marketdata/
  compression: zstd
  partition_granularity: year
  atomic_writes: true
  enable_locking: true
  metadata_tracking: true
# Data providers
providers:
  yahoo:
    type: yahoo
    enabled: true
    rate_limit:
      requests_per_second: 10.0
    retry_max_attempts: 3
    timeout: 30
# Datasets
datasets:
  taa:
    symbols: [SPY, IWM, QQQ, VGK, EWJ, EEM, VNQ, RWX, TLT, IEF, SHY, TIP, LQD, HYG, BWX, EMB, GSG, GLD, DBC, BIL, VEA, VWO]
    provider: yahoo
    frequency: daily
    asset_class: equity
    update_mode: backfill
    validation_enabled: true
    anomaly_detection: true
Initial Load Limit: The initial load seems to be strictly limited to a maximum of 365 days. I wasn't able to fetch a longer historical dataset out of the box.
Inconsistent Storage Format: The layout written by the storage layer appears to be inconsistent with the documentation. This is the directory structure I get for one symbol:
./yahoo_daily_BWX
./yahoo_daily_BWX/year=2020
./yahoo_daily_BWX/year=2020/month=7
./yahoo_daily_BWX/year=2020/month=9
./yahoo_daily_BWX/year=2020/month=8
./yahoo_daily_BWX/year=2020/month=1
./yahoo_daily_BWX/year=2020/month=6
./yahoo_daily_BWX/year=2020/month=11
./yahoo_daily_BWX/year=2020/month=10
./yahoo_daily_BWX/year=2020/month=3
./yahoo_daily_BWX/year=2020/month=4
./yahoo_daily_BWX/year=2020/month=5
./yahoo_daily_BWX/year=2020/month=2
./yahoo_daily_BWX/year=2020/month=12
./yahoo_daily_BWX/year=2018
./yahoo_daily_BWX/year=2018/month=7
./yahoo_daily_BWX/year=2018/month=9
./yahoo_daily_BWX/year=2018/month=8
./yahoo_daily_BWX/year=2018/month=1
./yahoo_daily_BWX/year=2018/month=6
./yahoo_daily_BWX/year=2018/month=11
./yahoo_daily_BWX/year=2018/month=10
./yahoo_daily_BWX/year=2018/month=3
./yahoo_daily_BWX/year=2018/month=4
./yahoo_daily_BWX/year=2018/month=5
./yahoo_daily_BWX/year=2018/month=2
./yahoo_daily_BWX/year=2018/month=12
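To make the layout check reproducible, here is a minimal Python sketch using only the standard library. It collects the hive-style partition keys (the `key` part of `key=value` directory names) found under a dataset directory. The function name `partition_keys` and the path `./marketdata/yahoo_daily_BWX` are my own illustrative assumptions; nothing here uses the library's API.

```python
from pathlib import Path


def partition_keys(dataset_dir):
    """Return the sorted partition keys (the `key` in `key=value`
    directory names) found anywhere under dataset_dir."""
    keys = set()
    for path in Path(dataset_dir).rglob("*"):
        # Hive-style partitions are directories named like `year=2020`.
        if path.is_dir() and "=" in path.name:
            keys.add(path.name.split("=", 1)[0])
    return sorted(keys)


if __name__ == "__main__":
    # Assumed location: the configured storage path plus the dataset directory.
    print(partition_keys("./marketdata/yahoo_daily_BWX"))
```

For the listing above this returns `['month', 'year']`. If `partition_granularity: year` is meant to produce year-only directories (my assumption), I would have expected `['year']`.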
Given these points, I wanted to ask about the current status and roadmap for this project:
- Is it considered stable enough for production/active use at this stage, or is it still in an experimental phase?
- How are bug fixes and feature requests currently being handled (e.g., should I submit PRs for these specific issues)?
Thanks for building this, and looking forward to your insights!