Rivooooo/bdc-2025

BDC 2025 Video Downloader

This project contains a Python script to download videos from Instagram and Google Drive links stored in CSV files.

Features

  • Downloads videos from Instagram reels and Google Drive links
  • Processes CSV files with video IDs and URLs
  • Saves downloaded videos to the data/raw folder with ID-based filenames
  • Logs progress and errors to the console and to download_log.txt, with progress bars
  • Waits between downloads to avoid rate limiting and continues past failed downloads

Installation

  1. Make sure you have Python 3.7+ installed.
  2. Install the required dependencies:
pip install -r requirements.txt

Usage

Running the Download Script

To download all videos from the CSV files:

python scripts/download_links.py

The script will:

  1. Process both datatrain.csv and datatest.csv files
  2. Download videos from Instagram and Google Drive links
  3. Save files as {id}.mp4 in the data/raw folder
  4. Log progress to both console and download_log.txt
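Step 4 can be sketched with the standard logging module. This is a minimal illustration, not the script's actual code: the function name setup_logger, the logger name "downloader", and the message format are all assumptions.

```python
import logging


def setup_logger(log_path="download_log.txt"):
    """Return a logger that writes to both the console and a log file.

    Hypothetical helper: the real script may configure logging differently.
    """
    logger = logging.getLogger("downloader")  # assumed logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        handlers = (
            logging.StreamHandler(),                         # console output
            logging.FileHandler(log_path, encoding="utf-8"),  # log file output
        )
        for handler in handlers:
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger
```

Calling setup_logger().info("Downloaded 1.mp4") would then write the same line to the terminal and to download_log.txt.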

CSV File Structure

The script expects CSV files with the following structure:

  • Column A (id): Video ID. Row 1 is the header; data starts at row 2
  • Column B (video): Video URL. Row 1 is the header; data starts at row 2

Example:

id,video
1,https://www.instagram.com/reel/ABC123/
2,https://drive.google.com/file/d/xyz789/view
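A file in this shape can be read as (id, URL) pairs. The sketch below uses the standard-library csv module instead of pandas so that it runs without extra packages; the function name read_links is an assumption and does not come from the script.

```python
import csv
import io


def read_links(csv_file):
    """Yield (video_id, url) pairs from an open CSV file with id,video columns."""
    reader = csv.DictReader(csv_file)  # the first row is treated as the header
    for row in reader:
        video_id = (row.get("id") or "").strip()
        url = (row.get("video") or "").strip()
        if video_id and url:  # skip rows with a missing ID or URL
            yield video_id, url


# The example rows from above, held in memory instead of a file on disk.
sample = io.StringIO(
    "id,video\n"
    "1,https://www.instagram.com/reel/ABC123/\n"
    "2,https://drive.google.com/file/d/xyz789/view\n"
)
links = list(read_links(sample))
```

IDs are kept as strings here, which is enough for building filenames.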

Output

  • Downloaded videos are saved in data/raw/ folder
  • Files are named as {id}.mp4 (e.g., 1.mp4, 2.mp4)
  • Progress and errors are logged to download_log.txt
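The naming rule can be expressed with pathlib. This is only a sketch: output_path is a hypothetical helper, and creating the folder on demand is an assumption about how the script behaves.

```python
from pathlib import Path


def output_path(video_id, out_dir="data/raw"):
    """Return the target path <out_dir>/<video_id>.mp4, creating the folder if needed."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)  # assumed: folder is created on demand
    return folder / f"{video_id}.mp4"
```

With the default folder, output_path(1) points at data/raw/1.mp4.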

Supported Platforms

  • Instagram: Reels and posts (using yt-dlp)
  • Google Drive: Direct file downloads
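One way to route each link to the right downloader is to inspect its hostname. The sketch below is an assumption about how such routing could look, not the script's own logic; classify_url and drive_file_id are hypothetical names, and only the /file/d/<id>/ link shape from the CSV example is handled.

```python
import re
from urllib.parse import urlparse


def classify_url(url):
    """Return 'instagram', 'google_drive', or 'unknown' based on the hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host == "instagram.com" or host.endswith(".instagram.com"):
        return "instagram"
    if host == "drive.google.com":
        return "google_drive"
    return "unknown"


def drive_file_id(url):
    """Extract the file ID from a /file/d/<id>/ style Drive link, or return None."""
    match = re.search(r"/file/d/([^/?#]+)", urlparse(url).path)
    return match.group(1) if match else None
```

Links classified as unknown could be logged and skipped rather than passed to either downloader.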

Error Handling

  • The script continues downloading even if some files fail
  • Failed downloads are logged with detailed error messages
  • Duplicate files are skipped automatically
  • A delay between downloads reduces the risk of being blocked
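The behaviour listed above can be sketched as a single loop. Everything here is illustrative: download_all and its download_one callback are hypothetical names, "duplicate" is assumed to mean that the target file already exists, and the 1-second default delay mirrors the value stated in the Notes.

```python
import time
from pathlib import Path


def download_all(links, download_one, out_dir="data/raw", delay=1.0):
    """Download every (video_id, url) pair, skipping existing files and surviving errors.

    download_one(url, target_path) is a caller-supplied function (an assumption);
    it should write the video to target_path or raise an exception on failure.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {"ok": [], "skipped": [], "failed": []}
    for video_id, url in links:
        target = out_dir / f"{video_id}.mp4"
        if target.exists():  # assumed duplicate check: file already on disk
            results["skipped"].append(video_id)
            continue
        try:
            download_one(url, target)
            results["ok"].append(video_id)
        except Exception as exc:  # record the failure and keep going
            results["failed"].append((video_id, str(exc)))
        time.sleep(delay)  # pause between attempts to avoid rate limiting
    return results
```

Returning the three lists makes it easy to log a summary at the end of a run and to retry only the failed IDs.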

Notes

  • Instagram downloads may require authentication for private content
  • Google Drive links must be publicly accessible
  • The script includes a 1-second delay between downloads to avoid rate limiting
  • Large files may take time to download depending on your internet connection

Troubleshooting

  1. Instagram download failures: Some Instagram content may be private or require login
  2. Google Drive access denied: Ensure the Google Drive links are publicly accessible
  3. Network errors: Check your internet connection and try again
  4. Permission errors: Ensure you have write permissions to the data/raw folder
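For item 4, a quick check before starting a long run can confirm that the output folder is writable. This is a minimal sketch; can_write is a hypothetical helper and not part of the script.

```python
import os
from pathlib import Path


def can_write(folder="data/raw"):
    """Return True if the folder exists (or can be created) and is writable."""
    path = Path(folder)
    try:
        path.mkdir(parents=True, exist_ok=True)
    except OSError:  # e.g. no permission to create the folder
        return False
    return os.access(path, os.W_OK)
```

If can_write() returns False, adjusting the folder permissions or choosing another output folder should resolve the error.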

Dependencies

  • pandas: CSV file processing
  • requests: HTTP requests for Google Drive downloads
  • yt-dlp: Instagram video downloads
  • tqdm: Progress bars
  • pathlib: File path handling (part of the Python standard library, so no separate install is needed)
