ShriyashP/Automated-Data-Pipeline-

Automated Data Pipeline: API to SQL Database

📄 Project Overview

This project is an automated data pipeline that fetches data from the Instagram API, cleans it, and loads it into an SQL database. Run on a schedule, the pipeline keeps the data close to real time, providing a solid foundation for monitoring social media metrics and analyzing trends.

🛠️ Features

- Automated Data Retrieval: Regularly pulls data from Instagram’s API.
- Data Cleaning: Formats and sanitizes data for consistency.
- SQL Integration: Inserts or updates data in an SQL database.
- Scheduling: Automates data pulls using scheduling tools such as cron or Task Scheduler.
- Real-Time Data: Provides updates as often as the schedule runs, keeping insights current.
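The cleaning and insert-or-update features listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the table name `media_metrics`, the fields `id`, `caption`, and `like_count`, and both helper functions are assumptions, and SQLite is used only because it ships with Python.

```python
import sqlite3


def clean_record(raw):
    """Normalize one raw API record (field names are hypothetical)."""
    return {
        "id": str(raw["id"]),
        "caption": (raw.get("caption") or "").strip(),
        "like_count": int(raw.get("like_count") or 0),
    }


def upsert_records(conn, records):
    """Insert new rows, or replace existing rows that share the same id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS media_metrics ("
        "id TEXT PRIMARY KEY, caption TEXT, like_count INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO media_metrics (id, caption, like_count) "
        "VALUES (:id, :caption, :like_count)",
        records,
    )
    conn.commit()


# Illustrative data only; real API responses will look different.
conn = sqlite3.connect(":memory:")
raw_batch = [
    {"id": 1, "caption": "  hello  ", "like_count": "10"},
    {"id": 1, "caption": "hello", "like_count": 12},  # same id: replaces the row
    {"id": 2, "caption": None},
]
upsert_records(conn, [clean_record(r) for r in raw_batch])
rows = conn.execute(
    "SELECT id, caption, like_count FROM media_metrics ORDER BY id"
).fetchall()
```

Keying the table on the record's id is what makes repeated runs safe: pulling the same item twice updates one row instead of creating a duplicate.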

🚀 Installation

Prerequisites

- Python 3.8+
- SQL Database (e.g., PostgreSQL, MySQL, SQLite)
- Instagram API Access: Ensure you have a valid API key.

Step-by-Step Setup

Clone the Repository:

    git clone https://github.com/yourusername/data-pipeline.git
    cd data-pipeline

Install Dependencies:

    pip install -r requirements.txt

Set Up Environment Variables: Create a .env file in the project directory with the following:

    API_KEY=your_instagram_api_key
    DB_CONNECTION_STRING=your_database_connection_string

📝 Usage

Run the Pipeline Script:

    python pipeline.py

Scheduling:

- Linux: Use cron jobs to run the script at specified intervals.
- Windows: Use Task Scheduler to automate the execution.

Querying Data: Access the data directly from your SQL database using SQL queries or your preferred database client.
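As one way to query the loaded data from Python, the sketch below runs a parameterized query against an in-memory SQLite database. The table `media_metrics`, its columns, and the helper `top_posts` are hypothetical stand-ins; the project's real schema is not documented here.

```python
import sqlite3

# Build a small throwaway table so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.execute(
    "CREATE TABLE media_metrics "
    "(id TEXT PRIMARY KEY, caption TEXT, like_count INTEGER)"
)
conn.executemany(
    "INSERT INTO media_metrics VALUES (?, ?, ?)",
    [("a", "first", 5), ("b", "second", 20), ("c", "third", 11)],
)


def top_posts(conn, limit):
    """Return (id, like_count) pairs for the most-liked rows."""
    cursor = conn.execute(
        "SELECT id, like_count FROM media_metrics "
        "ORDER BY like_count DESC LIMIT ?",
        (limit,),  # bound parameter rather than string formatting
    )
    return [(row["id"], row["like_count"]) for row in cursor]


top_two = top_posts(conn, 2)
```

Passing `limit` as a bound parameter keeps the query safe if the value ever comes from user input.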

⚙️ Configuration

Database Connection

Set DB_CONNECTION_STRING in the .env file to match your SQL database.
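A minimal sketch of how a script might read and validate these settings, assuming the two variable names from the .env example in this README. The helpers `parse_env_text` and `load_settings` are hypothetical, and the parser handles only plain KEY=VALUE lines; how pipeline.py actually loads its configuration is not documented here.

```python
import os

REQUIRED_KEYS = ("API_KEY", "DB_CONNECTION_STRING")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        values[key.strip()] = value.strip()
    return values


def load_settings(env=None):
    """Return the required settings, failing fast if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Placeholder values for illustration only.
sample = (
    "# local settings\n"
    "API_KEY=abc123\n"
    "DB_CONNECTION_STRING=sqlite:///pipeline.db\n"
)
settings = load_settings(parse_env_text(sample))
```

Failing at startup when a setting is missing gives a clearer error than a failed API call or database connection later in the run.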

Scheduling Configuration

Set up the scheduling intervals according to your data freshness requirements.
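This README relies on cron or Task Scheduler for scheduling. For quick local testing, a fixed-interval loop inside Python is one alternative; the sketch below shows the idea. The function `run_every` and its parameters are hypothetical, and the `sleep` argument is injectable so the example runs without actually waiting.

```python
import time


def run_every(job, interval_seconds, max_runs, sleep=time.sleep):
    """Call job() max_runs times, pausing interval_seconds between calls."""
    results = []
    for run_index in range(max_runs):
        results.append(job())
        if run_index < max_runs - 1:  # no pause after the final run
            sleep(interval_seconds)
    return results


# Record the requested pauses instead of sleeping, for demonstration.
pauses = []
outputs = run_every(
    lambda: "ok", interval_seconds=900, max_runs=3, sleep=pauses.append
)
```

An interval of 900 seconds corresponds to a pull every 15 minutes; shorter intervals give fresher data at the cost of more API requests.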

🚧 Future Enhancements

- Data Visualization: Integrate tools like Tableau or Power BI for direct data visualization.
- Error Handling: Add more robust error handling and logging for API requests.
- Data Transformation: Expand data cleaning and transformation steps for advanced analytics.
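The error-handling item above could take the form of a retry wrapper with logging and exponential backoff. The following is a sketch under assumptions, not a planned design: `fetch_with_retry`, its parameters, and `flaky_fetch` (a stand-in for a real API call) are all hypothetical.

```python
import logging
import time

logger = logging.getLogger("pipeline")


def fetch_with_retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying on failure with exponentially growing delays."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # a real pipeline would catch narrower errors
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_fetch():
    """Stand-in for an API request that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"status": "ok"}


delays = []
result = fetch_with_retry(flaky_fetch, sleep=delays.append)
```

Doubling the delay after each failure gives a struggling API time to recover, while the logged warnings leave a record of each failed attempt.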

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.
