Software Requirement Specification
- Introduction
- General Description
- Block Diagram
- Software Stack
- Specific System Functions and Requirements
- Functional Requirements
- Interface Requirements
- Assumptions and Dependencies
- Constraints
- User Description
The purpose of the YouTube Video Analyzer project is to create a dual-platform tool, available both as a browser extension and a website, designed to enrich the video viewing and audio listening experience. The application aims to provide comprehensive transcripts, concise summaries, AI chatbot assistance, and sentiment analysis of YouTube video comments.
At present, navigating YouTube content can be challenging because many lengthy videos contain click-bait or filler material. Our project aims to improve a user's viewing experience by predicting video quality from sentiment analysis of comments, extracting useful information from highly informative videos, and letting the user converse with a chatbot about the video.
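The idea of judging a video from the sentiment of its comments can be illustrated with a toy sketch. It uses a tiny hand-written word list as a stand-in for the ML sentiment model the project intends to use. The lexicon, the names `scoreComment` and `aggregateSentiment`, and the averaging rule are all assumptions made for illustration, not part of the specification.

```typescript
// Hypothetical lexicon standing in for a trained sentiment model.
const POSITIVE_WORDS = new Set(["great", "helpful", "clear", "excellent", "love"]);
const NEGATIVE_WORDS = new Set(["clickbait", "boring", "waste", "misleading", "bad"]);

// Score one comment: +1 per positive word, -1 per negative word.
function scoreComment(comment: string): number {
  const words = comment.toLowerCase().match(/[a-z]+/g) ?? [];
  let score = 0;
  for (const word of words) {
    if (POSITIVE_WORDS.has(word)) score += 1;
    if (NEGATIVE_WORDS.has(word)) score -= 1;
  }
  return score;
}

type Verdict = "positive" | "negative" | "neutral";

// Average the per-comment scores and map the mean to a coarse verdict.
function aggregateSentiment(comments: string[]): { mean: number; verdict: Verdict } {
  if (comments.length === 0) return { mean: 0, verdict: "neutral" };
  const total = comments.reduce((sum, c) => sum + scoreComment(c), 0);
  const mean = total / comments.length;
  const verdict: Verdict = mean > 0 ? "positive" : mean < 0 ? "negative" : "neutral";
  return { mean, verdict };
}
```

A real model would handle negation, sarcasm, and context that a word list cannot, so this only shows where the aggregation step would sit.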
This document is intended for various stakeholders involved in the development, testing, maintenance, and usage of the YouTube Video Analyzer software.
From a product perspective, the YouTube Video Analyzer is a tool that condenses lengthy video content into concise, digestible summaries, lets users store them as notes, and answers questions through AI assistance. It offers users a time-efficient way to extract key information and insights without watching the entire video.
The primary functions of the application include:
- Transcript Generation
- Video Summarizer
- Note-Maker
- Sentiment Analyzer
- Ask AI
- Live Meet Summary and Transcript
The operating environment for the application includes various libraries, frameworks, dependencies, and hardware platforms, as well as third-party integrations. It is compatible with the Chrome web browser and various operating systems.
We will be using MongoDB as a NoSQL database. A cloud-based database also facilitates rapid development, as well-established APIs are available for CRUD operations. Frequently used and user-saved summaries will be stored in the database for faster access.
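Storing summaries for faster access can be sketched as a cache-aside lookup. In this minimal sketch an in-memory `Map` stands in for the MongoDB collection; the `SummaryStore` interface, `InMemorySummaryStore`, `getSummary`, and the `generate` callback are hypothetical names introduced only for illustration.

```typescript
// Minimal storage contract. A MongoDB-backed implementation could satisfy the
// same shape; here an in-memory Map stands in for the database.
interface SummaryStore {
  find(videoId: string): Promise<string | undefined>;
  save(videoId: string, summary: string): Promise<void>;
}

class InMemorySummaryStore implements SummaryStore {
  private readonly summaries = new Map<string, string>();

  async find(videoId: string): Promise<string | undefined> {
    return this.summaries.get(videoId);
  }

  async save(videoId: string, summary: string): Promise<void> {
    this.summaries.set(videoId, summary);
  }
}

// Cache-aside lookup: return the stored summary if present; otherwise
// generate one (the expensive step), store it, and return it.
async function getSummary(
  store: SummaryStore,
  videoId: string,
  generate: (videoId: string) => Promise<string>
): Promise<{ summary: string; cached: boolean }> {
  const existing = await store.find(videoId);
  if (existing !== undefined) return { summary: existing, cached: true };
  const summary = await generate(videoId);
  await store.save(videoId, summary);
  return { summary, cached: false };
}
```

With this shape, the second request for the same video skips the summarization step entirely, which is the "faster access" the paragraph describes.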
We will be using ReactJS along with Tailwind and the Framer Motion library to craft a sleek and dynamic front-end experience for our web application. The extension will be built using JavaScript. React's component-based architecture allows us to develop reusable code that can be used throughout the application.
We will be using NodeJS along with ExpressJS to develop the back-end application. Express.js provides a set of features that make it easy to build web applications using Node.js. It includes a routing system for handling HTTP requests, middleware for handling requests and responses, and templating engines for rendering HTML views. NodeJS also allows us to integrate machine learning models for speech-to-text and text-to-speech conversion and for sentiment analysis.
- Web Interface
- Extension
- Client-Server System
- Database
- Availability
- Performance
- Security
- Scalability
- Usability
- Maintainability
- Reliability
- Robustness
The database will store sensitive information in encrypted form and will also hold user-specific video summaries for a better user experience.
The functional requirements encompass a range of capabilities related to audio and text processing, AI interaction, note-taking, and sentiment analysis. The application summarizes a video either from its transcript or by processing its audio with machine learning models and speech-to-text and text-to-speech libraries.
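The choice between summarizing from a transcript and summarizing from audio can be sketched as a small pipeline. The `VideoInput` and `SummarizerSteps` types and the `summarizeVideo` function are hypothetical; the `transcribe` and `summarize` steps are injected placeholders for whatever ML models and speech-to-text libraries the project ends up using, and the sketch is kept synchronous for brevity.

```typescript
// Input to the pipeline: a transcript if one exists, otherwise raw audio.
interface VideoInput {
  transcript?: string;
  audio?: Uint8Array;
}

// Injected steps; both are placeholders for real models and libraries.
interface SummarizerSteps {
  transcribe(audio: Uint8Array): string;
  summarize(text: string): string;
}

interface SummaryResult {
  summary: string;
  source: "transcript" | "audio";
}

// Prefer the transcript; fall back to transcribing the audio first.
function summarizeVideo(input: VideoInput, steps: SummarizerSteps): SummaryResult {
  if (input.transcript && input.transcript.trim().length > 0) {
    return { summary: steps.summarize(input.transcript), source: "transcript" };
  }
  if (input.audio && input.audio.length > 0) {
    const text = steps.transcribe(input.audio);
    return { summary: steps.summarize(text), source: "audio" };
  }
  throw new Error("No transcript or audio available to summarize");
}
```

Preferring the transcript avoids the slower and more error-prone speech-to-text step whenever a transcript is already available.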
Users will interact with the application through the web application and the Chrome extension. The interface is primarily meant to give users a glimpse of a YouTube video, MP3 file, MP4 file, or meeting without watching, listening to, or attending it. It will also allow users to make notes for each video and ask the AI questions. It will be designed to be intuitive, easy to use, responsive, usable on all devices, and compatible with Chrome.
There are four primary components in the application: the web application, an Express server, a non-relational database, and the Chrome extension.
The communication interfaces are primarily between the React SPA and the Express server, the Express server and MongoDB, and the Express server and the ML model.
- Users have a beginner-level grasp of English.
- Users understand how to add a browser extension to their browser.
- Users have a reasonably fast internet connection.
- Express server
- TensorFlow/TorchJS for ML models
The major limitations include:
- The application will only be able to analyze YouTube videos.
- The application has to constantly communicate with the API to exchange data.
- The accuracy and coherence of the summarizer and chatbot depend on the ML models used.
- The browser extension is only supported on Google Chrome.
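The YouTube-only constraint implies that the application needs to recognize YouTube links and reject everything else. A minimal sketch of such a check is shown here; the function name `extractVideoId` is hypothetical, and both the set of URL shapes and the 11-character ID length are assumptions based on commonly seen YouTube links, not an exhaustive or official list.

```typescript
// Matches common watch, short-link, embed, and Shorts URL shapes.
// The 11-character ID length is an assumption based on observed YouTube IDs.
const YOUTUBE_ID_PATTERN =
  /^(?:https?:\/\/)?(?:www\.|m\.)?(?:youtube\.com\/(?:watch\?(?:.*&)?v=|embed\/|shorts\/)|youtu\.be\/)([A-Za-z0-9_-]{11})(?:[?&#\/].*)?$/;

// Returns the video ID for a recognized YouTube URL, or null otherwise.
function extractVideoId(url: string): string | null {
  const match = YOUTUBE_ID_PATTERN.exec(url.trim());
  return match?.[1] ?? null;
}
```

A `null` result gives the web application and the extension a single place to reject unsupported links before any request reaches the server.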
Given below is a general description of the various users who would be involved with the application.
- Developers, Maintainers
- YouTube User
Documentation of the entire project would have to be provided to users in the Developers, Maintainers class so that they can actively maintain and supervise the application. Users in the YouTube User class would only need documentation on how to use the web application and the Chrome extension to access all the features of our application.
- General maintenance, security patches on dependency updates
- Viewing summaries of videos, MP3 files, and MP4 files; making notes; asking questions about the video through Ask AI; viewing real-time transcripts and summaries of live meetings