Architecture Design Document

Dhananjay-Goel edited this page May 8, 2024 · 2 revisions

TABLE OF CONTENTS

  1. INTRODUCTION
    1. Purpose
    2. Scope
    3. Overview
  2. SYSTEM OVERVIEW
    1. Product Overview
    2. Functionality
    3. Context of the project
    4. Design
  3. SYSTEM ARCHITECTURE
    1. Architectural Design
      1. Subsystems
      2. Modules
    2. Decomposition Description
      1. Three-Tier Architecture
    3. Design Rationale
  4. DATA DESIGN
    1. Data Description
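    2. Data Dictionary
  5. HUMAN INTERFACE DESIGN
    1. Overview of User Interface
    2. Screen Images
    3. Screen Objects and Actions
  6. REQUIREMENTS MATRIX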

INTRODUCTION

Purpose

The goal of the YouTube Video Analyzer project is to develop a tool that enhances the YouTube viewing experience. The tool will be available as both a browser extension and a website. It aims to offer detailed transcripts, brief summaries, AI chatbot assistance, and sentiment analysis of YouTube video comments. It also accepts MP4 and MP3 files, as well as audio from live meetings, as input. Through these analytics, users can gain insights into video performance, viewer engagement, and content sentiment, among other metrics.

Scope

Navigating YouTube can be difficult because many long videos contain click-bait or filler content. YouTube lacks a feature that summarizes videos so that users can quickly understand a video's content before watching it. The same problem applies to audio files, video files, and live meetings. Our project aims to enhance the user's video experience in three ways: predicting video quality from comment sentiment analysis and other analytical tools such as most-used words, extracting useful information from informative videos, and letting users engage with a chatbot about the video.

Overview

The project aims to enhance the user experience on YouTube by offering a range of functionalities across multiple platforms, including a website and a Chrome extension. These functionalities include generating transcripts, summarizing videos, making notes, analyzing sentiment, and asking an AI questions about the video. The application provides users with a preview of a video's content, allowing them to grasp its essence without watching the entire video.

SYSTEM OVERVIEW

Product Overview

The application aims to improve YouTube’s user experience through a multi-platform product, a website as well as a Chrome extension, that offers advanced tools for both content creators and viewers. The application allows users to get a glimpse of a video without watching it. By leveraging in-depth analytics, users can gain insights into video performance, viewer engagement, and content sentiment, among other metrics. It also extends its features to video and audio files imported into the application and provides in-depth analysis of them.

Functionality

The application stands out by offering:

  • Comprehensive Transcript Generation: Automatically generates detailed transcripts of video content.
  • Video Summarizer: Provides concise summaries of lengthy videos, using natural language processing (NLP) techniques.
  • Sentiment Analysis: Utilizes machine learning models to gauge viewer sentiment from YouTube video comments, providing creators and new viewers feedback on viewer reception.
  • Interactive Analytics Dashboard: A user-friendly interface displays a range of analytics, such as engagement metrics, sentiment trends, and content summaries.
  • Note-Maker: Enables users to create and save notes directly within the YouTube interface.
  • Ask AI: Lets users ask questions about the content of the video.
  • API Integration: Seamlessly connects with YouTube’s API to fetch video data, ensuring up-to-date analytics and insights.
  • Live Meet Summary and Transcript: Captures meeting audio in 25-second intervals and presents the transcript and summary while the meeting is in progress.
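
The 25-second capture interval of the Live Meet feature can be illustrated with a minimal sketch. It only shows how a recording of a given length would divide into consecutive windows; the names `CaptureWindow` and `captureWindows` are hypothetical, and the actual recording and transcription calls are not shown.

```typescript
// Length of each capture window in seconds, as stated in the feature list.
const CAPTURE_INTERVAL_SECONDS = 25;

// Hypothetical shape describing one slice of a live recording.
interface CaptureWindow {
  index: number;        // 0-based position of the window
  startSeconds: number; // inclusive start offset within the recording
  endSeconds: number;   // exclusive end offset within the recording
}

// Splits a recording of the given length into consecutive windows.
// The final window may be shorter than the interval.
function captureWindows(
  totalSeconds: number,
  intervalSeconds: number = CAPTURE_INTERVAL_SECONDS,
): CaptureWindow[] {
  if (totalSeconds < 0 || intervalSeconds <= 0) {
    throw new RangeError("totalSeconds must be >= 0 and intervalSeconds must be > 0");
  }
  const windows: CaptureWindow[] = [];
  for (let start = 0, index = 0; start < totalSeconds; start += intervalSeconds, index++) {
    windows.push({
      index,
      startSeconds: start,
      endSeconds: Math.min(start + intervalSeconds, totalSeconds),
    });
  }
  return windows;
}
```

For a 60-second recording this yields three windows: 0 to 25, 25 to 50, and 50 to 60 seconds. Each window would be transcribed and folded into the running summary as it completes.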

Context of the project

YouTube has become one of the primary platforms for accessing video content on the internet, hosting a vast array of videos ranging from educational tutorials to entertainment and news. This volume and variety make it challenging for users to efficiently navigate and understand the content they encounter.

Challenges Faced by Users:

  • Lengthy Videos: Many videos are lengthy, spanning hours of content. Users often find it challenging to invest the time required to watch these videos in their entirety, especially if they are unsure of the relevance or quality of the content.
  • Lack of Summarization: YouTube lacks built-in features for summarizing video content, leaving users to rely on manual scanning or skipping through videos to extract key information.
  • Limited Analytics for Viewers: Viewers cannot directly gauge either the content of a video or the reactions and feedback it has received from previous viewers.
  • No Direct Way to Store Video Notes: Users often have to keep notes about videos in a separate place, where they are easily lost. There is no tool to display them on YouTube.
  • Lack of AI assistance: YouTube lacks built-in features to ask questions about the video content.

Condense operates as a dual-platform tool, accessible both as a browser extension and a website. It integrates seamlessly with YouTube, offering users a convenient way to access and interact with video content directly from the platform. The system utilizes various technologies and frameworks to ensure compatibility and functionality across different platforms and browsers.

The project aims to address these challenges by providing a comprehensive toolset that enables users to:

  • Quickly Assess Video Relevance: By generating transcripts and summaries, users can gain insights into video content without having to watch the entire video, allowing them to determine relevance and quality efficiently.
  • Facilitate Active Engagement: Features such as note-making and AI assistance encourage active engagement with video content, promoting deeper understanding and interaction.
  • Provide Audience Insights: Sentiment analysis of video comments offers users insights into audience reactions and feedback, helping them gauge community sentiment and engagement.

Design

  1. Architecture:

The architecture of the YouTube Video Analyzer is designed to be modular and scalable, comprising frontend, backend, and database components. The system operates as a client-server application, with the frontend serving as the user interface and the backend handling data processing and business logic.

  • Frontend: The frontend is implemented using ReactJS, a JavaScript library for building user interfaces. It provides an intuitive and responsive interface for users to interact with the system's features. The frontend communicates with the backend via RESTful APIs to fetch data and perform actions such as generating transcripts, summarizing videos, and analyzing sentiments.
  • Backend: The backend is built using Node.js with Express.js, a minimalist web framework for Node.js. Express.js facilitates the creation of RESTful APIs for handling HTTP requests and responses. The backend is responsible for processing user requests, executing business logic, and interfacing with external services such as YouTube's API for fetching video data.
  • Database: MongoDB is used as the database management system (DBMS) for storing application data. MongoDB is a NoSQL database that offers flexibility and scalability, making it well-suited for handling unstructured data such as video transcripts, comments, and user notes.
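
The request/response contract between the frontend and the backend can be sketched as a small route table. This is an illustration only: the paths (`/api/transcripts`, `/api/summaries`), the `videoUrl` field, and every type and function name are assumptions rather than the project's actual API, and in the real backend Express.js would perform the routing.

```typescript
type HttpMethod = "GET" | "POST";

// Hypothetical, framework-free request and response shapes.
interface ApiRequest {
  method: HttpMethod;
  path: string;
  body?: Record<string, unknown>;
}

interface ApiResponse {
  statusCode: number;
  payload: Record<string, unknown>;
}

type RouteHandler = (req: ApiRequest) => ApiResponse;

// Builds a handler that validates the request body and acknowledges the action.
function requireVideoUrl(action: string): RouteHandler {
  return (req) => {
    const videoUrl = req.body?.videoUrl;
    if (typeof videoUrl !== "string" || videoUrl.length === 0) {
      return { statusCode: 400, payload: { error: "videoUrl is required" } };
    }
    return { statusCode: 202, payload: { action, videoUrl } };
  };
}

// Assumed endpoints, keyed by "<METHOD> <path>".
const routeTable = new Map<string, RouteHandler>([
  ["POST /api/transcripts", requireVideoUrl("generate-transcript")],
  ["POST /api/summaries", requireVideoUrl("summarize-video")],
]);

// Looks up the handler for a request and falls back to 404 for unknown routes.
function dispatch(req: ApiRequest): ApiResponse {
  const handler = routeTable.get(`${req.method} ${req.path}`);
  if (!handler) {
    return { statusCode: 404, payload: { error: "Not found" } };
  }
  return handler(req);
}
```

Keeping validation and error responses in the backend handlers means the website and the Chrome extension can share the same endpoints without duplicating that logic.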

SYSTEM ARCHITECTURE

Architectural Design

The architecture of the YouTube Video Analyzer follows a three-tier model, comprising presentation, application, and data tiers. Each tier serves a specific function within the system and is designed to be modular and scalable.

Subsystems

  1. Frontend Subsystem:

    • Components: React components for UI elements.
    • Functionality: User interaction, data presentation.
    • Technology: ReactJS, HTML, CSS, JavaScript.
  2. Backend Subsystem:

    • Components: Node.js modules for request handling, business logic.
    • Functionality: Data processing, business logic execution.
    • Technology: Node.js, Express.js, JavaScript.
  3. Database Subsystem:

    • Components: MongoDB collections for data storage.
    • Functionality: Data storage, retrieval, manipulation.
    • Technology: MongoDB, Mongoose (Node.js ODM for MongoDB).

Modules

  1. User Interface (UI):

    • Description: The UI module handles user interaction and presentation of data. It comprises React components for rendering UI elements such as buttons, input fields, and data visualizations.
    • Responsibilities: User input validation, data presentation, event handling.
    • Technology: ReactJS, HTML, CSS, JavaScript.
  2. API Service:

    • Description: The API service module provides endpoints for handling HTTP requests from the frontend. It receives requests, executes business logic, and returns responses containing processed data or error messages.
    • Responsibilities: Request handling, business logic execution, response generation.
    • Technology: Node.js, Express.js, JavaScript.
  3. Data Access Layer (DAL):

    • Description: The DAL module facilitates interaction with the database, including data storage, retrieval, and manipulation operations. It provides an abstraction layer for performing CRUD (Create, Read, Update, Delete) operations on application data.
    • Responsibilities: Data storage, retrieval, manipulation.
    • Technology: MongoDB, Mongoose (Node.js ODM for MongoDB), JavaScript.
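
The CRUD abstraction provided by the Data Access Layer can be sketched with an in-memory stand-in. This is not the project's Mongoose code: `InMemoryNoteStore`, the generated `id` field, and the field names are assumptions chosen to mirror the User Note attributes, and a real implementation would delegate each method to MongoDB.

```typescript
// Hypothetical note record; "id" is an assumed surrogate key.
interface UserNote {
  id: string;
  videoId: string;
  noteText: string;
  timestampSeconds: number;
}

// In-memory stand-in for the DAL, exposing the four CRUD operations.
class InMemoryNoteStore {
  private notes = new Map<string, UserNote>();
  private nextId = 1;

  // Create: assigns an id and stores the note.
  create(input: Omit<UserNote, "id">): UserNote {
    const note: UserNote = { id: String(this.nextId++), ...input };
    this.notes.set(note.id, note);
    return note;
  }

  // Read: returns the note, or undefined when it does not exist.
  read(id: string): UserNote | undefined {
    return this.notes.get(id);
  }

  // Update: merges the changed fields into the stored note.
  update(id: string, changes: Partial<Omit<UserNote, "id">>): UserNote | undefined {
    const existing = this.notes.get(id);
    if (!existing) {
      return undefined;
    }
    const updated: UserNote = { ...existing, ...changes };
    this.notes.set(id, updated);
    return updated;
  }

  // Delete: reports whether a note was actually removed.
  delete(id: string): boolean {
    return this.notes.delete(id);
  }
}
```

Because callers only see these four methods, the API Service module does not need to know whether the notes live in memory or in a MongoDB collection.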

Decomposition Description

Three-Tier Architecture

  1. Presentation Tier:

    • Description: The presentation tier, also known as the frontend, is responsible for handling user interaction and presenting data to the user. It comprises React components for rendering UI elements and communicating with the backend via RESTful APIs.
    • Components: React components for UI elements.
    • Technology: ReactJS, HTML, CSS, JavaScript.
  2. Application Tier:

    • Description: The application tier, also known as the backend, is responsible for processing user requests, executing business logic, and interfacing with external services such as YouTube's API. It provides RESTful APIs for handling HTTP requests from the frontend.
    • Components: Node.js modules for request handling, business logic.
    • Technology: Node.js, Express.js, JavaScript.
  3. Data Tier:

    • Description: The data tier, also known as the database, is responsible for storing application data such as video transcripts, comments, and user notes. It uses MongoDB as the database management system (DBMS) for storing and retrieving data.
    • Components: MongoDB collections for data storage.
    • Technology: MongoDB, Mongoose (Node.js ODM for MongoDB).

Design Rationale

The three-tier architecture was chosen for its modularity, scalability, and maintainability. Separating the system into presentation, application, and data tiers allows for independent development and deployment of each tier, enabling easier maintenance and updates. Additionally, the use of RESTful APIs for communication between the frontend and backend promotes interoperability and flexibility, allowing for future integration with other systems and services.
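
The independence of the tiers can be illustrated with a short sketch in which the application tier depends only on a data-tier interface. All names here (`TranscriptStore`, `Transcriber`, `getTranscript`, `createMemoryStore`) are hypothetical, and the calls are synchronous for brevity, whereas real database and transcription calls would be asynchronous.

```typescript
interface TranscriptRecord {
  videoId: string;
  transcriptText: string;
}

// Data-tier contract: the application tier sees only this interface.
interface TranscriptStore {
  find(videoId: string): TranscriptRecord | undefined;
  save(record: TranscriptRecord): void;
}

// Stand-in for whatever service produces a transcript for a video.
type Transcriber = (videoId: string) => string;

// Application-tier logic: reuse a stored transcript, otherwise generate and store one.
function getTranscript(
  videoId: string,
  store: TranscriptStore,
  transcribe: Transcriber,
): TranscriptRecord {
  const cached = store.find(videoId);
  if (cached) {
    return cached;
  }
  const record: TranscriptRecord = { videoId, transcriptText: transcribe(videoId) };
  store.save(record);
  return record;
}

// One possible data-tier implementation; a MongoDB-backed one could replace it
// without any change to getTranscript.
function createMemoryStore(): TranscriptStore {
  const records = new Map<string, TranscriptRecord>();
  return {
    find: (videoId) => records.get(videoId),
    save: (record) => {
      records.set(record.videoId, record);
    },
  };
}
```

Requesting the same video twice calls the transcriber only once, since the second request is served from the store.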

DATA DESIGN

Data Description

The YouTube Video Analyzer application generates and stores various types of data, including video transcripts, comments, user notes, and analytics metrics. These data are stored in a MongoDB database, which offers flexibility and scalability for handling unstructured data.

Data Storage Items

  1. Video Transcripts:

    • Description: Textual representations of video content, generated through automated transcription processes.
    • Attributes: Video ID, Transcript text, Timestamps.
  2. Comments:

    • Description: User comments associated with videos, fetched from YouTube's API.
    • Attributes: Video ID, Comment text, User ID, Timestamp, Sentiment score.
  3. User Notes:

    • Description: User-generated notes associated with videos, stored for personal reference.
    • Attributes: Video ID, Note text, Timestamp.
  4. Analytics Metrics:

    • Description: Various analytics metrics such as engagement, sentiment trends, and content summaries.
    • Attributes: Video ID, Metric type, Metric value.

Data Dictionary

System Entities and Major Data

  1. Video:

    • Attributes: ID, Title, Description, Duration, Views, Likes, Dislikes, Comments.
  2. Transcript:

    • Attributes: Video ID, Transcript Text, Timestamps.
  3. Comment:

    • Attributes: Video ID, Comment Text, User ID, Timestamp, Sentiment Score.
  4. User Note:

    • Attributes: Video ID, Note Text, Timestamp.
  5. Analytics Metric:

    • Attributes: Video ID, Metric Type, Metric Value.
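
The entities above could be expressed as type definitions along the following lines. The attribute lists come from the data dictionary, but the field names, the types, and the units are assumptions (for example, seconds for durations and note timestamps, and a count for the Comments attribute); the actual Mongoose schemas may differ.

```typescript
interface Video {
  id: string;
  title: string;
  description: string;
  durationSeconds: number; // assumed unit
  views: number;
  likes: number;
  dislikes: number;
  commentCount: number;    // assumes "Comments" denotes a count
}

interface Transcript {
  videoId: string;
  transcriptText: string;
  timestamps: number[];    // assumed: offsets in seconds
}

interface VideoComment {
  videoId: string;
  commentText: string;
  userId: string;
  timestamp: string;       // assumed: ISO 8601 date-time string
  sentimentScore: number;
}

interface UserNote {
  videoId: string;
  noteText: string;
  timestamp: number;       // assumed: offset in seconds into the video
}

interface AnalyticsMetric {
  videoId: string;
  metricType: string;
  metricValue: number;
}

// Every non-Video entity carries a Video ID, so one helper can select
// the records that belong to a given video.
function forVideo<T extends { videoId: string }>(records: T[], videoId: string): T[] {
  return records.filter((record) => record.videoId === videoId);
}
```

The shared `videoId` field is what ties transcripts, comments, notes, and metrics back to a single Video entity.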

HUMAN INTERFACE DESIGN

The YouTube Video Analyzer provides a user-friendly interface that empowers users to efficiently navigate and interact with the system's features. From the user's perspective, the system offers a range of functionalities aimed at enhancing the YouTube viewing experience. Below is an overview of how users complete the expected tasks and what feedback the system provides:

Accessing the System:

  • Users can access the YouTube Video Analyzer through two primary channels: the web application and the browser extension.
  • For the web application, users simply need to navigate to the designated website and log in with their credentials.
  • For the browser extension, users can install it from the Chrome Web Store and enable it in their browser.

Generating Transcripts and Summaries:

  • After entering the video URL or attaching an MP3/MP4 file, users can initiate the process of generating transcripts and summaries by clicking on the respective buttons.
  • The system will then analyze the video content and provide a text transcript as well as a concise summary, which users can review and utilize for quick understanding of the video's content.
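
Before a transcript can be requested, the entered URL has to be resolved to a video ID. A minimal sketch of that step follows, assuming the two common URL shapes (`youtube.com/watch?v=<id>` and `youtu.be/<id>`) and the conventional 11-character ID; `extractVideoId` is a hypothetical helper, not the project's actual parser.

```typescript
// Patterns for the two URL shapes handled here (an assumption, not an exhaustive list).
const VIDEO_ID_PATTERNS: RegExp[] = [
  /[?&]v=([A-Za-z0-9_-]{11})(?:[&#]|$)/,        // https://www.youtube.com/watch?v=<id>
  /youtu\.be\/([A-Za-z0-9_-]{11})(?:[?&#/]|$)/, // https://youtu.be/<id>
];

// Only URLs on these hosts are accepted.
const YOUTUBE_HOST_PATTERN = /^https?:\/\/(?:www\.|m\.)?(?:youtube\.com|youtu\.be)\//;

// Returns the video ID, or null when the input is not a recognized YouTube URL.
function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (!YOUTUBE_HOST_PATTERN.test(trimmed)) {
    return null;
  }
  for (const pattern of VIDEO_ID_PATTERNS) {
    const id = pattern.exec(trimmed)?.[1];
    if (id) {
      return id;
    }
  }
  return null;
}
```

Returning `null` for unrecognized input lets the interface show an error message instead of sending an invalid request to the backend.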

Making Notes:

  • Users have the option to make notes about the video directly within the summary page.
  • They can add notes, comments, or annotations to specific timestamps or sections of the video, providing a convenient way to capture important information or insights.
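
Attaching notes to timestamps implies two small operations: ordering notes by their position in the video and rendering each offset in a readable form. The sketch below shows both, using hypothetical names (`TimestampedNote`, `formatTimestamp`, `notesInPlaybackOrder`) and assuming timestamps are stored as seconds.

```typescript
// Hypothetical note shape; the timestamp is assumed to be an offset in seconds.
interface TimestampedNote {
  noteText: string;
  timestampSeconds: number;
}

// Formats an offset as m:ss, or h:mm:ss for videos longer than an hour.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (value: number): string => (value < 10 ? `0${value}` : String(value));
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${minutes}:${pad(seconds)}`;
}

// Returns a new array sorted by position in the video; the input is left untouched.
function notesInPlaybackOrder(notes: TimestampedNote[]): TimestampedNote[] {
  return notes.slice().sort((a, b) => a.timestampSeconds - b.timestampSeconds);
}
```

A note saved at 75 seconds would be shown as `1:15`, and one saved at 3725 seconds as `1:02:05`.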

Analyzing Sentiment:

  • The system performs sentiment analysis on the comments and interactions associated with the video.
  • Users can view the sentiment analysis results, which provide insights into the overall sentiment of the video's audience, including positive, negative, or neutral reactions.
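
Once each comment has a sentiment score, the overall result shown to the user can be derived by counting labels. The sketch below assumes scores in the range -1 to 1 and a small neutral band of ±0.05; both the range and the threshold are illustrative assumptions, and the model that produces the scores is outside the sketch.

```typescript
type SentimentLabel = "positive" | "negative" | "neutral";

// Scores within this distance of zero are treated as neutral (assumed threshold).
const NEUTRAL_BAND = 0.05;

function labelFor(score: number): SentimentLabel {
  if (score > NEUTRAL_BAND) {
    return "positive";
  }
  if (score < -NEUTRAL_BAND) {
    return "negative";
  }
  return "neutral";
}

interface SentimentSummary {
  positive: number;
  negative: number;
  neutral: number;
  positiveShare: number; // fraction of all comments labeled positive
  negativeShare: number; // fraction of all comments labeled negative
}

// Counts the labels and computes the positive and negative shares.
function summarizeSentiment(scores: number[]): SentimentSummary {
  const counts = { positive: 0, negative: 0, neutral: 0 };
  for (const score of scores) {
    counts[labelFor(score)] += 1;
  }
  const total = scores.length;
  return {
    ...counts,
    positiveShare: total === 0 ? 0 : counts.positive / total,
    negativeShare: total === 0 ? 0 : counts.negative / total,
  };
}
```

For scores of 0.8, 0.6, -0.4, and 0.0 the summary reports two positive comments, one negative, and one neutral, with positive and negative shares of 0.5 and 0.25.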

Interacting with AI Chatbot:

  • Users can engage with an AI-powered Chatbot to ask questions about the video content.
  • The Chatbot utilizes machine learning models to provide relevant answers and insights, enhancing users' understanding and engagement with the video.

Overview of User Interface

From the user's perspective, the project provides a comprehensive platform for summarizing YouTube videos, MP4 videos, MP3 audio, and Live meetings, along with other features. Here's how they can use the system:

  • Landing Page: Introduces the website and its features.
  • MyNotesAndSummaries Page: Displays pre-existing summaries of YouTube videos for demo purposes so that the user gets familiar with the interface.
  • Login/Signup Page: Allows users to log in to an existing account or create a new one.
  • Dashboard: Provides a user interface for submitting YouTube links, uploading video and audio files, and starting or stopping audio recording from a live meeting for summary generation.
  • Analytics Page: Shows the user the sentiment analysis results for the video, including the ratio of positive to negative comments.
  • Contact Us Page: Allows users to send inquiries or feedback to the website administrators.
  • Feedback Page: Enables users to provide feedback about the website's features and functionality.
  • My Notes Page: Displays saved summaries and notes.
  • Profile Page: Allows users to view and update their basic information and manage favorite summaries.
  • Chrome Extension: Integrated into the YouTube page. Shows transcript and summary of the currently playing video. Provides a note-making feature for the current video. Includes an "Ask AI" feature to ask questions about the video.

The feedback information displayed to the user includes:

  • Confirmation messages for successful actions (e.g., uploading a video, saving a note).
  • Error messages for unsuccessful actions (e.g., invalid login credentials, failed video upload).
  • Suggestions for improving the user experience based on their interactions with the platform.
  • Notifications about new features or updates to the platform.

Screen Images

Screen Image

Screen Objects and Actions

In the YouTube Video Analyzer interface, various screen objects facilitate user interaction with the system's features. Here's a discussion of key screen objects and the actions associated with them:

  • Input Field: An input area where users can enter the URL of the YouTube video, or attach the MP3/MP4 file, that they wish to analyze.
  • Buttons: Buttons labeled with descriptive text or icons representing different functionalities of the system.
  • Transcript Section: A section of the interface dedicated to displaying the transcript of the analyzed YouTube video/mp3/mp4 file/live meet audio file.
  • Summary Section: A section of the interface where a concise summary of the analyzed YouTube video/mp3/mp4 file/live meet audio file is displayed.
  • Notes Section: A section of the interface where users can create and view notes about the analyzed YouTube video/mp3/mp4 file/live meet audio file.
  • Sentiment Analysis Section: A section of the interface displaying the results of sentiment analysis performed on the comments and interactions associated with the analyzed YouTube video/mp3/mp4 file/live meet audio file.
  • Chatbot Interaction Section: A section of the interface where users can interact with an AI-powered Chatbot to ask questions about the analyzed YouTube video/mp3/mp4 file.
  • Feedback Messages: Messages or notifications displayed to users to provide feedback on their actions or the status of the system.
  • ContactUs Messages: Messages or notifications displayed to users regarding the inquiries they submit or the status of the system.

Overall, these screen objects and actions facilitate user interaction with the YouTube Video Analyzer interface, allowing users to access and utilize the system's features effectively and efficiently.

REQUIREMENTS MATRIX

The cross-reference table below aligns the specific system components and data structures with the functional requirements outlined in the SRS document. This tabular format helps in tracing how each component contributes to fulfilling the project's requirements.

| Requirement ID | Requirement Description | System Component/Data Structure | Component Description |
|---|---|---|---|
| 2.2.1 | Transcript Generation | Transcript Generation Service | Automatically generates transcripts from YouTube videos/mp3 file/mp4 file/live meet audio using speech-to-text models. Integrates with the video processing microservice and stores transcripts in the database for quick retrieval and summarization. |
| 2.2.2 | Video Summarizer | Video Summarization Service | Utilizes NLP and machine learning models to provide concise summaries of video content. Works on extracted transcripts and user-defined notes, summarizing key points for easy consumption. |
| 2.2.3 | Note-Maker | Note Management Component | Enables users to create, save, and manage notes within the application interface. Linked with user accounts and stored securely in the database, facilitating personalized content management. |
| 2.2.4 | Sentiment Analyzer | Sentiment Analysis Service | Analyzes comments and viewer interactions to assess sentiment, leveraging machine learning techniques. Results are used to enhance video recommendations and provide creators with feedback. |
| 2.2.5 | Ask AI | AI Chatbot Service | Provides an interactive AI chatbot that answers questions regarding video content. Integrates with the summarization and transcript services to fetch relevant information, enhancing user engagement. |
| 4.2 | Frontend Development | Web Application (ReactJS, Tailwind CSS) | Crafts the user interface for the web application, providing an engaging and responsive experience. Incorporates authentication and connects to backend services for dynamic content updates. |
| 4.3 | Backend Development | Express Server (Node.js) | Handles business logic, API requests, and server-client communication. Orchestrates data flow between the frontend, machine learning models, and the database. |
| 5.1 | Web Interface & Extension | Web Interface; Chrome Extension | The web interface allows for video URL input and displays analytics, while the Chrome extension integrates directly with YouTube for in-platform functionality. Both interfaces provide access to summarization, note-taking, and sentiment analysis features. |
| 5.3 | Client Server | Client-Server Architecture | |
| 5.4 | Database Usage | MongoDB Database | Stores user data, video summaries, notes, and analysis results. Structured for fast access and high availability to support the application's data needs. |
| 7.1 | User Interfaces | UI for Web Application and Chrome Extension | Both interfaces are designed for ease of use, accessibility, and functionality across devices and browsers. They provide direct access to the application's features, enhancing the YouTube viewing experience. |
| 7.2 | Software Interfaces | API Interactions between Components | Define the communication protocols between the frontend, backend, and external services (e.g., YouTube API, machine learning models). Ensures seamless data exchange and functionality integration. |
| 7.3 | Communication Interfaces | REST APIs, MongoDB, TensorFlow.js | Facilitate communication between the application's components. REST APIs connect the frontend to backend services. MongoDB integration enables data storage and retrieval. TensorFlow.js is used for running machine learning models in the backend. |