GlitchAI is a real-time speech-to-speech voice assistant. It uses natural language processing to understand your spoken requests and responds in a clear, human-like voice.
- Speech Recognition: Accurate and fast conversion of spoken words to text.
- Text-to-Speech: Natural and clear speech generation from text.
- Contextual Awareness: Understanding of conversation context for tailored responses.
- Entertainment: Storytelling, jokes, poetry, content recommendations, trip planning.
- Productivity: Task management and reminders.
- Speech-to-Text: OpenAI's Whisper base English model
- Language Model: Gemini 1.5 Flash integrated with LangChain
- Local Deployment: Ngrok
- Text-to-Speech: Edge-TTS
- Programming Language: Python
- Python 3.10.x
- Required libraries (install using pip install -r requirements.txt)
git clone https://github.com/nilotpal-basu/GlitchAI.git

- GOOGLE_GEMINI_KEY: Your Google Gemini API key
- NGROK_AUTHTOKEN: Your Ngrok authtoken
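
How these two values reach the code is not spelled out here. As a minimal sketch, assuming they are supplied as environment variables under exactly these names, a startup check could look like the following. The helper name `load_credentials` is hypothetical and not part of the project.

```python
import os

# Names taken from the setup notes above.
REQUIRED_VARS = ("GOOGLE_GEMINI_KEY", "NGROK_AUTHTOKEN")


def load_credentials(env=None):
    """Return the required credentials, or raise if any is missing or empty.

    Hypothetical helper: assumes the keys are provided as environment variables.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip stray whitespace that often sneaks in when keys are pasted.
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error raised later by the Gemini or Ngrok client.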
- Open Google Colab: Go to https://colab.google/.
- Upload the file: Upload the llm_server.ipynb file to Google Colab.
- Run the cells: Execute the cells in the notebook, following the instructions provided.
- Obtain Ngrok URL: The notebook will output the Ngrok URL once the tunnel is established.
- Copy that URL and paste it in main.py in place of your_public_ngrok_url.
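
A quick sanity check on the pasted URL can catch the case where the placeholder was never replaced. The sketch below assumes the URL is held in a module-level constant; the names `NGROK_URL` and `build_endpoint` and the `/chat` path are hypothetical illustrations, not the project's actual identifiers or route.

```python
from urllib.parse import urljoin, urlparse

# Placeholder from the setup steps; replace it with the URL printed by the notebook.
NGROK_URL = "your_public_ngrok_url"


def build_endpoint(base_url, path="/chat"):
    """Validate the tunnel URL and join it with an endpoint path.

    The "/chat" default is a hypothetical example route.
    """
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid public URL: {base_url!r}")
    # Normalize slashes so the base and the path join cleanly.
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
```

Calling `build_endpoint(NGROK_URL)` with the placeholder still in place raises `ValueError`, which is a clearer signal than a connection failure at request time.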
python main.py

Speak into your microphone to interact with GlitchAI. The assistant will respond verbally and in text.
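
One spoken exchange can be thought of as three stages: transcribe the audio, generate a reply, then speak it. The sketch below illustrates that flow under stated assumptions and is not the project's actual code. The `transcribe`, `generate_reply`, and `speak` callables are hypothetical stand-ins for the speech-to-text, language-model, and text-to-speech components.

```python
def run_turn(audio, transcribe, generate_reply, speak, history):
    """Handle one spoken exchange: audio in, text and speech out.

    All three callables are hypothetical stand-ins supplied by the caller.
    """
    user_text = transcribe(audio).strip()
    if not user_text:
        return None  # nothing intelligible was heard; skip this turn
    history.append(("user", user_text))
    # Passing the whole history lets the model take earlier turns into account.
    reply = generate_reply(history)
    history.append(("assistant", reply))
    print(f"GlitchAI: {reply}")  # text response
    speak(reply)  # spoken response
    return reply
```

Passing the components in as arguments keeps the loop easy to exercise with simple stubs, without a microphone, network access, or API keys.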
Contributions are welcome! Please follow these guidelines:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them.
- Submit a pull request to the main branch.
This project is licensed under the MIT License.
