A domain-specific chatbot based on a fine-tuned LLM and RAG, designed to answer questions about marine science. This open-source project fine-tunes large language models on marine biology, oceanography, and ocean ecosystems. Development is iterative: each version improves the model by refining the techniques used in the previous one.
Built from open-source marine content:
- Wikipedia (CC BY-SA 4.0) — marine biology & oceanography articles
Marine science knowledge is often scattered across many sources and can be difficult to access. oceanAI brings it together and makes it easy to explore through conversation 🌍
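
For readers new to RAG, here is a minimal sketch of the retrieval step that sits in front of a fine-tuned model. It is not the project's actual pipeline: the passages, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the bag-of-words scoring are all assumptions chosen to keep the example dependency-free. A real setup would more likely use dense embeddings and a vector index.

```python
import math
import re
from collections import Counter

# Hypothetical toy passages standing in for indexed marine articles.
PASSAGES = [
    "Coral reefs are built by colonies of coral polyps held together by calcium carbonate.",
    "Thermohaline circulation is driven by differences in water temperature and salinity.",
    "Phytoplankton are microscopic organisms that form the base of most marine food webs.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble retrieved context and the question into a prompt string."""
    context = "\n".join(retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("What drives thermohaline circulation?", PASSAGES))
```

The resulting prompt string would then be passed to the language model, so the answer is grounded in retrieved text and does not rely only on what the model memorized during fine-tuning.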
Code: MIT — Dataset: CC BY-SA 4.0