CodeCanvas converts visual flowchart logic into executable code. Its Structured Logic Compiler approach goes beyond generic image-to-text translation by removing ambiguity from the flowchart's flow: Structural Certainty vs. Guesswork. Instead of guessing the order of steps from visual proximity, CodeCanvas uses a custom YOLOv8 model to identify the structural grammar of the diagram, namely the shapes and the arrows. This structure is compiled into a formal JSON Graph (Nodes & Edges) that explicitly defines connections such as "Node 5 connects to Node 7," so the generated code follows the connections drawn in the flowchart rather than an inferred reading order.

Robust, High-Accuracy Multimodal Grounding addresses messy handwriting directly. A hybrid vision pipeline pairs Customized EasyOCR for text extraction with the Gemini 2.5 Flash model, which acts as an AI Proofreader: it corrects likely OCR typos (e.g., fixing "Leave hom" to "Leave home") and uses its vision capability to interpret complex mathematical symbols.

Finally, CodeCanvas is delivered as a Complete, Deployable Engineering Product. Instant Language Translation lets users switch the output code between Python, Java, C++, and C in real time without rerunning the heavy analysis. The integrated Code Assistant Chat provides Interactive Refinement, guiding users as they modify the code (e.g., "add error handling" or "optimize the loop") and turning the tool into an interactive, guided learning experience.

CodeCanvas does not merely translate: it digitizes logic, corrects OCR errors, and guides the user toward a working solution, which makes it well suited to education and prototyping.
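As a rough illustration of the JSON Graph idea, the sketch below shows one possible shape for the nodes and edges of a small flowchart. The field names (`id`, `shape`, `text`, `source`, `target`, `label`) and the helper `successors` are assumptions made for this example; the project's actual schema may differ.

```python
import json

# Hypothetical graph for a tiny flowchart. The schema is an assumption
# for illustration, not the format CodeCanvas necessarily emits.
flowchart = {
    "nodes": [
        {"id": 1, "shape": "terminator", "text": "Start"},
        {"id": 2, "shape": "decision", "text": "Is it raining?"},
        {"id": 3, "shape": "process", "text": "Take umbrella"},
        {"id": 4, "shape": "process", "text": "Leave home"},
        {"id": 5, "shape": "terminator", "text": "End"},
    ],
    "edges": [
        {"source": 1, "target": 2, "label": None},
        {"source": 2, "target": 3, "label": "Yes"},
        {"source": 2, "target": 4, "label": "No"},
        {"source": 3, "target": 4, "label": None},
        {"source": 4, "target": 5, "label": None},
    ],
}


def successors(graph, node_id):
    """Return (target_id, label) pairs for every edge leaving node_id."""
    return [
        (edge["target"], edge["label"])
        for edge in graph["edges"]
        if edge["source"] == node_id
    ]


# The graph round-trips through JSON, so it can be stored or embedded in a prompt.
serialized = json.dumps(flowchart, indent=2)
restored = json.loads(serialized)

print(successors(restored, 2))  # [(3, 'Yes'), (4, 'No')]
```

Because every connection is an explicit edge, the branches of the decision node are read from the data instead of being inferred from where the shapes sit on the page.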
| Package/Language | Version/Link |
|---|---|
| Python | 3.11.x |
| Streamlit | 1.27.x |
| ultralytics (YOLOv8) | Any recent version |
| easyocr | Any recent version |
| google-genai | Any recent version |
| opencv-python | Any recent version |
git clone https://github.com/AAC-Open-Source-Pool/25AACL03
cd 25AACL03
pip install -r requirements.txt
Run the main entry point to start the Streamlit application: streamlit run Home.py
Team Number: 25AACL03
Senior Mentor: Meghana
Junior Mentor: Lahari
Team Member 1: S L P Srinishpa Gandhalu
Team Member 2: Varnika Mishra
This section provides instructions and details on how to submit a contribution via a pull request. It is important to follow these guidelines to make sure your pull request is accepted.
- Before proposing changes, read the project's readme.md file to understand the philosophy and motivation behind the project. Your pull request should align with the intent of the original author.
- Write your changes in the same programming language as the project. The versions of the language and of any libraries you use should also match those of the original code.
- Document the changes you are proposing. The documentation should include any problems you noticed in the code, the changes you would like to make, the reasons for them, and sample test cases. These topics are a minimum, not a limit; include anything else that is relevant.
- Submit the pull request following standard Git etiquette.
- YOLO Model Generalization: Expand the custom YOLO dataset to include highly variable inputs, such as complex intersections, rough hand-drawn shapes, and flowcharts with varying line thicknesses.
- Graph Topology Validation: Implement explicit graph-theory algorithms (e.g., cycle detection, connectivity checks) to confirm the flow logic derived from the arrows before passing it to the LLM.
- Improved OCR Error Handling: Introduce conditional image processing (e.g., dynamic contrast adjustment) specifically on failed OCR regions to enhance text extraction from poor-quality images.
- Code Assistant Features: Add context-aware capabilities to the chat assistant, such as suggesting unit tests based on the extracted flowchart logic.
- Webcam Stabilization: Enhance the live camera input feature to include image stabilization or capture multiple frames to improve clarity for detection and OCR.
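
The Graph Topology Validation item above could be approached along the following lines. This is a minimal sketch, assuming the graph is available as a list of node ids plus `(source, target)` edge pairs whose endpoints all appear in the node list; the function names `build_adjacency`, `unreachable_nodes`, and `has_cycle` are hypothetical and not part of the existing codebase.

```python
from collections import deque


def build_adjacency(nodes, edges):
    """Map each node id to the list of node ids it points to."""
    adjacency = {node: [] for node in nodes}
    for source, target in edges:
        adjacency[source].append(target)
    return adjacency


def unreachable_nodes(nodes, edges, start):
    """Breadth-first search from start; return the nodes that were never visited."""
    adjacency = build_adjacency(nodes, edges)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(set(nodes) - seen)


def has_cycle(nodes, edges):
    """Kahn's algorithm: the graph has a cycle if a topological sort cannot consume every node."""
    adjacency = build_adjacency(nodes, edges)
    indegree = {node: 0 for node in nodes}
    for _, target in edges:
        indegree[target] += 1
    queue = deque(node for node in nodes if indegree[node] == 0)
    processed = 0
    while queue:
        current = queue.popleft()
        processed += 1
        for nxt in adjacency[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return processed != len(nodes)


# Example: nodes 2 and 3 form a loop, and node 5 has no incoming arrow.
example_nodes = [1, 2, 3, 4, 5]
example_edges = [(1, 2), (2, 3), (3, 2), (2, 4)]

print(has_cycle(example_nodes, example_edges))             # True
print(unreachable_nodes(example_nodes, example_edges, 1))  # [5]
```

In a flowchart a cycle is not necessarily an error, since it usually corresponds to a loop, so a check like this is probably most useful for deciding how to describe the flow to the LLM. An unreachable node, on the other hand, often points to a missed arrow detection.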
