Specter: From Brainstorm to Code
EPFL - LauzHack Hackathon 2025
Team: Matthias Wyss, Yassine Wahidy, William Jallot, Lina Sadgal
GitHub Repository: GitHub
Specter is a web application designed to bridge the gap between ideation and implementation. Developed during the LauzHack 2025 hackathon, it transforms raw brainstorming materials (specifically whiteboard photos and audio explanations) into a complete, runnable project scaffold in seconds.
By leveraging a multi-agent pipeline, Specter automates the tedious phase of manual requirement extraction and initial system design, allowing developers to focus purely on high-level creativity.
How It Works
- Multimodal Input Capture: Captures whiteboard sketches and spoken constraints in high quality using professional hardware (Logitech MX Brio).
- Multi-Agent Orchestration: Utilizing Oracle's Agent Spec and WayFlow, the system coordinates specialized agents:
  - Vision Agent: Interprets complex diagrams and handwritten notes.
  - Audio & Semantic Agent: Transcribes and extracts key constraints from speech.
  - Architect & Coder Agents: Generate modular, idiomatic backend/frontend code and system design.
- AI Reasoning: Gemini (LLM + Vision) performs cross-modal reasoning so that the generated code stays consistent with both the visual sketch and the verbal instructions.
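The agent hand-off described above can be sketched in plain Python. This is only an illustration of the flow, not Specter's actual implementation: it does not use Agent Spec, WayFlow, or Gemini, and every name in it (`SessionState`, `vision_agent`, `run_pipeline`, the hard-coded outputs) is a hypothetical stand-in for a model-backed step.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SessionState:
    """Shared state handed from one agent to the next (hypothetical structure)."""
    whiteboard_image: bytes
    audio_clip: bytes
    diagram_notes: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    components: list[str] = field(default_factory=list)
    files: dict[str, str] = field(default_factory=dict)


Agent = Callable[[SessionState], SessionState]


def vision_agent(state: SessionState) -> SessionState:
    # Stub: a real agent would send the image to a vision model.
    state.diagram_notes = ["client -> api", "api -> database"]
    return state


def audio_agent(state: SessionState) -> SessionState:
    # Stub: a real agent would transcribe the clip and extract constraints.
    state.constraints = ["use Python for the backend"]
    return state


def architect_agent(state: SessionState) -> SessionState:
    # Derive the component list from the arrows found on the whiteboard.
    names = {
        part.strip()
        for note in state.diagram_notes
        for part in note.split("->")
    }
    state.components = sorted(names)
    return state


def coder_agent(state: SessionState) -> SessionState:
    # Emit one placeholder file per component, carrying the spoken constraints.
    notes = "\n".join(f"- {c}" for c in state.constraints)
    for component in state.components:
        state.files[f"{component}/README.md"] = f"# {component}\n\n{notes}\n"
    return state


DEFAULT_AGENTS: list[Agent] = [vision_agent, audio_agent, architect_agent, coder_agent]


def run_pipeline(state: SessionState, agents: list[Agent]) -> SessionState:
    """Run the agents in order, each one enriching the shared state."""
    for agent in agents:
        state = agent(state)
    return state


if __name__ == "__main__":
    result = run_pipeline(SessionState(whiteboard_image=b"", audio_clip=b""), DEFAULT_AGENTS)
    for path in sorted(result.files):
        print(path)
```

Passing a single state object through an ordered list of agents keeps each step independently replaceable, which is the property an orchestration framework would provide in a real deployment.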
Key Features
- Instant Prototyping: Generates structured project specifications and architecture proposals from a single photo.
- Modular Codebase: Generates production-ready, functional files rather than isolated snippets.
- Automated Documentation: Tailors developer notes and deployment instructions to the specific project structure identified.
- Deployment Ready: Includes optional scaffolding for rapid deployment right after the brainstorming session.
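To make "project scaffold" concrete, here is a minimal sketch of how generated files might be written to disk. It assumes the coder step returns a mapping from relative paths to file contents; that format, and the `write_scaffold` helper itself, are assumptions for illustration rather than Specter's actual output layer.

```python
from pathlib import Path


def write_scaffold(root: Path, files: dict[str, str]) -> list[Path]:
    """Write generated files under `root` and return the paths created.

    Paths that would escape `root` are rejected, since the file names
    originate from model output and should not be trusted blindly.
    """
    root = root.resolve()
    written: list[Path] = []
    for relative_name, content in sorted(files.items()):
        target = (root / relative_name).resolve()
        if not target.is_relative_to(root):  # requires Python 3.9+
            raise ValueError(f"refusing to write outside scaffold root: {relative_name}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    import tempfile

    # Hypothetical generated output: relative path -> file content.
    generated = {
        "backend/app.py": "print('hello from the backend')\n",
        "README.md": "# Generated project\n",
    }
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_scaffold(Path(tmp), generated):
            print(path.relative_to(Path(tmp).resolve()))
```

Resolving each target before writing is a deliberate choice: it catches names such as `../outside.txt` or absolute paths before any file is created.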
Tools & Libraries:
- Gemini (Pro & Vision): The core engine for multimodal understanding and reasoning.
- Oracle Agent Spec & WayFlow: Frameworks used to manage agent communication and state flow.
- Backend: Python-based API for pipeline orchestration.
- Frontend: Responsive web interface for rapid interaction and real-time output visualization.
Techniques:
- Multi-Agent Systems (MAS)
- Computer Vision (Handwriting & Diagram Recognition)
- Natural Language Processing (Semantic Extraction)
- Automated Software Engineering (ASE)