👻 Specter: From Brainstorm to Code

📍 EPFL – LauzHack Hackathon 2025
👥 Team: Matthias Wyss, Yassine Wahidy, William Jallot, Lina Sadgal
🔗 GitHub Repository: GitHub

Specter is a web application that bridges the gap between ideation and implementation. Developed during the LauzHack 2025 hackathon, it turns raw brainstorming materials (whiteboard photos and audio explanations) into a complete, runnable project scaffold in seconds.

Using a multi-agent pipeline, Specter automates the tedious work of manual requirement extraction and initial system design, so developers can focus on high-level creative work.

🔍 How It Works

Multimodal Input Capture: Sketches and spoken constraints are captured in high quality with professional hardware (Logitech MX Brio).
Multi-Agent Orchestration: Using Oracle’s Agent Spec and WayFlow, the system coordinates specialized agents:
- Vision Agent: Interprets complex diagrams and handwritten notes.
- Audio & Semantic Agent: Transcribes and extracts key constraints from speech.
- Architect & Coder Agents: Generate modular, idiomatic backend/frontend code and system design.
AI Reasoning: Gemini (LLM + Vision) performs cross-modal reasoning so that the generated code matches both the visual sketch and the verbal instructions.
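
The flow described above can be sketched as a sequence of agents that read from and write to a shared state. This is a minimal illustration only, not Specter's actual implementation: it does not use the Agent Spec, WayFlow, or Gemini APIs, and every name in it (`BrainstormInput`, `PipelineState`, `run_pipeline`, the four agent functions) is hypothetical. Simple string rules stand in for the model calls, and the photo and audio are assumed to have already been turned into text.

```python
from dataclasses import dataclass, field


@dataclass
class BrainstormInput:
    """Raw material from one session (both fields are text placeholders)."""
    whiteboard_notes: str  # stand-in for what a vision model reads off the photo
    transcript: str        # stand-in for the speech-to-text output


@dataclass
class PipelineState:
    """Shared state that each agent reads from and adds to."""
    components: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    architecture: dict[str, list[str]] = field(default_factory=dict)
    files: dict[str, str] = field(default_factory=dict)


def vision_agent(inp: BrainstormInput, state: PipelineState) -> None:
    # Assumption: each non-empty line of the notes names one component.
    state.components = [
        line.strip() for line in inp.whiteboard_notes.splitlines() if line.strip()
    ]


def audio_agent(inp: BrainstormInput, state: PipelineState) -> None:
    # Assumption: a sentence containing "must" or "should" is a constraint.
    sentences = [s.strip() for s in inp.transcript.split(".") if s.strip()]
    state.constraints = [
        s for s in sentences
        if any(word in s.lower().split() for word in ("must", "should"))
    ]


def architect_agent(inp: BrainstormInput, state: PipelineState) -> None:
    # Attach every extracted constraint to every component (deliberately naive).
    state.architecture = {name: list(state.constraints) for name in state.components}


def coder_agent(inp: BrainstormInput, state: PipelineState) -> None:
    # Emit one stub module per component, with its constraints as comments.
    for name, constraints in state.architecture.items():
        module = name.lower().replace(" ", "_")
        notes = "\n".join(f"# constraint: {c}" for c in constraints)
        state.files[f"{module}.py"] = f'"""{name} module (generated stub)."""\n{notes}\n'


def run_pipeline(inp: BrainstormInput) -> PipelineState:
    """Run the agents in a fixed order over one shared state object."""
    state = PipelineState()
    for agent in (vision_agent, audio_agent, architect_agent, coder_agent):
        agent(inp, state)
    return state
```

Calling `run_pipeline(BrainstormInput("API Gateway\nUser Store", "The API must use REST."))` would yield two stub files, `api_gateway.py` and `user_store.py`, each carrying the extracted constraint as a comment. Keeping the agents as plain functions over one state object makes the order of the pipeline explicit and each step easy to swap for a real model call.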

🚀 Key Features

Instant Prototyping: Generates structured project specifications and architecture proposals from a single photo.
Modular Codebase: Produces production-ready, functional files rather than just snippets.
Automated Documentation: Tailors developer notes and deployment instructions to the project structure it identifies.
Deployment Ready: Includes optional scaffolding for rapid deployment right after the brainstorming session.
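
As a rough sketch of how generated files and tailored documentation could end up on disk, the snippet below writes a mapping of relative paths to file contents into a project folder and adds a README that lists what was generated. It is an assumption-laden illustration, not the project's code: `render_readme` and `write_scaffold` are hypothetical names, and the real output layout is not specified in this document.

```python
from pathlib import Path


def render_readme(project_name: str, files: dict[str, str]) -> str:
    """Build a short README that lists the generated files (hypothetical format)."""
    lines = [f"# {project_name}", "", "Generated files:"]
    lines += [f"- {path}" for path in sorted(files)]
    return "\n".join(lines) + "\n"


def write_scaffold(root: Path, project_name: str, files: dict[str, str]) -> list[Path]:
    """Write every generated file plus a README under `root`.

    `files` maps relative paths (e.g. "backend/app.py") to file contents.
    Returns the written paths, sorted by relative path.
    """
    all_files = dict(files)
    all_files["README.md"] = render_readme(project_name, files)

    written = []
    for rel_path, content in sorted(all_files.items()):
        target = root / rel_path
        # Create intermediate folders such as backend/ or frontend/ on demand.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

For example, `write_scaffold(Path("out"), "demo", {"backend/app.py": "..."})` would create `out/backend/app.py` and `out/README.md`. Deriving the README from the same mapping keeps the documentation in step with the files that were actually produced.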

🛠 Tools & Libraries

Gemini (Pro & Vision): The core engine for multimodal understanding and reasoning.
Oracle Agent Spec & WayFlow: Frameworks used to manage agent communication and state flow.
Backend: Python-based API for pipeline orchestration.
Frontend: Responsive web interface for rapid interaction and real-time output visualization.

🧠 Techniques

Multi-Agent Systems (MAS)
Computer Vision (Handwriting & Diagram Recognition)
Natural Language Processing (Semantic Extraction)
Automated Software Engineering (ASE)