Next Steps

Task 7: Roadmap for future improvements to the application.

Keeping Dense Vector Retrieval?

Yes, with enhancements. The current dense vector retrieval with text-embedding-3-small and Qdrant provides a strong baseline for our business knowledge use case. The documents are domain-specific and well-structured, so dense embeddings capture semantic similarity effectively. However, for Demo Day we plan to add a hybrid retrieval layer that combines dense vectors with BM25 sparse retrieval to handle exact-match queries (e.g., specific financial terms, product names) that pure semantic search can miss.

Future Roadmap

Hybrid Retrieval — Add BM25 sparse retrieval alongside dense vectors using Qdrant's built-in hybrid search. This will improve recall for exact-match queries and financial terminology lookups.
Persistent Vector Store — Move from in-memory Qdrant to hosted Qdrant Cloud for persistence across deployments and larger knowledge bases.
All 7 Sub-Agents — Complete implementation of Sensitivity Analysis, What-If, and Forecast Projection agents with full playbooks and tool sets.
Streaming Responses — Wire up the chat_stream function to deliver token-by-token responses in the UI for a more responsive experience.
Multi-Model Support — Allow connecting multiple Google Sheets financial models simultaneously for cross-brand analysis.
Expanded Knowledge Base — Ingest additional domain documents including TikTok Shop guides, retail expansion playbooks, and CPG-specific financial benchmarks.
Evaluation CI Pipeline — Integrate RAGAS evaluations into CI/CD so every code change is automatically tested against the golden dataset.
User Authentication — Add user accounts with per-user memory and conversation history persistence.
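
To make the hybrid retrieval item concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a dense ranking and a BM25 ranking into a single result list. It is plain Python and does not call Qdrant; the document IDs, the function name reciprocal_rank_fusion, and the constant k=60 are illustrative assumptions, not part of the current codebase.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked highly by both retrievers rise to the top. k=60 is a commonly
    used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one query (IDs are made up for the example).
dense_hits = ["doc_margin_guide", "doc_pricing", "doc_cac_ltv"]
sparse_hits = ["doc_ebitda_glossary", "doc_margin_guide", "doc_pricing"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

A document that only BM25 finds (here the hypothetical glossary entry) still makes the fused list, which is the behavior we want for exact-match financial terms.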

Architecture Evolution

The current architecture is designed to be modular. Each sub-agent is a self-contained node in the LangGraph with its own tools, playbook, and memory access. Adding new agents requires only: (1) defining the playbook prompt, (2) binding the tools, (3) adding the node and edges to the graph, and (4) updating the supervisor's routing logic.
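
The four steps can be sketched with a small stand-in for the graph. This is plain Python, not the LangGraph API: SubAgent, AgentGraph, the keyword-based router, and the project_revenue tool are all hypothetical names used only to show where each step lands. The real supervisor presumably routes with an LLM rather than keywords.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SubAgent:
    name: str
    playbook: str                                         # step 1: playbook prompt
    tools: List[Callable] = field(default_factory=list)   # step 2: bound tools
    keywords: List[str] = field(default_factory=list)     # toy routing hints


class AgentGraph:
    """Toy stand-in for the agent graph: nodes, edges, and a supervisor router."""

    def __init__(self) -> None:
        self.nodes: Dict[str, SubAgent] = {}
        self.edges: Dict[str, str] = {}

    def add_agent(self, agent: SubAgent) -> None:
        # Step 3: register the node and its outgoing edge back to END.
        self.nodes[agent.name] = agent
        self.edges[agent.name] = "END"

    def route(self, query: str) -> str:
        # Step 4: supervisor routing. Keyword matching is an assumption
        # made to keep the sketch self-contained.
        lowered = query.lower()
        for name, agent in self.nodes.items():
            if any(keyword in lowered for keyword in agent.keywords):
                return name
        return "END"


def project_revenue(base: float, growth_rate: float, periods: int) -> List[float]:
    """Hypothetical tool: compound a base revenue figure over several periods."""
    return [base * (1 + growth_rate) ** t for t in range(1, periods + 1)]


graph = AgentGraph()
graph.add_agent(
    SubAgent(
        name="forecast_projection",
        playbook="You project revenue and costs forward from the current model.",
        tools=[project_revenue],
        keywords=["forecast", "projection"],
    )
)

chosen = graph.route("Give me a 12-month forecast for revenue")
print(chosen)
```

Because routing reads from the registered agents, adding an agent in this sketch needs no change to the router itself; in the real graph, step 4 remains a separate edit to the supervisor.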

For future improvements, the biggest structural change will be moving from Vercel serverless functions (which have execution time limits) to a dedicated backend service for long-running goal-seek optimizations that may take 30 seconds or more.
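
One pattern a dedicated backend makes possible is submit-then-poll: the request returns a job ID immediately, and the UI checks back until the optimization finishes. The sketch below shows the idea in-process with the standard library only. JobQueue, goal_seek, and revenue are hypothetical names, the bisection solver is a simple stand-in for the real goal-seek logic, and a production service would likely use a persistent task queue instead of an in-memory dictionary.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobQueue:
    """In-memory job runner: submit returns at once, status is polled later."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, fn, *args):
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id):
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "running"}
        if future.exception() is not None:
            return {"state": "failed", "error": str(future.exception())}
        return {"state": "done", "result": future.result()}


def goal_seek(target, lo, hi, f, tol=1e-6):
    """Bisection stand-in for goal seek.

    Assumes f is increasing on [lo, hi] and f(lo) <= target <= f(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def revenue(units):
    # Hypothetical model: revenue at a fixed price of 25.0 per unit.
    return units * 25.0


queue = JobQueue()
job_id = queue.submit(goal_seek, 50_000.0, 0.0, 10_000.0, revenue)

# The client polls instead of holding one long request open.
while queue.status(job_id)["state"] == "running":
    time.sleep(0.01)

result = queue.status(job_id)
print(result["state"], round(result["result"], 2))
```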