app/README.md
Albert f60d61a78f feat: implement data layer with comprehensive test infrastructure
- Define SQLModel schemas for Session, Note, and Link entities
  - Add API request/response models for RPC endpoints
  - Create LLM structured output models for Zettel extraction
  - Set up async database initialization with SQLModel and aiosqlite
  - Implement repository pattern for CRUD operations
  - Add complete test suite with pytest configuration
  - Create validation test runner for development workflow
  - Add .gitignore for Python/FastAPI project security
2025-08-17 01:25:16 +00:00


# SkyTalk API (api.skytalk.app)
SkyTalk is a backend service designed to power an "AI for Thought" application.
It functions as an intelligent, conversational agent that interviews a user to
explore and solidify their nascent ideas. The conversation transcript is then
synthesized into a structured, semantically-linked knowledge base, styled after
the Zettelkasten method.

This MVP is an API-only service, providing the core engine for elicitation,
synthesis, and connection, without a frontend interface.

---
## Architecture
The system is a monolithic Python application built on FastAPI and designed for
asynchronous operation. It follows a three-layer architecture: API, Services,
and Data.
```d2
# SkyTalk API Architecture (MVP)
direction: down

# External APIs
ExternalAPIs: {
  shape: cloud
  style.fill: "#DB4437"
  Gemini: "Google Gemini API (2.5 Pro/Flash & Embeddings)"
}

# Main Application Container
SkyTalkAPI: {
  shape: rectangle
  style.fill: "#E3F2FD"
  direction: right

  # API Layer
  API: {
    shape: package
    label: "API Layer (FastAPI)"
    Endpoints: "RPC Endpoints"
    BackgroundTasks: "Background Tasks"
  }

  # Services Layer (Orchestration and Business Logic)
  Services: {
    shape: package
    SessionService: "Session Orchestration"
    InterviewerAgent: "Interviewer Agent (RAG)"
    SynthesizerAgent: "Synthesizer Agent"
    VectorService: "Vector Service"
  }

  # Data Layer
  Data: {
    shape: package
    label: "Data Layer (Async)"
    Repositories: "Session/Note Repositories"
    SQLite: {
      shape: database
      label: "Metadata (SQLite/SQLModel)"
    }
    ChromaDB: {
      shape: database
      label: "Vector Store (ChromaDB)"
    }
  }

  # Connections
  API.Endpoints -> Services.SessionService
  API.BackgroundTasks -> Services.SessionService: "Trigger Synthesis"
  Services.SessionService -> Services.InterviewerAgent
  Services.SessionService -> Services.SynthesizerAgent
  Services.InterviewerAgent -> Services.VectorService: "RAG Retrieval"
  Services.SynthesizerAgent -> Services.VectorService: "Indexing/Neighbors"
  Services.SessionService -> Data.Repositories
  Services.VectorService -> Data.ChromaDB
  Data.Repositories -> Data.SQLite
}

# Connections to External APIs
SkyTalkAPI.Services -> ExternalAPIs.Gemini
```
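
To make the layering in the diagram concrete, here is a minimal sketch of how a
request might pass from the API layer through a service to a repository. It uses
only the standard library so it runs on its own. The names
`InMemorySessionRepository`, `SessionService.start`, and
`start_session_endpoint`, as well as the `Session` fields, are illustrative
assumptions and are not taken from the project's code, which uses FastAPI,
SQLModel, and aiosqlite.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    # Hypothetical fields; the real SQLModel schema may differ.
    id: int
    topic: str
    transcript: List[str] = field(default_factory=list)


class InMemorySessionRepository:
    """Data layer stand-in (assumption): a dict instead of SQLite."""

    def __init__(self) -> None:
        self._rows: Dict[int, Session] = {}

    async def add(self, topic: str) -> Session:
        session = Session(id=len(self._rows) + 1, topic=topic)
        self._rows[session.id] = session
        return session

    async def get(self, session_id: int) -> Optional[Session]:
        return self._rows.get(session_id)


class SessionService:
    """Services layer: orchestration only, no HTTP or storage details."""

    def __init__(self, repo: InMemorySessionRepository) -> None:
        self._repo = repo

    async def start(self, topic: str) -> Session:
        session = await self._repo.add(topic)
        # Placeholder for the interviewer agent's first question.
        session.transcript.append(f"AI: What draws you to {topic}?")
        return session


async def start_session_endpoint(service: SessionService, topic: str) -> dict:
    """API layer: turn a request into a service call and a response body."""
    session = await service.start(topic)
    return {"session_id": session.id, "opening_question": session.transcript[-1]}


if __name__ == "__main__":
    service = SessionService(InMemorySessionRepository())
    print(asyncio.run(start_session_endpoint(service, "spaced repetition")))
```

Because each layer only talks to the one below it, the in-memory repository
could be swapped for a database-backed one without touching the endpoint
function.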
---
## MVP Features (Phase 1)
Once this implementation plan is complete, the SkyTalk API will support the
following core features:
- **Session Management:** Start a new interview session based on an initial
topic.
- **RAG-Powered Interviewing:** Engage in a back-and-forth conversation where
the AI's questions are informed by existing knowledge in the vector store.
- **Automatic Session Termination:** The AI can detect a natural conclusion to
the conversation.
- **Asynchronous Synthesis:** Once the interview ends, a background process is
triggered to analyze the transcript.
- **Semantic Segmentation:** The transcript is intelligently broken down into
atomic "Zettels" (notes), each focusing on a single concept.
- **Vector Indexing:** Each new note is converted into a vector embedding and
stored for future RAG.
- **Generative Linking:** The system identifies semantically related notes and
uses an LLM to generate a rich, contextual link explaining the relationship
between them.
- **Status Tracking:** Endpoints to check the status of a session (active,
processing, completed).
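
The features above imply a session lifecycle of active, then processing, then
completed, with synthesis running in the background. The sketch below
illustrates that flow under loud assumptions: `asyncio.create_task` stands in
for a FastAPI background task, one note per transcript turn stands in for
LLM-based segmentation, and a toy bag-of-words cosine similarity stands in for
Gemini embeddings and ChromaDB neighbor lookup. All names and the `0.3`
threshold are hypothetical.

```python
import asyncio
import math
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class SessionStatus(str, Enum):
    ACTIVE = "active"
    PROCESSING = "processing"
    COMPLETED = "completed"


@dataclass
class Session:
    transcript: List[str]
    status: SessionStatus = SessionStatus.ACTIVE
    notes: List[str] = field(default_factory=list)
    # (index of note A, index of note B, similarity score)
    links: List[Tuple[int, int, float]] = field(default_factory=list)


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a stand-in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


async def synthesize(session: Session, threshold: float = 0.3) -> None:
    # Segmentation stand-in: one note per non-empty transcript turn.
    session.notes = [turn for turn in session.transcript if turn]
    vectors = [embed(note) for note in session.notes]
    # Link every pair of notes whose similarity reaches the threshold.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                session.links.append((i, j, score))
    session.status = SessionStatus.COMPLETED


async def end_interview(session: Session) -> asyncio.Task:
    # Mark the session as processing and hand synthesis to a background task.
    session.status = SessionStatus.PROCESSING
    return asyncio.create_task(synthesize(session))
```

A status endpoint would simply return `session.status.value` while the task
runs; in the real service the generated link would also carry an LLM-written
explanation rather than only a score.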
---
## Next Steps (Post-MVP)
- **Authentication:** Implement user accounts and authentication (e.g., JWT) to
create user-specific knowledge bases.
- **Frontend Integration:** Build a web-based frontend (e.g., at
www.skytalk.app) that consumes this API.
- **Knowledge Graph Visualization:** Add endpoints to export the note-and-link
structure in a format suitable for graph visualization libraries (e.g., D3.js,
Vis.js).
- **Note Editing and Management:** Provide endpoints for users to manually edit,
delete, or merge notes.
- **Advanced Search:** Implement search capabilities beyond simple semantic
search, such as filtering by tags or searching within link contexts.
- **Scalable Infrastructure:** Migrate from SQLite and embedded ChromaDB to
production-grade storage: a server database (e.g., PostgreSQL with pgvector)
and, where needed, a managed vector database.
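
As an illustration of the knowledge graph export item above, the following
sketch converts notes and links into the `nodes`/`links` shape commonly fed to
D3.js force-directed layouts. The `Note` and `Link` fields and both function
names are assumptions made for this example; the project's actual models and
eventual export format may differ.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    # Hypothetical fields, not the project's schema.
    id: int
    title: str


@dataclass
class Link:
    # Hypothetical fields, not the project's schema.
    source_id: int
    target_id: int
    context: str


def export_graph(notes: List[Note], links: List[Link]) -> dict:
    """Build a nodes/links dictionary, skipping links to unknown notes."""
    known_ids = {note.id for note in notes}
    return {
        "nodes": [{"id": note.id, "label": note.title} for note in notes],
        "links": [
            {"source": link.source_id, "target": link.target_id, "context": link.context}
            for link in links
            if link.source_id in known_ids and link.target_id in known_ids
        ],
    }


def export_graph_json(notes: List[Note], links: List[Link]) -> str:
    """Serialize the graph for an HTTP response body."""
    return json.dumps(export_graph(notes, links), indent=2)
```

Dropping dangling links at export time keeps the payload loadable by graph
libraries that expect every edge endpoint to exist as a node.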