feat: Improve UI layout and navigation

- Increase logo size (48x48 desktop, 56x56 mobile) for better visibility
- Add logo as favicon
- Add logo to mobile header
- Move user menu to navigation bars (sidebar on desktop, bottom bar on mobile)
- Fix desktop chat layout - container structure prevents voice controls cutoff
- Fix mobile bottom bar - use icon-only ActionIcons instead of truncated text buttons
- Hide Create Node/New Conversation buttons on mobile to save header space
- Make fixed header and voice controls work properly with containers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
144
docs/voice-mode-implementation-plan.md
Normal file
@@ -0,0 +1,144 @@
# Voice Mode Implementation Plan

## Phase 1: Clean State Machine

### Step 1: Rewrite state machine definition
- Remove all unnecessary complexity
- Clear state hierarchy
- Simple event handlers
- Proper tags on all states
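
As a rough illustration of what a flat, tagged definition could look like, here is a library-free sketch. It is not the project's actual machine: only `aiGenerating`, `aiSpeaking`, the `processing` and `canSkipAudio` tags, and the five events named in this plan come from the document. The other state names, the `AUDIO_STARTED` and `AUDIO_FINISHED` events, and every transition are assumptions made for the example.

```typescript
// Library-free sketch of a flat voice-mode state machine.
// Assumed: state names idle/listening/userSpeaking, the AUDIO_* events,
// and all transitions. Only the tags and the other events come from the plan.
type VoiceState =
  | 'idle' | 'listening' | 'userSpeaking'
  | 'processing' | 'aiGenerating' | 'aiSpeaking';

type VoiceEvent =
  | 'START_LISTENING' | 'USER_STARTED_SPEAKING' | 'SILENCE_TIMEOUT'
  | 'AI_RESPONSE_READY' | 'SKIP_AUDIO'
  | 'AUDIO_STARTED' | 'AUDIO_FINISHED'; // hypothetical events

type VoiceTag = 'processing' | 'canSkipAudio';

interface StateNode {
  tags: VoiceTag[];
  on: Partial<Record<VoiceEvent, VoiceState>>;
}

// One entry per state: its tags and the events it handles.
const voiceMachine: Record<VoiceState, StateNode> = {
  idle: { tags: [], on: { START_LISTENING: 'listening' } },
  listening: { tags: [], on: { USER_STARTED_SPEAKING: 'userSpeaking' } },
  userSpeaking: { tags: [], on: { SILENCE_TIMEOUT: 'processing' } },
  processing: { tags: ['processing'], on: { AI_RESPONSE_READY: 'aiGenerating' } },
  aiGenerating: {
    tags: ['canSkipAudio'],
    on: { AUDIO_STARTED: 'aiSpeaking', SKIP_AUDIO: 'listening' },
  },
  aiSpeaking: {
    tags: ['canSkipAudio'],
    on: { AUDIO_FINISHED: 'listening', SKIP_AUDIO: 'listening' },
  },
};

// Returns the next state, or the current state if the event is not handled.
function transition(state: VoiceState, event: VoiceEvent): VoiceState {
  return voiceMachine[state].on[event] ?? state;
}

// Mirrors the hasTag() check used by the processing effect.
function hasTag(state: VoiceState, tag: VoiceTag): boolean {
  return voiceMachine[state].tags.indexOf(tag) !== -1;
}
```

Keeping the definition as plain data means it stays serializable, which is what the Phase 5 checklist asks for.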

### Step 2: Add test buttons to UI
- Button: "Skip to Listening" - sends START_LISTENING
- Button: "Simulate User Speech" - sends USER_STARTED_SPEAKING
- Button: "Simulate Silence" - sends SILENCE_TIMEOUT
- Button: "Simulate AI Response" - sends AI_RESPONSE_READY with test data
- Button: "Skip Audio" - sends SKIP_AUDIO (already exists)
- Display: Current state value and tags
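
One possible way to keep the test buttons declarative is a small table that maps each label to the event it sends. The sketch below is framework-free and only illustrative: `TEST_BUTTONS`, `makeClickHandler`, `describeState`, and the test payload values are hypothetical names and data. The `{ messageId, text }` payload shape follows the `send` call in the Phase 2 implementation.

```typescript
// Events the test buttons can send. The AI_RESPONSE_READY payload shape
// follows the send() call in Phase 2; everything else here is hypothetical.
type TestEvent =
  | { type: 'START_LISTENING' }
  | { type: 'USER_STARTED_SPEAKING' }
  | { type: 'SILENCE_TIMEOUT' }
  | { type: 'AI_RESPONSE_READY'; messageId: string; text: string }
  | { type: 'SKIP_AUDIO' };

interface TestButton {
  label: string;
  event: TestEvent;
}

// One entry per button listed in Step 2. The test data values are made up.
const TEST_BUTTONS: TestButton[] = [
  { label: 'Skip to Listening', event: { type: 'START_LISTENING' } },
  { label: 'Simulate User Speech', event: { type: 'USER_STARTED_SPEAKING' } },
  { label: 'Simulate Silence', event: { type: 'SILENCE_TIMEOUT' } },
  {
    label: 'Simulate AI Response',
    event: { type: 'AI_RESPONSE_READY', messageId: 'test-message-1', text: 'This is a test response.' },
  },
  { label: 'Skip Audio', event: { type: 'SKIP_AUDIO' } },
];

// Builds an onClick handler for any send function, e.g. the machine's send.
function makeClickHandler(button: TestButton, send: (event: TestEvent) => void): () => void {
  return () => send(button.event);
}

// Formats the "current state value and tags" display line.
function describeState(value: string, tags: string[]): string {
  return `${value} [${tags.join(', ')}]`;
}
```

A UI component could then map over `TEST_BUTTONS` and render one button per entry instead of hand-writing five handlers.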

## Phase 2: Fix Processing Logic

### Problem Analysis
Current issue: The processing effect is too complex and uses refs incorrectly.

### Solution
**Simple rule**: In the processing state, check the messages array:
1. If the last message is NOT a user message containing our transcript → submit
2. If the last message IS a user message containing our transcript AND the second-to-last is an assistant message → play that assistant message
3. Otherwise → wait

**Implementation**:
```typescript
useEffect(() => {
  if (!state.hasTag('processing')) return;
  if (status !== 'ready') return;

  const transcript = state.context.transcript;
  if (!transcript) return;

  // Check last 2 messages
  const lastMsg = messages[messages.length - 1];
  const secondLastMsg = messages[messages.length - 2];

  // Case 1: Need to submit user message
  if (!lastMsg || lastMsg.role !== 'user' || getText(lastMsg) !== transcript) {
    submitUserInput();
    return;
  }

  // Case 2: User message submitted, check for AI response
  if (secondLastMsg && secondLastMsg.role === 'assistant') {
    const aiMsg = secondLastMsg;

    // Only play if we haven't played this exact message in this session
    if (state.context.lastSpokenMessageId !== aiMsg.id) {
      const text = getText(aiMsg);
      send({ type: 'AI_RESPONSE_READY', messageId: aiMsg.id, text });
      playAudio(text, aiMsg.id);
    }
  }
  // Otherwise, still waiting for AI response
}, [messages, state, status]);
```

No refs needed! Just check the messages array directly.
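
Because the rule depends only on the messages array, the transcript, and the last spoken message ID, it could also be pulled out into a pure function and unit-tested without React. The sketch below is one way to do that. `ChatMessage`, `ProcessingAction`, and `decideProcessingAction` are hypothetical names, and the flat `text` field stands in for whatever `getText` reads from the real message type. It mirrors the three rules exactly as written above and makes no claim about how the real chat hook orders messages.

```typescript
// Assumed minimal message shape. The real chat message type may differ.
interface ChatMessage {
  id: string;
  role: 'user' | 'assistant';
  text: string;
}

type ProcessingAction =
  | { kind: 'submit' }
  | { kind: 'play'; messageId: string; text: string }
  | { kind: 'wait' };

// Pure version of the processing rule: no refs, no side effects.
function decideProcessingAction(
  messages: ChatMessage[],
  transcript: string,
  lastSpokenMessageId: string | null,
): ProcessingAction {
  const lastMsg = messages[messages.length - 1];
  const secondLastMsg = messages[messages.length - 2];

  // Rule 1: the last message is not our transcript from the user, so submit it.
  if (!lastMsg || lastMsg.role !== 'user' || lastMsg.text !== transcript) {
    return { kind: 'submit' };
  }

  // Rule 2: the transcript is submitted and the second-to-last message is an
  // assistant message that has not been spoken yet, so play it.
  if (
    secondLastMsg &&
    secondLastMsg.role === 'assistant' &&
    secondLastMsg.id !== lastSpokenMessageId
  ) {
    return { kind: 'play', messageId: secondLastMsg.id, text: secondLastMsg.text };
  }

  // Rule 3: nothing to do yet.
  return { kind: 'wait' };
}
```

The effect would then only translate the returned action into `submitUserInput()`, `send(...)` plus `playAudio(...)`, or nothing.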

## Phase 3: Clean Audio Management

### Step 1: Simplify audio cancellation
- Keep shouldCancelAudioRef
- Call stopAllAudio() when leaving canSkipAudio states
- playAudio() checks cancel flag at each await
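
The cancellation pattern in these bullets could look roughly like the sketch below. It is an assumption-heavy illustration rather than the project's code: the `AudioDeps` interface, its `synthesize` and `play` functions, the `cancelAudio` helper, and the choice to reset the flag at the start of each playback are all hypothetical. The audio backends are injected so the example runs without a browser.

```typescript
// Shape of a React-style ref holding the cancel flag (shouldCancelAudioRef).
interface CancelRef {
  current: boolean;
}

// Hypothetical audio backends, injected so the sketch has no browser dependency.
interface AudioDeps {
  synthesize: (text: string) => Promise<string>; // resolves to an audio handle
  play: (audio: string) => Promise<void>;        // resolves when playback ends or is stopped
  stopAllAudio: () => void;
}

type PlayResult = 'finished' | 'cancelled';

async function playAudio(
  text: string,
  shouldCancelAudioRef: CancelRef,
  deps: AudioDeps,
): Promise<PlayResult> {
  // Assumption: every new playback starts with a cleared flag.
  shouldCancelAudioRef.current = false;

  const audio = await deps.synthesize(text);
  if (shouldCancelAudioRef.current) return 'cancelled'; // skipped while generating

  await deps.play(audio);
  if (shouldCancelAudioRef.current) return 'cancelled'; // skipped while speaking

  return 'finished';
}

// Intended to run when the machine leaves a state tagged canSkipAudio.
function cancelAudio(shouldCancelAudioRef: CancelRef, deps: AudioDeps): void {
  shouldCancelAudioRef.current = true;
  deps.stopAllAudio();
}
```

Checking the flag after every `await` is what lets a skip take effect in both the generating and the speaking phase.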

### Step 2: Effect cleanup
- Remove submittingTranscriptRef completely
- Remove the "reset ref when leaving processing" effect
- Rely only on messages array state

## Phase 4: Testing with Playwright

### Test Script
```typescript
test('Voice mode conversation flow', async (agent) => {
  await agent.open('http://localhost:3000/chat');

  // Login first
  await agent.act('Log in with Bluesky');

  // Start voice mode
  await agent.act('Click "Start Voice Conversation"');
  await agent.check('Button shows "Generating speech..." or "Listening..."');

  // Skip initial greeting if playing
  const skipVisible = await agent.check('Skip button is visible', { optional: true });
  if (skipVisible) {
    await agent.act('Click Skip button');
  }
  await agent.check('Button shows "Listening... Start speaking"');

  // Simulate user speech
  await agent.act('Click "Simulate User Speech" test button');
  await agent.check('Button shows "Speaking..."');

  await agent.act('Click "Simulate Silence" test button');
  await agent.check('Button shows "Processing..."');

  // Wait for AI response
  await agent.wait(5000);
  await agent.check('AI message appears in chat');
  await agent.check('Button shows "Generating speech..." or "AI is speaking..."');

  // Skip AI audio
  await agent.act('Click Skip button');
  await agent.check('Button shows "Listening... Start speaking"');

  // Second exchange
  await agent.act('Click "Simulate User Speech" test button');
  await agent.act('Click "Simulate Silence" test button');

  // Let AI audio play completely this time
  await agent.wait(10000);
  await agent.check('Button shows "Listening... Start speaking"');
});
```

## Phase 5: Validation

### Checklist
- [ ] State machine is serializable (can be visualized in Stately)
- [ ] No refs used in processing logic
- [ ] Latest message only plays once per session
- [ ] Skip works instantly in both aiGenerating and aiSpeaking
- [ ] Re-entering voice mode plays most recent AI message (if not already spoken)
- [ ] All test cases from PRD pass
- [ ] Playwright test passes

## Implementation Order

1. Add test buttons to UI (for manual testing)
2. Rewrite processing effect with simple messages array logic
3. Remove submittingTranscriptRef completely
4. Test manually with test buttons
5. Write Playwright test
6. Run and validate Playwright test
7. Clean up any remaining issues