Voice Mode Implementation Plan
Phase 1: Clean State Machine
Step 1: Rewrite state machine definition
- Remove all unnecessary complexity
- Clear state hierarchy
- Simple event handlers
- Proper tags on all states (see the machine sketch after this list)
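A minimal sketch of what this machine could look like, assuming XState v5. Only the `processing` and `canSkipAudio` tags, the `aiGenerating`/`aiSpeaking` states, and the events named later in this plan come from the plan itself; the remaining state names (`idle`, `listening`, `userSpeaking`) and the `AUDIO_STARTED`/`AUDIO_FINISHED` events are illustrative, not confirmed names from the codebase:

```typescript
import { assign, createMachine } from 'xstate';

export const voiceMachine = createMachine({
  id: 'voice',
  types: {} as {
    context: { transcript: string; lastSpokenMessageId: string | null };
    events:
      | { type: 'START_LISTENING' }
      | { type: 'USER_STARTED_SPEAKING' }
      | { type: 'SILENCE_TIMEOUT' }
      | { type: 'AI_RESPONSE_READY'; messageId: string; text: string }
      | { type: 'AUDIO_STARTED' }
      | { type: 'AUDIO_FINISHED' }
      | { type: 'SKIP_AUDIO' };
  },
  context: { transcript: '', lastSpokenMessageId: null },
  initial: 'idle',
  states: {
    idle: {
      tags: ['idle'],
      on: { START_LISTENING: 'listening' },
    },
    listening: {
      tags: ['listening'],
      on: { USER_STARTED_SPEAKING: 'userSpeaking' },
    },
    userSpeaking: {
      tags: ['listening'],
      // transcript is written to context by the speech-recognition layer (not shown here)
      on: { SILENCE_TIMEOUT: 'processing' },
    },
    processing: {
      tags: ['processing'],
      on: {
        AI_RESPONSE_READY: {
          target: 'aiGenerating',
          // Remember which message we are about to speak, so it only plays once
          actions: assign({
            lastSpokenMessageId: ({ event }) => event.messageId,
          }),
        },
      },
    },
    aiGenerating: {
      tags: ['canSkipAudio'],
      on: {
        AUDIO_STARTED: 'aiSpeaking',
        SKIP_AUDIO: 'listening',
      },
    },
    aiSpeaking: {
      tags: ['canSkipAudio'],
      on: {
        AUDIO_FINISHED: 'listening',
        SKIP_AUDIO: 'listening',
      },
    },
  },
});
```

Because the machine uses only plain states, tags, and `assign`, it stays serializable and can be pasted into Stately for visualization (see the Phase 5 checklist).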
Step 2: Add test buttons to UI
- Button: "Skip to Listening" - sends START_LISTENING
- Button: "Simulate User Speech" - sends USER_STARTED_SPEAKING
- Button: "Simulate Silence" - sends SILENCE_TIMEOUT
- Button: "Simulate AI Response" - sends AI_RESPONSE_READY with test data
- Button: "Skip Audio" - sends SKIP_AUDIO (already exists)
- Display: Current state value and tags (see the component sketch after this list)
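A minimal sketch of these test controls, assuming the machine above and that the parent component passes down the running actor's snapshot and `send` function; component and prop names are illustrative:

```tsx
import type { EventFrom, SnapshotFrom } from 'xstate';
import type { voiceMachine } from './voiceMachine';

type Props = {
  state: SnapshotFrom<typeof voiceMachine>;
  send: (event: EventFrom<typeof voiceMachine>) => void;
};

export function VoiceTestControls({ state, send }: Props) {
  return (
    <div>
      <button onClick={() => send({ type: 'START_LISTENING' })}>Skip to Listening</button>
      <button onClick={() => send({ type: 'USER_STARTED_SPEAKING' })}>Simulate User Speech</button>
      <button onClick={() => send({ type: 'SILENCE_TIMEOUT' })}>Simulate Silence</button>
      <button
        onClick={() =>
          send({ type: 'AI_RESPONSE_READY', messageId: 'test-message', text: 'This is a test response.' })
        }
      >
        Simulate AI Response
      </button>
      <button onClick={() => send({ type: 'SKIP_AUDIO' })}>Skip Audio</button>

      {/* Current state value and tags, for debugging */}
      <pre>{JSON.stringify({ value: state.value, tags: [...state.tags] }, null, 2)}</pre>
    </div>
  );
}
```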
Phase 2: Fix Processing Logic
Problem Analysis
Current issue: The processing effect is too complex and uses refs incorrectly.
Solution
Simple rule: while in the processing state, check the messages array:
- If the last message is NOT a user message with our transcript → submit
- If the last message IS a user message with our transcript AND the second-to-last is an assistant message → play that assistant message
- Otherwise → wait
Implementation:
```typescript
useEffect(() => {
  if (!state.hasTag('processing')) return;
  if (status !== 'ready') return;

  const transcript = state.context.transcript;
  if (!transcript) return;

  // Check the last two messages
  const lastMsg = messages[messages.length - 1];
  const secondLastMsg = messages[messages.length - 2];

  // Case 1: Need to submit the user message
  if (!lastMsg || lastMsg.role !== 'user' || getText(lastMsg) !== transcript) {
    submitUserInput();
    return;
  }

  // Case 2: User message submitted, check for an AI response
  if (secondLastMsg && secondLastMsg.role === 'assistant') {
    const aiMsg = secondLastMsg;
    // Only play if we haven't played this exact message in this session
    if (state.context.lastSpokenMessageId !== aiMsg.id) {
      const text = getText(aiMsg);
      send({ type: 'AI_RESPONSE_READY', messageId: aiMsg.id, text });
      playAudio(text, aiMsg.id);
    }
  }

  // Otherwise, still waiting for the AI response
}, [messages, state, status]);
```
No refs needed! Just check the messages array directly.
Phase 3: Clean Audio Management
Step 1: Simplify audio cancellation
- Keep shouldCancelAudioRef
- Call stopAllAudio() when leaving canSkipAudio states
- playAudio() checks the cancel flag at each await (see the hook sketch after this list)
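A minimal sketch of this cancellation pattern as a hook fragment; `synthesizeSpeech`, `useVoiceAudio`, and the `AUDIO_FINISHED` event are assumptions standing in for whatever the app actually uses, not confirmed names:

```typescript
import { useEffect, useRef } from 'react';

// Stand-in for the app's actual TTS call (assumption).
declare function synthesizeSpeech(text: string): Promise<string>;

type VoiceSnapshot = { hasTag: (tag: string) => boolean };
type SendFn = (event: { type: 'AUDIO_FINISHED' }) => void;

export function useVoiceAudio(state: VoiceSnapshot, send: SendFn) {
  const shouldCancelAudioRef = useRef(false);
  const currentAudioRef = useRef<HTMLAudioElement | null>(null);

  function stopAllAudio() {
    shouldCancelAudioRef.current = true;
    currentAudioRef.current?.pause();
    currentAudioRef.current = null;
  }

  // messageId kept to match the Phase 2 call signature playAudio(text, aiMsg.id)
  async function playAudio(text: string, messageId: string) {
    shouldCancelAudioRef.current = false;

    // Await #1: generate or fetch the speech audio, then re-check the cancel flag.
    const audioUrl = await synthesizeSpeech(text);
    if (shouldCancelAudioRef.current) return;

    // Await #2: play it to completion, then re-check the cancel flag.
    const audio = new Audio(audioUrl);
    currentAudioRef.current = audio;
    await new Promise<void>((resolve) => {
      audio.onended = () => resolve();
      audio.onerror = () => resolve();
      void audio.play();
    });
    if (shouldCancelAudioRef.current) return;

    // Tell the machine playback finished (AUDIO_FINISHED is an assumed event name).
    send({ type: 'AUDIO_FINISHED' });
  }

  // Stop playback the moment the machine leaves any canSkipAudio state.
  useEffect(() => {
    if (!state.hasTag('canSkipAudio')) {
      stopAllAudio();
    }
  }, [state]);

  return { playAudio, stopAllAudio };
}
```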
Step 2: Effect cleanup
- Remove submittingTranscriptRef completely
- Remove the "reset ref when leaving processing" effect
- Rely only on messages array state
Phase 4: Testing with Playwright
Test Script
```typescript
test('Voice mode conversation flow', async (agent) => {
  await agent.open('http://localhost:3000/chat');

  // Login first
  await agent.act('Log in with Bluesky');

  // Start voice mode
  await agent.act('Click "Start Voice Conversation"');
  await agent.check('Button shows "Generating speech..." or "Listening..."');

  // Skip the initial greeting if it is playing
  const skipVisible = await agent.check('Skip button is visible', { optional: true });
  if (skipVisible) {
    await agent.act('Click Skip button');
  }
  await agent.check('Button shows "Listening... Start speaking"');

  // Simulate user speech
  await agent.act('Click "Simulate User Speech" test button');
  await agent.check('Button shows "Speaking..."');
  await agent.act('Click "Simulate Silence" test button');
  await agent.check('Button shows "Processing..."');

  // Wait for the AI response
  await agent.wait(5000);
  await agent.check('AI message appears in chat');
  await agent.check('Button shows "Generating speech..." or "AI is speaking..."');

  // Skip the AI audio
  await agent.act('Click Skip button');
  await agent.check('Button shows "Listening... Start speaking"');

  // Second exchange
  await agent.act('Click "Simulate User Speech" test button');
  await agent.act('Click "Simulate Silence" test button');

  // Let the AI audio play completely this time
  await agent.wait(10000);
  await agent.check('Button shows "Listening... Start speaking"');
});
```
Phase 5: Validation
Checklist
- State machine is serializable (can be visualized in Stately)
- No refs used in processing logic
- Latest message only plays once per session
- Skip works instantly in both aiGenerating and aiSpeaking
- Re-entering voice mode plays most recent AI message (if not already spoken)
- All test cases from PRD pass
- Playwright test passes
Implementation Order
1. Add test buttons to the UI (for manual testing)
2. Rewrite the processing effect with the simple messages-array logic
3. Remove submittingTranscriptRef completely
4. Test manually with the test buttons
5. Write the Playwright test
6. Run and validate the Playwright test
7. Clean up any remaining issues