app/plans/09-umap-minimum-nodes-analysis.md
Albert b96159ec02 docs: Add comprehensive implementation plans for all todo items
Created detailed markdown plans for all items in todo.md:

1. 01-playwright-scaffolding.md - Base Playwright infrastructure
2. 02-magnitude-tests-comprehensive.md - Complete test coverage
3. 03-stream-ai-to-deepgram-tts.md - TTS latency optimization
4. 04-fix-galaxy-node-clicking.md - Galaxy navigation bugs
5. 05-dark-light-mode-theme.md - Dark/light mode with dynamic favicons
6. 06-fix-double-border-desktop.md - UI polish
7. 07-delete-backup-files.md - Code cleanup
8. 08-ai-transition-to-edit.md - Intelligent node creation flow
9. 09-umap-minimum-nodes-analysis.md - Technical analysis

Each plan includes:
- Detailed problem analysis
- Proposed solutions with code examples
- Manual Playwright MCP testing strategy
- Magnitude test specifications
- Implementation steps
- Success criteria

Ready to implement in sequence.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 21:07:42 +00:00


Plan: Analysis - Why Wait for Three Nodes Before UMAP?

Priority: LOW - Analysis and potential optimization
Dependencies: None
Affects: User experience for early galaxy usage

Question

Why do we wait until the user has created 3 nodes before running UMAP to calculate 3D coordinates? Is this an arbitrary choice or is there a technical reason?

Current Implementation

// app/api/nodes/route.ts (lines 277-305)
const totalNodes = countResult[0]?.[0]?.total || 0;

if (totalNodes >= 3) {
  // Trigger UMAP calculation
  fetch(`${process.env.NEXT_PUBLIC_APP_URL}/api/calculate-graph`, {
    method: 'POST',
    headers: { 'Cookie': cookieHeader },
  });
}
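
The request above is fire-and-forget: nothing awaits it or handles a rejection. A minimal sketch of the same trigger, with the threshold pulled into a named constant and a rejection handler attached, might look like this. `UMAP_MINIMUM_NODES`, `triggerGraphCalculation`, `FetchLike`, and the injected `fetchImpl` parameter are hypothetical names introduced for illustration; they are not taken from the existing route.

```typescript
// Hypothetical sketch; these names are illustrative and not from app/api/nodes/route.ts.
const UMAP_MINIMUM_NODES = 3;

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<unknown>;

// Returns true when a calculation request was started, false when the
// node count is still below the threshold.
function triggerGraphCalculation(
  totalNodes: number,
  baseUrl: string,
  cookieHeader: string,
  fetchImpl: FetchLike,
): boolean {
  if (totalNodes < UMAP_MINIMUM_NODES) {
    return false;
  }

  // Still not awaited, so node creation is not blocked on UMAP, but a
  // failed request is logged instead of becoming an unhandled rejection.
  fetchImpl(`${baseUrl}/api/calculate-graph`, {
    method: 'POST',
    headers: { Cookie: cookieHeader },
  }).catch((error: unknown) => {
    console.error('calculate-graph trigger failed', error);
  });

  return true;
}
```

Injecting the fetch function is a design choice made here only so the threshold logic can be exercised without a network; in the route itself the global `fetch` could be passed in.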

UMAP Technical Requirements

Minimum Data Points

UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction algorithm. Let's investigate its minimum requirements:

  1. Mathematical minimum: UMAP needs at least nNeighbors + 1 data points

    • Our config: nNeighbors: Math.min(15, nodes.length - 1)
    • So technically we need a minimum of 2 points: with 2 nodes, nNeighbors clamps to 1, and 1 + 1 = 2
  2. Practical minimum: With 1-2 points, the projection is trivial:

    • 1 point: Just sits at origin [0, 0, 0]
    • 2 points: Linear projection on single axis
    • 3+ points: First non-collinear layout (3 points always lie in a plane; 4+ are needed to fill all three dimensions)
  3. Meaningful visualization: For interesting galaxy visualization:

    • 1 node: Just a single sphere (boring)
    • 2 nodes: Two spheres in a line (boring)
    • 3 nodes: Triangle configuration (starting to be interesting)
    • 4+ nodes: Complex 3D structure (compelling)
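
To make the arithmetic behind the clamp concrete, the sketch below evaluates the `Math.min(15, nodes.length - 1)` expression from the config for a few node counts. The helper name `clampNeighbors` is hypothetical; only the expression itself comes from the config quoted above.

```typescript
// Hypothetical helper wrapping the nNeighbors expression from the config.
function clampNeighbors(nodeCount: number): number {
  return Math.min(15, nodeCount - 1);
}

// Print the resulting nNeighbors for a range of node counts.
for (const n of [1, 2, 3, 4, 16, 100]) {
  console.log(`${n} node(s) -> nNeighbors = ${clampNeighbors(n)}`);
}
```

With 1 node the clamp yields 0 neighbors, 2 nodes is the first count that yields nNeighbors = 1, and the cap of 15 only takes effect from 16 nodes onward.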

Options to Consider

Option 1: Keep 3-Node Minimum (Current)

Pros:

  • Ensures meaningful visualization
  • UMAP produces better results with more data
  • Avoids wasted computation on trivial cases
  • User has enough content to make galaxy worth exploring

Cons:

  • User can't see galaxy until 3rd node
  • Feels like arbitrary limitation

Option 2: Allow 1+ nodes (Calculate always)

Pros:

  • Galaxy available immediately
  • No waiting for 3rd node
  • Simpler logic

Cons:

  • 1-2 nodes produce boring visualization (single point, line)
  • Wasted UMAP computation on trivial cases
  • Poor user experience showing "empty" galaxy

Option 3: Fallback Layout for 1-2 Nodes

Pros:

  • Galaxy available immediately
  • 1-2 nodes get simple predetermined positions
  • UMAP kicks in at 3+ for interesting layout
  • Best of both worlds

Cons:

  • More complex implementation
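  • Potential confusion when layout suddenly changes at 3rd node

Implementation:

function calculateNodePositions(nodes: NodeData[]): NodeData[] {
  if (nodes.length === 0) {
    // Nothing to position; avoid calling UMAP with no data
    return [];
  }

  if (nodes.length === 1) {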
    // Single node at origin
    return [{
      ...nodes[0],
      coords_3d: [0, 0, 0],
    }];
  }

  if (nodes.length === 2) {
    // Two nodes on X axis
    return [
      { ...nodes[0], coords_3d: [-1, 0, 0] },
      { ...nodes[1], coords_3d: [1, 0, 0] },
    ];
  }

  // 3+ nodes: Use UMAP
  return runUMAP(nodes);
}
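
The last con listed for this option, the layout jumping when the third node arrives, could be softened on the client by interpolating each node from its old position to its new one. The following is a minimal sketch under that assumption; `Vec3`, `lerpPosition`, and the idea of animating the transition are illustrative additions, not something the current galaxy code is known to do.

```typescript
type Vec3 = [number, number, number];

// Linear interpolation between two positions; t is clamped to [0, 1].
function lerpPosition(from: Vec3, to: Vec3, t: number): Vec3 {
  const k = Math.min(1, Math.max(0, t));
  return [
    from[0] + (to[0] - from[0]) * k,
    from[1] + (to[1] - from[1]) * k,
    from[2] + (to[2] - from[2]) * k,
  ];
}

// Example: a node leaving its fallback slot for a made-up UMAP position.
const fallbackPosition: Vec3 = [-1, 0, 0];
const umapPosition: Vec3 = [3, 2, -4]; // invented coordinates, for illustration only
const halfway = lerpPosition(fallbackPosition, umapPosition, 0.5);
console.log(halfway); // [1, 1, -2]
```

A render loop would call `lerpPosition` each frame with `t` advancing from 0 to 1, so nodes drift into their UMAP positions instead of snapping.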

Option 4: Show Empty State with Onboarding

Pros:

  • Clear communication about galaxy feature
  • Educational for new users
  • No computation wasted
  • Encourages node creation

Cons:

  • More UI work
  • Doesn't solve the "when to calculate" question

Implementation:

// app/galaxy/page.tsx
if (nodes.length === 0) {
  return <EmptyState message="Create your first node to start your galaxy!" />;
}

if (nodes.length < 3) {
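  // Pluralize correctly: "2 more nodes" with one node, "1 more node" with two
  return <PartialState message={`Create ${3 - nodes.length} more node${nodes.length === 2 ? '' : 's'} to see your galaxy visualization!`} />;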
}
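
The wording for each node count could also be centralized in a small pure helper, which keeps the page component thin and makes the singular/plural case easy to test. `galaxyUnlockMessage` and its `threshold` parameter are hypothetical names for this sketch.

```typescript
// Hypothetical helper; the default threshold mirrors the 3-node minimum.
function galaxyUnlockMessage(nodeCount: number, threshold = 3): string | null {
  if (nodeCount >= threshold) {
    return null; // Galaxy is available, no onboarding message needed
  }
  if (nodeCount === 0) {
    return 'Create your first node to start your galaxy!';
  }
  const remaining = threshold - nodeCount;
  const noun = remaining === 1 ? 'node' : 'nodes';
  return `Create ${remaining} more ${noun} to see your galaxy visualization!`;
}
```

The page would render the empty or partial state whenever the helper returns a string and the galaxy canvas when it returns `null`.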

Recommendation

Keep the 3-node minimum for the following reasons:

  1. User Experience

    • 1-2 nodes produce boring visualizations that don't showcase the galaxy feature
    • Better to show compelling visualization from the start
    • Empty state can explain "create 3 nodes to unlock galaxy"
  2. Technical Quality

    • UMAP produces better results with more data points
    • 3 points is the minimum for a non-collinear layout; 1-2 points can never spread beyond a single axis
    • Avoids wasted computation on trivial cases
  3. Product Story

    • Forces users to create meaningful content before "unlocking" visualization
    • Makes galaxy feel like a reward for engagement
    • Aligns with the product vision of "network of thoughts"

Potential Enhancements

1. Better Onboarding

// Show progress toward galaxy unlock
<Progress value={(nodes.length / 3) * 100} label={`${nodes.length}/3 nodes created`} />
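
The inline expression `(nodes.length / 3) * 100` goes past 100 once more than three nodes exist. If the progress indicator could ever render in that state, a clamped helper is a safer option. This is a sketch only, and `unlockProgress` is a hypothetical name.

```typescript
// Hypothetical helper: percentage progress toward the unlock threshold, capped at 100.
function unlockProgress(nodeCount: number, threshold = 3): number {
  if (threshold <= 0) {
    return 100; // Nothing to unlock
  }
  const ratio = Math.max(0, nodeCount) / threshold;
  return Math.min(100, Math.round(ratio * 100));
}
```

The result can be passed straight to the `value` prop, for example `value={unlockProgress(nodes.length)}`.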

2. Preview Mode

// Show static preview of galaxy with 1-2 nodes
<GalaxyPreview nodes={nodes} message={`Create ${3 - nodes.length} more node${nodes.length === 2 ? '' : 's'} to unlock 3D visualization!`} />

3. Configurable Threshold

// Allow power users to adjust in settings
const UMAP_MINIMUM_NODES = userSettings.umapMinimum || 3;
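
A user-supplied threshold would need validation, since a missing, non-integer, or too-small value should not silently disable the guard. A minimal sketch, assuming the setting arrives as an optional number; `resolveUmapMinimum`, `DEFAULT_UMAP_MINIMUM`, and the lower bound of 2 are hypothetical choices rather than existing settings.

```typescript
const DEFAULT_UMAP_MINIMUM = 3;
const LOWEST_ALLOWED_MINIMUM = 2; // assumption: UMAP is never run on a single node

// Falls back to the default for missing or non-integer values and
// raises anything below the lowest allowed minimum.
function resolveUmapMinimum(setting: number | null | undefined): number {
  if (setting == null || !Number.isInteger(setting)) {
    return DEFAULT_UMAP_MINIMUM;
  }
  return Math.max(LOWEST_ALLOWED_MINIMUM, setting);
}
```

The lower bound of 2 follows from the nNeighbors clamp discussed earlier, which yields 0 neighbors for a single node.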

Implementation (If Changing)

If we decide to implement Option 3 (fallback layout):

  1. Update calculate-graph logic

    if (nodes.length < 3) {
      return simpleLayout(nodes);
    }
    return umapLayout(nodes);
    
  2. Add simple layout function

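    function simpleLayout(nodes: NodeData[]): NodeData[] {
      // Predetermined positions for 1-2 nodes: origin, or -1/+1 on the X axis
      const xs = nodes.length === 1 ? [0] : [-1, 1];
      return nodes.map((node, i) => ({ ...node, coords_3d: [xs[i], 0, 0] }));
    }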
    
  3. Update API response

    return NextResponse.json({
      nodes_mapped: nodes.length,
      layout_method: nodes.length < 3 ? 'simple' : 'umap',
    });
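
Taken together, the three steps could look like the following self-contained sketch. The `NodeData` shape, the `layoutNodes` name, and passing the UMAP routine in as a parameter are assumptions made so the example runs on its own; the real `calculate-graph` route may be structured differently.

```typescript
type Coords3D = [number, number, number];

// Assumed minimal node shape; the real NodeData likely has more fields.
interface NodeData {
  id: string;
  coords_3d?: Coords3D;
}

interface LayoutResult {
  nodes: NodeData[];
  layout_method: 'simple' | 'umap';
}

// Predetermined positions for 1-2 nodes: the origin, or -1/+1 on the X axis.
function simpleLayout(nodes: NodeData[]): NodeData[] {
  return nodes.map((node, i): NodeData => {
    const x = nodes.length === 1 ? 0 : i === 0 ? -1 : 1;
    return { ...node, coords_3d: [x, 0, 0] };
  });
}

// Uses the fallback layout below 3 nodes and delegates to UMAP otherwise.
function layoutNodes(
  nodes: NodeData[],
  umapLayout: (nodes: NodeData[]) => NodeData[],
): LayoutResult {
  if (nodes.length < 3) {
    return { nodes: simpleLayout(nodes), layout_method: 'simple' };
  }
  return { nodes: umapLayout(nodes), layout_method: 'umap' };
}
```

Returning `layout_method` alongside the nodes lets the API response in step 3 report which path was taken without recomputing the condition.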
    

Testing

If implementing changes:

Playwright MCP Test

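import { test, expect } from '@playwright/test';

// createNode and getNodeCount are assumed project-specific test helpers (not defined in this plan)
test('Galaxy works with 1-2 nodes', async ({ page }) => {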
  // Create 1 node
  await createNode(page, 'First Node');
  await page.goto('/galaxy');
  await expect(page.locator('canvas')).toBeVisible();

  // Create 2nd node
  await createNode(page, 'Second Node');
  await page.goto('/galaxy');
  await expect(page.locator('canvas')).toBeVisible();

  // Should see 2 nodes
  const nodeCount = await getNodeCount(page);
  expect(nodeCount).toBe(2);
});

Magnitude Test

test('User understands galaxy requirement', async (agent) => {
  await agent.open('http://localhost:3000/galaxy');

  // With 0 nodes
  await agent.check('Sees message about creating nodes');
  await agent.check('Message says "3 nodes" or similar');

  // After creating 1 node
  await agent.act('Create first node');
  await agent.open('http://localhost:3000/galaxy');
  await agent.check('Sees progress toward 3 nodes');
});

Success Criteria

  • Clear documentation of why 3-node minimum exists
  • User understands requirement through UI/messaging
  • If changed: Works correctly with 1-2 nodes
  • If changed: Smooth transition to UMAP at 3+ nodes
  • Tests pass

Files to Update (if implementing changes)

  1. app/api/calculate-graph/route.ts - Add fallback layout logic
  2. app/galaxy/page.tsx - Add better onboarding messaging
  3. docs/architecture.md - Document decision (create if needed)

Files to Create

  1. docs/decisions/umap-minimum-nodes.md - Document the decision
  2. tests/playwright/galaxy-onboarding.spec.ts - If implementing changes
  3. tests/magnitude/galaxy-onboarding.mag.ts - If implementing changes