feat: Fix grapheme splitting and add automatic UMAP calculation

Critical fixes for core functionality:

1. Fixed grapheme-aware text splitting (app/api/nodes/route.ts)
   - Changed character-based substring to grapheme-ratio calculation
   - Now properly handles emojis and multi-byte characters
   - Prevents posts from exceeding 300 grapheme Bluesky limit
   - Added comprehensive logging for debugging

2. Automatic UMAP coordinate calculation (app/api/nodes/route.ts)
   - Triggers /api/calculate-graph automatically after node creation
   - Only when user has 3+ nodes with embeddings (UMAP minimum)
   - Non-blocking background process
   - Eliminates need for manual "Calculate Graph" button
   - Galaxy visualization ready on first visit

3. Simplified galaxy route (app/api/galaxy/route.ts)
   - Removed auto-trigger logic (now handled on insertion)
   - Simply returns existing coordinates
   - More efficient, no redundant calculations

4. Added idempotency (app/api/calculate-graph/route.ts)
   - Safe to call multiple times
   - Returns early if all nodes already have coordinates
   - Better logging for debugging
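
As an illustration of the grapheme-aware splitting in item 1, here is a minimal sketch. It is not the code in app/api/nodes/route.ts: the commit describes a grapheme-ratio calculation, while this sketch segments the text directly with the standard `Intl.Segmenter` API. The helper names `toGraphemes` and `splitByGraphemes` are hypothetical; only the 300-grapheme limit comes from the commit message.

```typescript
// Hypothetical helpers for illustration; not taken from the repository.
const BLUESKY_GRAPHEME_LIMIT = 300; // limit stated in the commit message

/** Split text into user-perceived characters (grapheme clusters). */
function toGraphemes(text: string): string[] {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

/** Split text into chunks of at most `limit` graphemes each. */
function splitByGraphemes(
  text: string,
  limit: number = BLUESKY_GRAPHEME_LIMIT
): string[] {
  const graphemes = toGraphemes(text);
  const chunks: string[] = [];
  for (let i = 0; i < graphemes.length; i += limit) {
    // Joining whole graphemes never cuts an emoji or combined character in half.
    chunks.push(graphemes.slice(i, i + limit).join(""));
  }
  return chunks;
}
```

Counting by grapheme rather than by `string.length` matters because a single emoji with a skin-tone modifier occupies several UTF-16 code units but counts as one grapheme.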

Implementation plans documented in /plans directory.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 20:19:20 +00:00
parent 6bd0fe65e2
commit d8a975122f
6 changed files with 346 additions and 39 deletions


@@ -0,0 +1,90 @@
# Plan: Fix Coords Computation (Core Functionality)
**Priority:** CRITICAL - This is core functionality of the app
## Current Architecture (Broken)
1. Nodes created with `coords_3d = NONE`
2. User visits `/galaxy`
3. Galaxy route checks if unmapped nodes exist
4. If yes, triggers `/api/calculate-graph` in background
5. Coordinates may not be ready on first visit
6. UMAP runs every time someone visits with unmapped nodes
### Problems
- **Inefficient**: Multiple users trigger same calculation
- **Poor UX**: Galaxy empty on first visit, needs refresh
- **Wasteful**: UMAP recalculation triggered unnecessarily
## Proposed Architecture (Correct)
**Trigger UMAP automatically on node insertion**
### Implementation
```typescript
// In POST /api/nodes, after creating the node in SurrealDB:

// 1. Count this user's nodes that have embeddings.
//    GROUP ALL collapses the result into a single row; without it,
//    SurrealDB returns one row per node, each with total = 1.
const countResult = await db.query(
  'SELECT count() AS total FROM node WHERE user_did = $did AND embedding != NONE GROUP ALL',
  { did: userDid }
);
const totalNodes = countResult[0]?.[0]?.total || 0;

// 2. If we now have 3+ nodes, trigger coordinate calculation
if (totalNodes >= 3) {
  // Don't await - let it run in the background
  fetch(`${process.env.NEXT_PUBLIC_APP_URL}/api/calculate-graph`, {
    method: 'POST',
    headers: {
      'Cookie': `ponderants-auth=${surrealJwt}`,
    },
  }).catch(err => {
    console.error('[POST /api/nodes] Background coord calculation failed:', err);
  });
}
```
### Why 3 nodes minimum?
- UMAP needs at least 3 data points to produce a meaningful projection
- With fewer than 3 nodes, `coords_3d` stays `NONE` (galaxy shows a "create more nodes" message)
## Implementation Steps
1. **Add node count check** after successful SurrealDB insert
2. **Trigger `/api/calculate-graph`** in background when threshold reached
3. **Remove auto-trigger logic** from `/api/galaxy` route
4. **Update `/api/calculate-graph`** to be idempotent (safe to call multiple times)
5. **Add rate limiting** to prevent spam calculations
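
The idempotency check in step 4 could be sketched as a pure decision function that runs before any UMAP work. This is a sketch under assumptions, not the code in `app/api/calculate-graph/route.ts`: the `NodeRow` shape, `decideCalculation`, and the use of `null` to stand in for SurrealDB's `NONE` are all hypothetical.

```typescript
// Hypothetical row shape; the real schema may differ.
interface NodeRow {
  id: string;
  embedding: number[] | null; // null stands in for SurrealDB NONE
  coords_3d: [number, number, number] | null;
}

const UMAP_MIN_NODES = 3; // threshold from this plan

type Decision =
  | { run: false; reason: string }
  | { run: true; nodeIds: string[] };

/** Decide whether a calculate-graph call has any work to do. */
function decideCalculation(nodes: NodeRow[]): Decision {
  const embedded = nodes.filter((n) => n.embedding !== null);
  if (embedded.length < UMAP_MIN_NODES) {
    return { run: false, reason: "fewer than 3 nodes with embeddings" };
  }
  if (embedded.every((n) => n.coords_3d !== null)) {
    // Repeat calls return early, which is what makes the route idempotent.
    return { run: false, reason: "all nodes already have coordinates" };
  }
  // One unmapped node is enough to re-project every embedded node.
  return { run: true, nodeIds: embedded.map((n) => n.id) };
}
```

Keeping the decision separate from the database and UMAP calls makes the early-return behaviour easy to unit test.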
## Edge Cases to Handle
### Concurrent inserts
**Problem**: Two users create nodes simultaneously
**Solution**: `/api/calculate-graph` checks count again before running UMAP
### Calculation in progress
**Problem**: Second node created while UMAP running
**Solution**: Add a lock/flag in DB to prevent concurrent UMAP runs
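
A minimal sketch of the locking idea, with one important swap: it uses an in-memory `Set` instead of the DB flag proposed above, so it only guards a single server process. A deployment with several instances would still need the DB-backed flag. The names `tryAcquire`, `release`, and `withUserLock` are hypothetical.

```typescript
// Hypothetical in-memory lock, keyed by user DID.
// Only valid within one process; not a replacement for a DB-backed flag.
const runningUsers = new Set<string>();

/** Returns true if the lock was taken, false if a run is already in progress. */
function tryAcquire(userDid: string): boolean {
  if (runningUsers.has(userDid)) return false;
  runningUsers.add(userDid);
  return true;
}

function release(userDid: string): void {
  runningUsers.delete(userDid);
}

/** Run `task` unless another run for the same user is still active. */
async function withUserLock<T>(
  userDid: string,
  task: () => Promise<T>
): Promise<{ ran: true; value: T } | { ran: false }> {
  if (!tryAcquire(userDid)) return { ran: false };
  try {
    return { ran: true, value: await task() };
  } finally {
    // Always release, even if the task throws.
    release(userDid);
  }
}
```

A caller that gets `{ ran: false }` can return early. Note that a node created during a run is not picked up until some later call triggers another calculation, so a real implementation may want to record that a re-run is pending.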
### Calculation failure
**Problem**: Network error, UMAP crashes
**Solution**: Retry logic with exponential backoff
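
Retry with exponential backoff could look like the sketch below. The helper names, the 500 ms base delay, the 8 s cap, and the 4-attempt default are illustrative assumptions, not values from this plan.

```typescript
/** Delay before retry number `attempt` (0-based): base * 2^attempt, capped. */
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 8000
): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/** Run `task`, retrying on failure with growing delays between attempts. */
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxAttempts: number = 4,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // No sleep after the final attempt; just surface the error.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

In the background trigger, the `fetch` to `/api/calculate-graph` would be wrapped as the `task`. Because `fetch` resolves on HTTP error statuses, the task would need to throw on a non-OK response for those to count as failures.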
## Files to Modify
- `app/api/nodes/route.ts` - Add trigger logic after node creation
- `app/api/galaxy/route.ts` - Remove auto-trigger, keep simple fetch
- `app/api/calculate-graph/route.ts` - Add idempotency check, locking mechanism
## Testing Requirements
1. Create 1st node; verify `coords_3d = NONE`
2. Create 2nd node; verify `coords_3d = NONE`
3. Create 3rd node; verify `/api/calculate-graph` is triggered
4. Wait for the calculation; verify all 3 nodes have `coords_3d != NONE`
5. Visit galaxy; verify all nodes are visible immediately
6. Create 4th node; verify UMAP recalculates all 4 nodes