fix: eliminate reconnect duplicate logs with KMP dedup (FAR-105) #4

Closed
farhoodliquor-paperclip[bot] wants to merge 1 commit from fix/far-105-dedup-kmp-v2 into master
farhoodliquor-paperclip[bot] commented 2026-04-21 20:14:25 +00:00 (Migrated from github.com)

Summary

  • Root cause: on every log stream reconnect, sinceSeconds makes K8s re-stream an overlap window of content that was already sent. The previous code forwarded that window to onLog verbatim, which produced repeated lines in the UI (e.g. 42× "Let me watch the new pod and see if it works.").
  • Fix: reconnect attempts now buffer incoming chunks instead of emitting them immediately. After the stream ends, findNewLogContent() strips the prefix of the buffered data that overlaps with the tail of the already-sent content (computed with the KMP failure function in O(N) time), and only the genuinely new suffix is forwarded to onLog.
  • First-attempt streaming is unchanged (still real-time, no buffering).
  • accumulated replaces allChunks as the single source of truth for both dedup and the final parsed output, so the parsed JSON is also free of duplicates.
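
The control flow described in the bullets above could look roughly like the sketch below. This is not the code from this PR: createLogForwarder, onFirstAttemptChunk, onReconnectEnd and naiveStripOverlap are hypothetical names invented for illustration, and the overlap stripping is injected as a parameter so the sketch does not assume the real signature of findNewLogContent(). The naive stripper shown here is quadratic and stands in for the KMP-based one only to keep the example short.

```typescript
// Hypothetical sketch of "stream the first attempt, buffer reconnects".
// None of these names are taken from the PR's actual source.
type LogSink = (chunk: string) => void;
type StripOverlap = (existing: string, incoming: string) => string;

function createLogForwarder(onLog: LogSink, stripOverlap: StripOverlap) {
  // Single source of truth: everything that has been emitted so far.
  let accumulated = "";

  return {
    // First attempt: forward each chunk immediately (real-time, no buffering).
    onFirstAttemptChunk(chunk: string): void {
      accumulated += chunk;
      onLog(chunk);
    },

    // Reconnect attempt: called once the stream has ended, with every chunk
    // that was buffered during that attempt. Only the new suffix is emitted.
    onReconnectEnd(bufferedChunks: string[]): void {
      const fresh = stripOverlap(accumulated, bufferedChunks.join(""));
      if (fresh.length > 0) {
        accumulated += fresh;
        onLog(fresh);
      }
    },

    // Deduplicated content, usable for the final parsed output.
    getAccumulated(): string {
      return accumulated;
    },
  };
}

// Naive stand-in for the overlap stripper (quadratic, illustration only):
// drop the longest prefix of `incoming` that is also a suffix of `existing`.
const naiveStripOverlap: StripOverlap = (existing, incoming) => {
  for (let n = Math.min(existing.length, incoming.length); n > 0; n--) {
    if (existing.endsWith(incoming.slice(0, n))) return incoming.slice(n);
  }
  return incoming;
};
```

The trade-off in this shape is that reconnect output is delayed until the reconnect stream ends, in exchange for never emitting the overlap window.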

How it works

existingContent: "...ABCDE"   (tail of what was already emitted)
reconnectContent: "BCDE_NEW"  (sinceSeconds overlap + genuinely new)

KMP finds overlap = "BCDE" (4 bytes) → emits only "_NEW"
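
For readers unfamiliar with the technique, here is a minimal sketch of how the overlap in the example above can be found with the KMP failure function. It is an illustration under assumptions, not the PR's implementation: the name findNewLogContentSketch and its two-string signature are hypothetical, and it compares JavaScript string code units rather than raw bytes.

```typescript
// Hypothetical sketch; the real findNewLogContent() in this PR may differ.

// KMP failure function: fail[i] is the length of the longest proper prefix
// of s[0..i] that is also a suffix of s[0..i].
function failureFunction(s: string): number[] {
  const fail = new Array<number>(s.length).fill(0);
  let k = 0;
  for (let i = 1; i < s.length; i++) {
    while (k > 0 && s[i] !== s[k]) k = fail[k - 1];
    if (s[i] === s[k]) k++;
    fail[i] = k;
  }
  return fail;
}

// Returns the part of reconnectContent that is not already at the tail of
// existingContent, i.e. reconnectContent minus the longest prefix that is
// also a suffix of existingContent.
function findNewLogContentSketch(
  existingContent: string,
  reconnectContent: string,
): string {
  if (existingContent.length === 0 || reconnectContent.length === 0) {
    return reconnectContent;
  }
  // Only the last reconnectContent.length characters can take part in the overlap.
  const tail = existingContent.slice(-reconnectContent.length);
  const fail = failureFunction(reconnectContent);

  // Scan the tail with reconnectContent as the pattern. After the last
  // character, k is the length of the longest pattern prefix that matches
  // a suffix of the tail.
  let k = 0;
  for (let i = 0; i < tail.length; i++) {
    while (k > 0 && tail[i] !== reconnectContent[k]) k = fail[k - 1];
    if (tail[i] === reconnectContent[k]) k++;
  }
  return reconnectContent.slice(k);
}

// The example from this section:
console.log(findNewLogContentSketch("...ABCDE", "BCDE_NEW")); // prints "_NEW"
```

Both passes are linear, so the total work is proportional to the length of the reconnect buffer. One caveat of any overlap-based approach: if genuinely new output happens to begin with text identical to the end of what was already emitted, that text is treated as overlap and dropped.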

Test plan

  • [x] npm run typecheck: clean
  • [x] npm test: 200/200 passed
  • [ ] Observe a run that previously showed repeated lines; confirm they no longer appear

🤖 Generated with Claude Code


Pull request closed
