
Release v1.53.0 #226

Open
atomantic wants to merge 52 commits into release from main

@atomantic
Owner

Release v1.53.0

Released: 2026-05-12

Full Diff: v1.52.0...v1.53.0

Highlights

Added — pipeline

  • Story-arc planning: schema + LLM-driven arc/season/verify passes + Arc canvas UI (Story Arc Planning Phases 1-4)
  • Live per-panel/scene thumbnails during diffusion
  • Single-scene storyboard video render
  • Voice stage navigation (pipeline_next_stage / pipeline_prev_stage / pipeline_open_stage)
  • Dynamic sidebar with recent issues + AI prompt refine on every comic panel / storyboard scene
  • World Builder → Pipeline bible bridge (logline / premise / styleNotes auto-fill on new series)
  • Writers Room ↔ Pipeline "Promote to pipeline" + 7-step DRY unification (shared stageRunner / bibleExtractor / sceneExtractor / scenePrompt / storyBible factory / prompt-partials)

Added — voice

  • ui_read (read the page aloud)
  • Destructive-action confirmation gate
  • Proactive Chief-of-Staff speech (POST /api/voice/speak, quiet hours, barge-in)

Added — media

  • Favorite + note any image or video, surfaced across all four gallery surfaces
  • Codex hi-res presets (1024×1536 / 1536×1024 / 1536×1536) + image-gen batch mode (× N)
  • Per-element render buttons in World Builder; Refine Prompt action in Media Lightbox

Changed — media

  • World Builder is now a standalone Create page at /world-builder (/media/world-builder still works)
  • Pipeline Series two-pane layout (sticky bible sidebar + responsive card grid)
  • Bulk actions on a collection's contents (Star / Move / Copy / Remove); implicit "Unsorted" collection
  • Unified MediaLightbox wiring across all four gallery pages; mobile lightbox UX overhaul
  • Backup snapshots now namespaced by machine hostname — backward-incompatible disk layout (no auto-migration; recoverable by hand)

Fixed

  • CD-evaluate prompt drift migration (idempotent backfill for existing installs)
  • Snake_case LoRA sidecars now surface in chips / search / requeue
  • Pipeline issues no longer stuck in running after server restart
  • CoS agent dedup race in spawnAgentForTask
  • Proactive TTS no longer round-trips through the mic as user input (this PR — caught by pre-release /do:review)
  • DRY drifts against newly-shared modules (this PR): civitaiSuggestions uses canonical RUNNER_FAMILIES, messageEvaluator routes through shared extractJson, deepMerge rejects prototype-pollution keys

Pre-release review

This branch was reviewed by 5 parallel /do:review agents on the post-#224 batch (80 files, 5K lines) before opening this PR — 4 fixes landed in commit ab1ebe66 (proactive-TTS echo gap + 3 DRY/security drifts). /simplify followed in edf50acc. A release-coherence pass surfaced 4 missing PR #224 changelog entries, addressed in 3c02e027.

Tests: 4344 passing (server). Client build: green.

See .changelog/v1.53.0.md for the full per-feature changelog.

atomantic and others added 30 commits May 10, 2026 14:07
Adds gpt-image-1's native hi-res tier (1024×1536, 1536×1024, 1536×1536)
to the image-gen resolution dropdown and appends a "(high quality)" hint
to the codex prompt when either dimension is ≥1536, so the underlying
image_gen tool renders at full fidelity instead of its 1024 default.

The shared resolution list now also carries per-entry compatibility
metadata so the dropdown only offers sizes the selected backend can
actually handle: Flux-bucketed aspects (832×1216 / 1216×832) hide for
codex, Z-Image-Turbo, and ERNIE; the 1536 hi-res tier only shows for
codex and Flux 2. A stale incompatible w/h falls through to a "(custom)"
option so the value stays visible until the user picks a supported one.
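
A minimal sketch of how per-entry compatibility metadata could drive the dropdown. The `RESOLUTIONS` shape, the `onlyFor` / `hideFor` field names, the backend ids, and the 1024×1024 baseline entry are illustrative assumptions, not the actual shared list.

```javascript
// Hypothetical shape: an entry may be limited to some backends (`onlyFor`)
// or hidden from some backends (`hideFor`). Field names are assumptions.
const RESOLUTIONS = [
  { w: 1024, h: 1024 },
  { w: 832, h: 1216, hideFor: ['codex', 'z-image-turbo', 'ernie'] },
  { w: 1216, h: 832, hideFor: ['codex', 'z-image-turbo', 'ernie'] },
  { w: 1024, h: 1536, onlyFor: ['codex', 'flux2'] },
  { w: 1536, h: 1024, onlyFor: ['codex', 'flux2'] },
  { w: 1536, h: 1536, onlyFor: ['codex', 'flux2'] },
];

function isCompatible(entry, backend) {
  if (entry.onlyFor && !entry.onlyFor.includes(backend)) return false;
  if (entry.hideFor && entry.hideFor.includes(backend)) return false;
  return true;
}

// Build dropdown options; a stale incompatible selection stays visible as a
// "(custom)" option until the user picks a supported size.
function resolutionOptions(backend, current) {
  const options = RESOLUTIONS
    .filter((entry) => isCompatible(entry, backend))
    .map(({ w, h }) => ({ w, h, label: `${w}×${h}` }));
  const hasCurrent = options.some((o) => o.w === current.w && o.h === current.h);
  if (!hasCurrent) {
    options.push({ w: current.w, h: current.h, label: `${current.w}×${current.h} (custom)` });
  }
  return options;
}

// Append the quality hint when either dimension reaches the hi-res tier.
function withQualityHint(prompt, { w, h }) {
  return w >= 1536 || h >= 1536 ? `${prompt} (high quality)` : prompt;
}
```
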
The /api/media-jobs route was sorting every job by `startedAt || queuedAt`
DESC, which slotted an active job (earlier startedAt) below a fresher
batch of queued jobs (newer queuedAt). With a long-running codex render
plus a follow-up batch the UI showed positions out of execution order
(e.g. #6, #5, #4, RUNNING, #3, #2).

Preserve `listJobs` order — [running, codexRunning, ...queue] — for live
jobs so the list reads top-to-bottom as "currently rendering, then next
in line". Only the terminal (archive) portion still sorts by most-recent
finish so the recent-reel surfaces newest-first.
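
The ordering rule can be sketched as below, assuming jobs carry a `status` and terminal jobs a numeric `finishedAt`. The status names and field names are assumptions for illustration, not the route's actual schema.

```javascript
// Assumed terminal statuses; the real set may differ.
const TERMINAL = new Set(['completed', 'failed', 'canceled']);

// `jobs` is assumed to arrive in listJobs order for the live portion:
// [running, codexRunning, ...queue], possibly interleaved with archive jobs.
function orderJobsForDisplay(jobs) {
  // filter() preserves input order, so live jobs keep execution order.
  const live = jobs.filter((j) => !TERMINAL.has(j.status));
  // Only the archive is re-sorted: most recent finish first.
  const archive = jobs
    .filter((j) => TERMINAL.has(j.status))
    .sort((a, b) => b.finishedAt - a.finishedAt);
  return [...live, ...archive];
}
```
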
Adds a "× N" count input next to the Generate button on the Image Gen
page (async modes only — local + codex). One submit kicks off the first
job through the existing SSE-tracked path AND fires (N-1) extra enqueues
in parallel, all landing in the server's mediaJobQueue.

Server-side needs no changes: codex lane is single-flight so the extras
serialize behind the active render; local lane runs them per its own
concurrency limit. Seeds default to random per-job when the seed field
is blank, so a batch of N produces N variations rather than duplicates.

Renames handleQueueAdditional → queueAdditional(count) and fires the
submissions via Promise.all so the queue-position toasts surface
immediately while the first render continues streaming progress.
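
A rough sketch of the parallel enqueue. `enqueue` is a hypothetical function that submits one job and resolves to its queue position; the commit does not say where the random seed default is applied, so choosing it client-side here is an assumption.

```javascript
// Fire `count` extra enqueues at once and wait for all of them.
async function queueAdditional(count, params, enqueue) {
  const seedIsBlank = params.seed === '' || params.seed == null;
  const extras = Array.from({ length: count }, () =>
    enqueue({
      ...params,
      // Blank seed: pick a fresh random one per job so the batch varies.
      seed: seedIsBlank ? Math.floor(Math.random() * 2 ** 32) : params.seed,
    })
  );
  // Every request is already in flight; Promise.all only collects results.
  return Promise.all(extras);
}
```

A "× N" submit would start the first job on the SSE-tracked path and call `queueAdditional(N - 1, params, enqueue)` for the rest.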
Adds a full-screen toggle (F key or top-right button) that drops the
modal's rounding/padding and lets the image fill the viewport. Horizontal
touch swipes (≥60px, horizontal-dominant) navigate prev/next; a tap
reveals the settings as a slide-in drawer so users can peek metadata
without exiting full-screen. Layered Escape closes drawer → exits
full-screen → closes lightbox, matching the user's mental model of
"close the topmost thing first".

The Seed row in the settings pane now has an inline copy button so it
can be pasted into a remix without retyping. Codex images still show no
seed because gpt-image-1 doesn't expose one — that's by design, not
broken capture.
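
The layered Escape order and the swipe threshold described above can be sketched as two pure functions. The state shape (`open`, `fullScreen`, `drawerOpen`) and the mapping of swipe direction to prev/next are assumptions for illustration.

```javascript
// Close the topmost layer first: drawer, then full-screen, then the lightbox.
function onEscape(state) {
  if (state.drawerOpen) return { ...state, drawerOpen: false };
  if (state.fullScreen) return { ...state, fullScreen: false };
  return { ...state, open: false };
}

// A swipe counts only if it travels at least 60px and is horizontal-dominant.
// Assumed mapping: swipe left -> next, swipe right -> prev.
function swipeDirection(dx, dy) {
  if (Math.abs(dx) < 60 || Math.abs(dx) <= Math.abs(dy)) return null;
  return dx < 0 ? 'next' : 'prev';
}
```
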

UI tweaks based on usability feedback:
- Prev/next chevrons are now semi-transparent (text-white/40 → hover white)
  since their position makes them discoverable.
- Top-right full-screen toggle is a solid white pill with a black icon
  (z-30 to sit above the settings drawer) so it stays readable against
  letterbox bars and clickable while the drawer is open.
- Dropped the redundant X in full-screen — the Minimize button already
  exits full-screen, and "close the entire lightbox" is one Esc away.
- Rename gpt-image-1 → gpt-image-2 in comments, copy, and changelog entry
  to match the actual model the Codex CLI invokes.
- Codex sidecar metadata now captures the codex CLI sessionId; Media
  Lightbox surfaces it as a copyable 'Codex session' row, and the Seed
  row reads 'n/a (gpt-image-2)' for codex images instead of disappearing.
World Builder was previously nested at /media/world-builder behind Media Gen's
tab strip; it is now reachable as a top-level Create sidebar entry. The legacy nested
route still renders the same component so existing deep links keep
working. Nav manifest entry moves from nav.media.world-builder to
nav.create.world-builder so palette + voice resolve to the new path.

Sets up the upcoming Pipeline page, which deep-links to World Builder
without descending into Media Gen.
Adds a top-level /pipeline page that chains a rough idea seed through
prose → comic-book script + TV teleplay (parallel) → comic pages +
storyboards (image gen), with a shared series bible (logline, premise,
characters, optional World ref, style notes) that every issue inherits
into its stage prompt context for narrative + visual consistency.

Server:
- New services under server/services/pipeline/: series.js (CRUD),
  issues.js (CRUD + per-stage helpers + stage IDs), textStages.js
  (single-stage LLM execution via promptService + active provider),
  autoRunner.js (idea → prose → scripts chain with SSE progress and
  cancel; one try/catch at the async boundary per the
  nodejs-async-event-listener-unhandled-rejection pattern),
  visualStages.js (thin image-gen handoff for comic pages + storyboards).
- New routes/pipeline.js mounted at /api/pipeline (Zod validation,
  series CRUD, issue CRUD, stage generate, visual enqueue, auto-run
  start/progress/cancel).
- 4 new prompt stage templates under data.sample/prompts/stages/:
  pipeline-idea-expansion, pipeline-prose, pipeline-comic-script,
  pipeline-tv-script; merged into stage-config.json automatically by
  scripts/setup-data.js without clobbering user customizations.
- 53 new tests (3 service files, 1 route file, 1 coordinator).

Client:
- New pages: Pipeline.jsx (series index), PipelineSeries.jsx (bible
  editor + issue list), PipelineIssue.jsx (URL-driven stage-tabbed
  page at /pipeline/issues/:id/:stage with auto-run button + SSE
  progress).
- 7 stage panel components (one per stage) sharing a TextStagePanel
  base for the four text stages.
- apiPipeline.js + usePipelineAutoRunProgress hook for SSE.
- Sidebar link + nav manifest entry (Create → Pipeline). The `pipeline`
  alias migrates from CoS Workflow to this new page; CoS Workflow
  keeps `pipeline` as a palette keyword.

Episode-video stitching via Creative Director, RunwayML, scene-arc
grouping, per-panel SSE preview, and other follow-ups are documented in
PLAN.md → "Pipeline — Deferred". MVP auto-run intentionally stops
after the text stages to avoid burning GPU minutes on un-reviewed
content; visual stages remain manual per panel/scene.
…ow-ups (#215)

* feat: creative pipeline — Writers Room ↔ Pipeline DRY unification + episode video

Lands all seven steps of the Writers Room ↔ Pipeline DRY Unification track
plus the episodeVideo → Creative Director handoff.

Shared infrastructure (server/lib/):
- scenePrompt.js — canonical image-prompt composer (style → series → setting
  baseline → character physical descriptions → scene visual prompt), client
  mirror at client/src/lib/scenePrompt.js
- storyBible.js — canonical Character/Setting/Object shape + sanitizeBibleList
  + mergeExtractedBible (with per-kind dedup keys: slugline for settings,
  name for characters/objects). Writers-room characters/settings/objects
  collapse onto these helpers (~300 LOC of duplicated merge/sanitize gone)
- stageRunner.js — single staged-LLM entry point (replaces writers-room's
  bespoke callApiProvider + callCliProvider + buildCliInvocation, ~150 LOC of
  CLI-spawn drift deleted). Writers-room analyses now persist transcripts to
  data/runs/ and pick up tier-name model resolution for free
- bibleExtractor.js — single extract→sanitize entry, caller persists.
  Writers Room and Pipeline both use it
- sceneExtractor.js — single LLM-driven scene-list extractor with two source
  modes (prose via writers-room-script.md, tvScript via the new
  pipeline-extract-scenes.md). Per-field length caps so a runaway LLM can't
  blow STAGE_OUTPUT_MAX
- promptPartials.js — Mustache-style `{{> name }}` includes pre-expanded
  before applyTemplate. Two partials shipped: bible-deference and
  scene-output-contract. Recursive with MAX_DEPTH=8 cycle guard
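
A minimal sketch of the `{{> name }}` pre-expansion with a depth guard, as described for promptPartials.js above. Here `partials` is a plain name-to-body object standing in for the files under `_partials/`, and the partial bodies in the usage note are placeholders; both are assumptions.

```javascript
const MAX_DEPTH = 8;

// Recursively replace {{> name }} includes before the template is applied.
function expandPartials(template, partials, depth = 0) {
  if (depth > MAX_DEPTH) {
    // A partial that includes itself (directly or not) ends up here.
    throw new Error(`Prompt partial nesting exceeds MAX_DEPTH=${MAX_DEPTH}`);
  }
  // Fresh regex per call so recursion never shares lastIndex state.
  const includeRe = /\{\{>\s*([\w-]+)\s*\}\}/g;
  return template.replace(includeRe, (_match, name) => {
    if (!Object.prototype.hasOwnProperty.call(partials, name)) {
      throw new Error(`Prompt partial not found: ${name}`);
    }
    return expandPartials(partials[name], partials, depth + 1);
  });
}
```

For example, with `{ 'bible-deference': 'Defer to the bible. {{> scene-output-contract }}', 'scene-output-contract': 'Return scenes[].' }`, the template `'Intro. {{> bible-deference }}'` expands to `'Intro. Defer to the bible. Return scenes[].'`.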

Pipeline:
- Series gains settings + objects bibles (alongside the existing characters
  bible). Slugline-matched settings auto-prepend to storyboard scene prompts
- ProseStage gains an "Extract bibles" button — one click fans out three
  LLM passes (characters / settings / objects) and merges into the series
- StoryboardsStage gains "From TV script" / "From prose" buttons with a
  two-click-arm replace guard. Auto-fills stages.storyboards.scenes via the
  shared sceneExtractor
- episodeVideo stage wired end-to-end to Creative Director — creates a CD
  project from storyboards (i2v continuation from scene 2+, autoAcceptScenes,
  disableAudio), polls progress, renders the final mp4 inline when complete
- autoRunner.js gains optional includeVideo flag that fires the CD handoff
  after text stages complete; second "Run everything (incl. video)" button
  on the issue page opts in
- Side fix: issuePatchSchema.stages z.union order swapped so visual-stage
  PATCH payloads (scenes/pages/cdProjectId/videoPath) no longer silently
  strip — the union evaluated the strict text-only schema first
- Side fix: series.js + issues.js switched to lazy statePath() resolution
  (matches writers-room/local.js convention) so Proxy-based PATHS mocks load

Writers Room:
- Evaluator script kind delegates to sceneExtractor; characters/settings/
  objects kinds delegate to bibleExtractor (deleted the local SHAPERS for
  both)
- New promoteToPipeline.js#promoteWorkToPipeline(workId, { force }) creates
  a pipeline series + first issue from a work, carrying over prose, all
  three bibles, and the latest script-analysis scenes. Bidirectional id
  link recorded on both sides (manifest.pipelineSeriesId/pipelineIssueId,
  series.writersRoomWorkId). Idempotent
- WorkEditor menu: "Promote to pipeline" flips to "Open in pipeline" once
  linked. PipelineSeries header surfaces a "Writers Room" badge linking back

Prompts:
- data.sample/prompts/_partials/bible-deference.md (character + setting
  bible deference preamble — shared between writers-room-script and
  pipeline-extract-scenes)
- data.sample/prompts/_partials/scene-output-contract.md (canonical
  scenes[] JSON output shape)
- data.sample/prompts/stages/pipeline-extract-scenes.md (new — parses
  already-formatted teleplays into structured scenes)
- writers-room-script.md + pipeline-extract-scenes.md refactored to use
  the partials — bug-fixes to "defer to canonical character names" or the
  scenes JSON contract now land in one place
- pipeline-extract-scenes registered in stage-config.json

Routes:
- POST /api/pipeline/series/:id/extract-bible — runs bibleExtractor across
  the requested kinds (default: all three), merges into the series
- POST /api/pipeline/issues/:id/stages/storyboards/extract-scenes — runs
  sceneExtractor against the issue's prose or tvScript stage output, replaces
  stages.storyboards.scenes (409 + force flag for replacing existing scenes)
- POST /api/pipeline/issues/:id/stages/episodeVideo/visual — no longer 501;
  hands off to Creative Director
- POST /api/writers-room/works/:id/promote-to-pipeline — creates series +
  first issue from a WR work, returns reused=true for an idempotent re-call

Test counts: 3746 → 3793 (+47). Full server pack green; client builds clean.

* perf(creative-director): slim ?slim=1 mode for polling consumers

Pipeline EpisodeVideoStage polls the CD project every 4s. The full GET
returns `runs[]` (unbounded — grows with every render op) + the full
treatment text. The UI consumes only status/updatedAt/per-scene status/
finalVideoId/failureReason.

- Add `?slim=1` to GET /api/creative-director/:id — returns the slim
  projection. Original (full) response unchanged when slim is absent.
- Client `getCreativeDirectorProject(id, { slim: true })`.
- EpisodeVideoStage poll uses slim mode.
- 2 route tests covering the slim projection and the no-treatment case.
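
A sketch of what the slim projection might look like. The field list follows the commit text; reading scenes from `project.treatment.scenes` is an assumption about the project shape.

```javascript
// Return only what polling consumers read; drop runs[] and treatment text.
function toSlimProject(project) {
  const scenes = project.treatment?.scenes ?? []; // no-treatment case -> []
  return {
    id: project.id,
    status: project.status,
    updatedAt: project.updatedAt,
    finalVideoId: project.finalVideoId ?? null,
    failureReason: project.failureReason ?? null,
    // CD scenes are keyed by `sceneId`, not `id`.
    scenes: scenes.map((s) => ({ sceneId: s.sceneId, status: s.status })),
  };
}
```
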

* refactor(creative-director): extract ScenePreview + shared scene-status helpers

- New ScenePreview.jsx owns the <video controls poster> + onError missing-
  media + Retry + cache-bust pattern. Two consumers: SegmentsTab (CD
  detail) and Pipeline EpisodeVideoStage.
- New sceneStatus.js exports SCENE_STATUS, SCENE_STATUS_BADGE,
  SCENE_STATUS_LABEL, getSceneStatusBadge, PROJECT_STATUS_LABEL.
- Pipeline EpisodeVideoStage final-video render now has the missing-mp4
  fallback for free (was hand-rolled <video> without it). Pipeline +
  CD now share one badge palette so "rendering" / "done" / "checking"
  look identical in both surfaces.

Side fix: the slim ?slim=1 CD projection was returning `scene.id` but
CD scenes are keyed by `sceneId`. Corrected the projection + its test
fixture so polling consumers actually receive the field they read.

* feat(pipeline): aspectRatio + quality pickers on EpisodeVideoStage

The route already accepts aspectRatio (16:9 / 9:16 / 1:1) and quality
(draft / standard / high) via creativeDirectorPresets — just surface them
on the EpisodeVideoStage toolbar so users don't have to use server defaults.

Pickers hide once a CD project exists (its settings are immutable
post-creation; force-restart starts a fresh project that re-reads the
current picker state).

modelId picker deferred — needs a flagged-stable video-model list in the
registry before the UI can hide preview / experimental entries.

* fix(pipeline): demote stuck running issues on boot

If the server restarts mid auto-run, the in-memory `runs` map in
autoRunner.js is lost. The issue stays in `status: 'running'` forever
and the UI spinner never resolves (SSE clients can't reattach to a
dead run).

- New recoverStuckAutoRuns() walks every issue at boot and demotes any
  `running` to `needs-review` (the same terminal state a normal
  completion lands on). The user clicks "Run again" if they want to
  retry.
- Wired into server/index.js next to the existing brain + writers-room
  boot recovery calls.
- 2 new tests covering the recovery + the "nothing stuck" idempotency.

Not implemented: actual resume. Re-attaching SSE and re-running the
missing stages would need the persisted runId + per-stage progress,
neither of which we write to disk. Demoting to needs-review is the
safe low-blast-radius fix; resume could land later if the demand
materializes.
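
The boot-time demotion can be sketched as follows. `listIssues` and `updateIssue` are hypothetical persistence helpers injected for the example; the real service's storage calls are not shown in the commit.

```javascript
// Demote every issue left in `running` by a dead auto-run to `needs-review`.
async function recoverStuckAutoRuns({ listIssues, updateIssue }) {
  const issues = await listIssues();
  let recovered = 0;
  for (const issue of issues) {
    if (issue.status !== 'running') continue;
    await updateIssue(issue.id, { status: 'needs-review' });
    recovered += 1;
  }
  // Returns 0 when nothing was stuck, so calling it again is a no-op.
  return recovered;
}
```
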

* refactor(writers-room, world-builder): drop sceneCardHelpers shim + share composeStyledPrompt

Two small DRY follow-ups to the WR↔Pipeline unification (items 1a + 1b
of the unification track).

1a — Drop the sceneCardHelpers.js shim:
- Constants (WR_IMAGE_DEFAULTS, STYLE_ID, EMPTY_IMAGE_STYLE,
  readWrImageSettings) move to client/src/lib/wrImageDefaults.js.
- SceneCard.jsx + StoryboardPanel.jsx import buildScenePrompt /
  matchSceneCharacters / matchSceneSetting / normCharKey directly
  from client/src/lib/scenePrompt.
- sceneCardHelpers.js deleted.

1b — Mirror composeStyledPrompt to server/lib/:
- server/lib/composeStyledPrompt.js is the verbatim mirror of
  client/src/lib/composeStyledPrompt.js (same client↔server pattern
  as scenePrompt).
- worldBuilder.js#compileBatchPrompts uses it for both the variations
  and composite-sheets paths.
- Side effect: rendered prompts switch separator from `, ` to `. `
  (the composeStyledPrompt convention shared with scenePrompt).
  Semantically identical for diffusion models. Two pinned test
  assertions updated.
- 6 new unit tests.

* feat(pipeline): opt-in parallel mode for /series/:id/extract-bible

The route runs the three bible kinds (characters/settings/objects)
sequentially by default — safe for CLI providers (codex / claude-code /
gemini-cli) that serialize at the provider session anyway.

For HTTP-API providers (OpenAI / Anthropic / LM Studio HTTP) three
concurrent calls cut wall-clock by ~3×. Opt-in via { parallel: true }
in the request body. The merge step always runs after all extractions
complete; series.<field> is read once at the top of the route so the
merge baseline doesn't race.
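
A small sketch of the two dispatch modes. `extract(kind)` stands in for the per-kind LLM call and is assumed to return a promise; the function name `runExtractions` is hypothetical.

```javascript
// Run one extraction per kind, sequentially by default or all at once.
async function runExtractions(kinds, extract, { parallel = false } = {}) {
  if (parallel) {
    // map() starts every call before any of them is awaited.
    const results = await Promise.all(kinds.map((kind) => extract(kind)));
    return Object.fromEntries(kinds.map((kind, i) => [kind, results[i]]));
  }
  const out = {};
  for (const kind of kinds) {
    out[kind] = await extract(kind); // next call starts only after this one finishes
  }
  return out;
}
```

The merge would then run once over the returned object, after every extraction has settled.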

2 new route tests:
- parallel guarantee: every extractBible() start fires before the
  first finish
- sequential guarantee: events alternate start/finish/start/finish/...

Item 4b of the Writers Room ↔ Pipeline DRY unification track.

* refactor(pipeline): lift bible extract → merge → patch into series.js

The /series/:id/extract-bible route threaded the orchestration itself:
load series, run extractBible per kind, mergeExtractedBible per kind,
patch via updateSeries. That made mergeExtractedBible + extractBible
route-layer imports — the wrong layer.

Lift into server/services/pipeline/series.js#extractAndMergeIntoSeries(
  seriesId, { kinds, corpus, parallel, providerOverride }
). Route shrinks to one delegate call after the source-resolution
(corpus or issue.prose.output) and series-mismatch guards.

Mirrors how writers-room exposes mergeExtractedCharacters(workId,
extracted) rather than the lower-level merge primitive. Pure refactor
— all 23 route tests + the existing extract-bible behavior tests pass
unchanged.

Item 4a of the Writers Room ↔ Pipeline DRY unification track.

* docs(plan): scope Pipeline Story Arc Planning + Series Page Redesign

User-requested addition: redesign the PipelineSeries page to use more of
the viewport and add an LLM-driven top-down arc generation system. Today
the page is capped at max-w-5xl with a one-line "New issue" input — the
right primitive for an 8-issue arc, the wrong primitive for a 24-episode
3-season show.

The new initiative replaces "type a title → New issue" with:
  plan whole arc → set season count → generate season breakdown
  → generate per-season episode list → drill into individual episodes

Scoped in 5 phases so each is independently reviewable:
  1. Series page layout redesign (no schema change)
  2. Arc + Season schema (Series → Arc → Season → Issue)
  3. LLM-assisted arc generation (3 new staged prompts:
     arc-overview / season-episodes / arc-verify)
  4. Arc canvas UI replacing the flat issues list
  5. Cross-season continuity hooks (prompt context follow-up)

Supersedes the prior one-line "Series-arc grouping" backlog item.

* address copilot review: kind-aware merge sort + CLI model threading + unused imports

PR #215 round-1 Copilot feedback (4 threads):

1. server/lib/storyBible.js:352 — mergeExtractedBible sorts by `name`, but
   settings can have empty `name` while `slugline` is the primary
   identifier. Pure name-sort floats slugline-only settings to the top
   AND diverges from writersRoom/settings.js#listSettings which orders by
   `slugline || name`. Fix: per-kind sort key (slugline||name for
   settings, name otherwise). 2 new regression tests.

2. server/services/writersRoom/settings.js:141 — same root cause as #1.
   Now that mergeExtractedBible's sort matches listSettings,
   mergeExtractedSettings returning state.settings directly is consistent.

3. server/services/writersRoom/promoteToPipeline.js:28 — drop unused
   `badRequest` / `notFound` imports from ./_shared.js.

4. server/lib/stageRunner.js:105 — executeCliRun reads
   `provider.defaultModel` for both buildCliArgs() and the
   run-started metadata hook. A tier-name resolution or stage-level
   modelOverride was ignored on the CLI path AND the run record logged
   the wrong model. Fix: pass a shallow-cloned provider with
   `defaultModel: resolvedModel` when the resolved model differs.
   Non-invasive — no toolkit changes needed.
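
Item 1's per-kind sort key can be sketched like this. The helper names are hypothetical, and the case-insensitive comparison is an assumption; the commit only specifies `slugline || name` for settings and `name` otherwise.

```javascript
// Settings sort by slugline (falling back to name); other kinds sort by name.
function bibleSortKey(kind, entry) {
  const key = kind === 'settings' ? entry.slugline || entry.name : entry.name;
  return (key || '').toLowerCase();
}

// Sort a copy so the caller's array is left untouched.
function sortBible(kind, entries) {
  return [...entries].sort((a, b) =>
    bibleSortKey(kind, a).localeCompare(bibleSortKey(kind, b))
  );
}
```

With a pure name sort, a setting with an empty `name` would float to the top; with the kind-aware key it lands where its slugline belongs.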

Tests: 3805 → 3807 (+2).

* address copilot review round 2: toast.warning + stale copy + dead export

PR #215 round-2 Copilot feedback (4 threads):

1. client/src/components/pipeline/stages/StoryboardsStage.jsx:56 —
   toast.warning() doesn't exist on the Toast API; would throw at
   runtime the first time the replace-confirm path fired. Fix: add
   `warning` to the Toast API (sibling of success/error/loading) with
   a yellow ⚠ icon. Now valid + reusable across the codebase.

2. StoryboardsStage.jsx:104 — header copy still said "Scene-video
   rendering through Creative Director is deferred," but this PR wires
   the Pipeline Episode Video stage to CD. Updated to point users at
   the Episode Video stage for stitching.

3. client/src/components/writers-room/WorkEditor.jsx:401 — comment
   claimed getWritersRoomWork "doesn't echo the link fields yet."
   Server route already returns the full manifest (including
   pipelineSeriesId/pipelineIssueId) — comment was a stale TODO from
   an earlier draft. Rewrote to describe the actual behavior:
   optimistic patch for same-render UX; subsequent GETs already
   reflect the link.

4. server/services/pipeline/episodeVideo.js:27 — ERR_INVALID_SCENE
   was exported but never referenced. Dead code. Removed.

Tests: 3807 still passing.

* address copilot review round 3: prose-field overwrite + reuse scenes + timer leak + models fallback

PR #215 round-3 Copilot feedback (4 threads):

1. server/lib/storyBible.js:352 — mergeExtractedBible overwrote
   firstAppearance/evidence/missingFromProse unconditionally, but the
   sanitizer normalizes missing keys to null/[]. A partial LLM response
   that omits these keys would clear prior data. Fix: only overwrite
   when the key is actually present on rawIncoming. The
   "refresh-verbatim-including-explicit-nulls" semantics still hold
   (explicit null/[] in rawIncoming wins) — only the absent-key case
   is now preserve-existing instead of nuke.

2. server/services/pipeline/episodeVideo.js:87 — the reuse path
   returned { cdProjectId, reused: true } without a scenes count.
   autoRunner.js broadcasts kickoff.scenes which was undefined on
   reuse. Fix: emit a scenes count derived from current storyboards
   scenes with non-empty descriptions (same filter
   buildTreatmentFromStoryboards uses). SSE shape stable across paths.

3. client/src/components/pipeline/stages/StoryboardsStage.jsx:60 —
   the 5s arm-disarm setTimeout had no cleanup. Unmounting mid-window
   would trigger setState on an unmounted component. Fix: store the
   timer in a ref, clear on unmount, clear when starting a fresh arm,
   clear when the second click lands within the window.

4. server/lib/stageRunner.js:40 — resolveModel only fell back to
   defaultModel; some providers ship a `models` array with no
   defaultModel. The previous (pre-shared-runner) pipeline code fell
   back to models[0]. Fix: new providerFallbackModel() helper —
   defaultModel || models[0] || null. Both the no-hint and
   tier-missing paths use it. 3 new resolveModel tests.
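
Item 1's "absent key preserves, explicit null overwrites" rule, sketched in isolation. The function and parameter names are assumptions: `existing` is the stored entry, `sanitized` the normalized incoming entry, and `rawIncoming` the untouched LLM object.

```javascript
const PROSE_FIELDS = ['firstAppearance', 'evidence', 'missingFromProse'];

function mergeProseFields(existing, sanitized, rawIncoming) {
  const merged = { ...existing };
  for (const field of PROSE_FIELDS) {
    // Overwrite only when the LLM actually sent the key, even if the value
    // is null or []. An absent key keeps whatever was stored before.
    if (Object.prototype.hasOwnProperty.call(rawIncoming, field)) {
      merged[field] = sanitized[field];
    }
  }
  return merged;
}
```
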

Tests: 3807 → 3810 (+3).

* address copilot review round 4: expose parallel flag + persist render settings

PR #215 round-4 Copilot feedback (2 threads):

1. client/src/services/apiPipeline.js:41 — extractPipelineBibles
   didn't expose the server-side parallel flag I added in DRY 4b, so
   UI callers couldn't opt into the ~3× speedup without bypassing the
   typed wrapper. Fix: forward `parallel` through the POST body.

2. client/src/components/pipeline/stages/EpisodeVideoStage.jsx:35 —
   aspectRatio / quality were only selectable on first generate. After
   a page reload the state reset to 16:9 / standard and the pickers
   were hidden from the restart flow, so a forced restart would
   silently use those defaults with no way to adjust.
   Fix:
   - Persist `aspectRatio` + `quality` on stages.episodeVideo via the
     visual-stage sanitizer in pipeline/issues.js + the visual stage
     patch schema. startEpisodeVideoForIssue writes them when kicking
     off a CD project.
   - Stage UI initializes its picker state from the persisted values
     (falls back to defaults for an unstarted stage).
   - The confirm-restart UI now renders the same aspectRatio + quality
     pickers alongside the Yes/Cancel buttons so a forced restart
     reads the user's current choice.
   - 1 new test asserting persistence; existing happy-path test now
     also verifies the defaults land on the stage.

Tests: 3810 → 3811 (+1).

* address copilot review round 5: lock concurrent extracts + sync state + async fs

PR #215 round-5 Copilot feedback (4 threads):

1+2. client/.../StoryboardsStage.jsx:133,145 — TV-script button and
     prose button each only disabled when their OWN kind was extracting.
     Clicking the other during an in-flight extract started a concurrent
     request that could overwrite the first one's result (last-write-wins).
     Fix: derive a single `extractInFlight` flag from the state machine
     (`!!extractingFrom && !extractingFrom.startsWith('arm:')`) and gate
     both buttons on it.

3. client/.../EpisodeVideoStage.jsx:104 — onStageUpdate payload after a
   kickoff included only { status, cdProjectId }, but the server also
   persists aspectRatio/quality on the stage. Same-session
   navigate-away-and-back showed the restart picker at defaults until
   a full refetch landed. Fix: include aspectRatio + quality in the
   optimistic update so the client model matches what the server stored.

4. server/lib/promptPartials.js:89 — used sync existsSync(path) on every
   referenced partial before readFile. That's blocking fs I/O on the
   hot prompt-render path. Fix: try readFile + catch ENOENT (any other
   error escalates). Whole expansion stays async; existing
   "Prompt partial not found" semantics preserved via the sync
   expander's null-resolution path.
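
Item 4's pattern, read and treat only ENOENT as "missing", looks roughly like this. `readFile` is injected so the sketch stays self-contained (in Node it would be `fs.promises.readFile`), and returning `null` for a missing file is an assumption about how the caller reports "Prompt partial not found".

```javascript
// Attempt the read directly instead of checking existence first.
async function readPartialOrNull(readFile, path) {
  try {
    return await readFile(path, 'utf8');
  } catch (err) {
    if (err && err.code === 'ENOENT') return null; // missing partial
    throw err; // any other error (permissions, I/O) escalates
  }
}
```
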

Tests: 3811 still passing.

* address copilot review round 6: mismatched-link guard + kind dedup + clear-cdProject-after-success

PR #215 round-6 Copilot feedback (3 threads):

1. server/services/writersRoom/promoteToPipeline.js:79 — idempotent
   fast-path returned the cached pair when both records existed but
   didn't verify the issue actually belongs to the series. A
   manually-edited / migration-corrupted manifest could return an
   unrelated issue. Fix: require `existingIssue.seriesId ===
   existingSeries.id`; a mismatch is treated like a missing record
   (drop the stale link, fall through to a fresh create). 1 new test
   covering the simulated corrupt-link case.

2. server/routes/pipeline.js:199 (via series.js#extractAndMergeIntoSeries)
   — `extract-bible` didn't dedup `kinds`. Duplicates would run extra
   LLM calls AND last-write-wins the merge for the same field. Fix:
   collapse via `[...new Set(rawKinds)]` in the service layer
   (preserves first-seen order so callers can rely on `results` key
   order). 1 new route test asserting only-unique-kinds dispatch.

3. client/.../EpisodeVideoStage.jsx:92 — `submit({ force: true })`
   cleared `cdProject` BEFORE the restart request landed. A failed
   restart tore the in-flight progress UI off the page even though
   the previous CD project was still rendering. Fix: only clear on a
   successful response.
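
Items 1 and 2 reduce to two small checks, sketched here with hypothetical helper names. The `Set` spread is the expression quoted in the commit; the guard mirrors the `seriesId` comparison.

```javascript
// A Set iterates in insertion order, so spreading it keeps first-seen order.
function dedupKinds(rawKinds) {
  return [...new Set(rawKinds)];
}

// The cached pair is reusable only when the issue points back at the series.
function canReuseLink(existingSeries, existingIssue) {
  return Boolean(
    existingSeries && existingIssue && existingIssue.seriesId === existingSeries.id
  );
}
```

For example, `dedupKinds(['settings', 'characters', 'settings'])` returns `['settings', 'characters']`.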

Tests: 3811 → 3813 (+2).

* address copilot review round 7: scenes count cap + drop shadowed import

PR #215 round-7 Copilot feedback (2 threads):

1. server/services/pipeline/episodeVideo.js:93 — reuse-path scenes
   count filtered empty descriptions but didn't apply the MAX_SCENES=30
   cap that buildTreatmentFromStoryboards uses. For an issue with
   31+ usable storyboards, the reuse response (and SSE / UI messaging)
   reported more scenes than the actual CD treatment held. Fix:
   mirror buildTreatmentFromStoryboards exactly (filter + slice(0, MAX_SCENES))
   so the count matches what's in the project.

2. server/services/writersRoom/promoteToPipeline.test.js:144 — the
   mismatched-link test re-imported ../pipeline/issues.js as issuesSvc,
   shadowing the top-level import already in scope. Functionally
   redundant + confusing. Dropped the inner import.
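
Item 1's fix amounts to sharing one filter-and-cap step between the treatment builder and the reuse path. A minimal sketch, assuming each storyboard scene carries a string `description` field (the helper names are hypothetical):

```javascript
const MAX_SCENES = 30;

// Same filter + cap everywhere, so the reported count cannot drift from
// what the CD treatment actually holds.
function usableScenes(scenes) {
  return (scenes || [])
    .filter((s) => typeof s.description === 'string' && s.description.trim() !== '')
    .slice(0, MAX_SCENES);
}

function reuseResponse(cdProjectId, scenes) {
  return { cdProjectId, reused: true, scenes: usableScenes(scenes).length };
}
```
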
Add a small accent-colored hint line under the FFLF Last frame panel
explaining that good keyframe pairs share scene geometry (same camera,
same subject). Helps users understand why picking unrelated images
produces a visual cut instead of smooth interpolation.

renderFramePanel now accepts an optional 'hint' prop alongside
'advisoryNote', rendered as a separate sibling line so the experimental
warning and the new guidance are visually distinct. The hint applies to
both runtimes (notapalindrome and dgrauet) since both benefit from
geometry-aligned keyframes.
…cy optgroup

Mark the three legacy notapalindrome mlx_video entries (ltx2_unified,
ltx23_unified, ltx23_distilled_q4) with `deprecated: true` in both the
in-code DEFAULT_REGISTRY and the data.sample seed file. The dgrauet
runtime supersedes them, but leaving them visible (under a Legacy
optgroup) keeps existing installs working until users opt in to
INSTALL_LTX2.

Updated dropdowns:
- VideoGen.jsx model select — non-deprecated models render in the
  normal flat list, deprecated ones move under <optgroup label='Legacy'>
- CreativeDirector.jsx model select — same optgroup pattern; the
  initial-modelId pick now prefers the first non-deprecated entry so
  new projects don't auto-start on a legacy backend.
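
The dropdown split and the initial pick can be sketched as below. The helper names are hypothetical, and falling back to the first entry when every model is deprecated is an assumption the commit does not state.

```javascript
// Non-deprecated models render flat; deprecated ones go under "Legacy".
function groupModels(models) {
  return {
    current: models.filter((m) => !m.deprecated),
    legacy: models.filter((m) => m.deprecated),
  };
}

// Prefer the first non-deprecated entry for new projects.
function pickInitialModelId(models) {
  const first = models.find((m) => !m.deprecated) ?? models[0];
  return first ? first.id : null;
}
```
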

defaultMacos left as 'ltx23_distilled_q4'. The dgrauet venv is opt-in
(INSTALL_LTX2=1 bash scripts/setup-image-video.sh), so changing the
shipped default would break fresh installs and existing users who
haven't run that step. Deprecation is enough — the dgrauet models are
already at the top of the list, so users will naturally migrate.
…re runner enqueue

`spawningTasks` was released by `spawnAgentForTask` immediately after the
task was flipped to `in_progress`, BEFORE `spawnViaRunner` / `spawnDirectly`
actually queued the agent. A concurrent `spawnAgentForTask(task)` call
landing in that window (e.g. a re-fired `task:ready` from a follow-up
scheduler tick) saw an empty set, fell through the dedup-has-check (which
never re-checks `task.status`), and spawned a SECOND agent for the same
task id — matching the long-E2E report of "⚠️ already being spawned" being
logged AND a duplicate agent appearing seconds later.

Fix: wrap the spawn call in try/finally and release the guard only after
the runner accepted the agent. `return await` (not bare `return`) ensures
the finally fires AFTER the spawn promise resolves, not synchronously on
return. `cleanupOnError` and the lane-acquire-fail branch still delete the
guard on early-error paths, where the spawn never starts.

Tests in server/services/agentLifecycle.test.js exercise the race with
inline replicas of the pre-fix and post-fix dedup slices (per the
subAgentSpawner.test.js convention) and assert against the source itself
that the delete lives in a `finally { ... }` wrapping the spawn call — so
a future refactor that reverts the placement flips the test red.
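
A minimal sketch of the guard placement described above. The names `spawningTasks`, `spawnAgentForTask`, and `spawnViaRunner` come from this message, but the bodies are invented stand-ins (the `log` parameter and the 10 ms delay are assumptions made for illustration), not the real spawner:

```javascript
// Hypothetical stand-ins: names mirror the commit message, bodies are invented.
const spawningTasks = new Set();

// Simulates the runner taking a moment to accept (enqueue) an agent.
function spawnViaRunner(task, log) {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(task.id);
      resolve({ agentFor: task.id });
    }, 10);
  });
}

async function spawnAgentForTask(task, log) {
  if (spawningTasks.has(task.id)) {
    return null; // already being spawned, skip the duplicate
  }
  spawningTasks.add(task.id);
  task.status = 'in_progress';
  try {
    // `return await` keeps this frame suspended until the runner settles,
    // so the finally block runs after the enqueue rather than before it.
    return await spawnViaRunner(task, log);
  } finally {
    spawningTasks.delete(task.id);
  }
}
```

With a bare `return spawnViaRunner(...)`, the `finally` would run as soon as the promise is returned, reopening the window that let a second caller through.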
…tages

Collapse the repeated setLoading/.catch/setLoading pattern that was
duplicated across four pipeline stage components into a single
`useAsyncAction(fn, { errorMessage })` hook that returns `[run, running]`.

Migrated handlers:
- ProseStage.jsx — handleExtract
- TextStagePanel.jsx — handleGenerate, handleSave (handleGenerate now
  combines the hook's local running flag with a server-synced
  serverGenerating flag derived from stage.status === 'generating' so an
  auto-run kicked off elsewhere still locks the button)
- EpisodeVideoStage.jsx — submit (the force-aware error fallback is
  re-thrown from the action body so the hook's toaster still surfaces
  the right verb)

Skipped (would force a worse abstraction — these are keyed/indexed
loading states, not booleans):
- StoryboardsStage.jsx — savingIdx (which scene is saving) and
  extractingFrom (which source string is in flight, plus a two-click
  arm pattern that piggybacks on the same state). A different hook
  shape (e.g. useAsyncActionKeyed returning a `runningKey` string)
  would be needed; out of scope for this pass.
… bibleStore factory

Three writers-room domain files (characters.js, settings.js, objects.js)
each implemented loadFile/saveFile/list/get/create/update/remove/
mergeExtracted as near-byte-identical copies. Extracted createBibleStore
into server/lib/storyBible.js (the natural sibling to sanitizeBibleList +
mergeExtractedBible), parameterized by kind, idPrefix/idRegex, fileName,
listKey, dedupKey, primaryFields, editableFields, requireOnCreate, an
optional validateAfterUpdate (settings use it for the 'both name and
slugline blank → reject' guard), and per-kind error messages.

Each of the three domain files is now ~30 LOC of factory invocation +
re-export. The settings file keeps the historical normalizeSlugline
re-export and the ServerError-based validateAfterUpdate.

LOC: 394 → 113 across the three domain files (-281 LOC); the factory adds
~150 LOC to storyBible.js. Net codebase change: -128 LOC.

Added 12 new tests to storyBible.test.js covering the factory directly
(CRUD, dedup conflicts at create/update, work-id traversal guards,
single- vs multi-primary-field paths, settings blank-both reject).

Full server pack: 3825 passed / 1 skipped.
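
To illustrate the factory shape, here is a heavily simplified in-memory sketch. It is not the code in server/lib/storyBible.js: it keeps only three of the listed parameters (`kind`, `idPrefix`, `dedupKey`), skips file I/O entirely, and the id format and error wording are assumptions:

```javascript
// Hypothetical, in-memory reduction of the createBibleStore idea.
function createBibleStore({ kind, idPrefix, dedupKey }) {
  const entries = new Map();
  let nextId = 1;
  const normalize = (value) => String(value ?? '').trim().toLowerCase();

  return {
    list: () => [...entries.values()],
    get: (id) => entries.get(id) ?? null,
    create(fields) {
      const key = normalize(fields[dedupKey]);
      if (!key) throw new Error(`${kind} requires ${dedupKey}`);
      // Reject duplicates on the normalized dedup key.
      if ([...entries.values()].some((e) => normalize(e[dedupKey]) === key)) {
        throw new Error(`${kind} "${fields[dedupKey]}" already exists`);
      }
      const entry = { ...fields, id: `${idPrefix}-${nextId++}` };
      entries.set(entry.id, entry);
      return entry;
    },
    remove: (id) => entries.delete(id),
  };
}

// Each domain file shrinks to a factory call plus re-exports.
const characters = createBibleStore({ kind: 'character', idPrefix: 'char', dedupKey: 'name' });
const settings = createBibleStore({ kind: 'setting', idPrefix: 'set', dedupKey: 'slugline' });
```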
Re-export the Writers Room character/setting/object create-schemas under
kind-neutral names (characterBibleCreateSchema, settingBibleCreateSchema,
objectBibleCreateSchema) so the Pipeline can validate bible arrays against
the same shapes as the Writers Room routes.

Pipeline now extends these in routes/pipeline.js with its own back-compat
fields (legacy 'description' alias on character, pipeline-only 'imageRefs',
pre-existing 'id') and uses .passthrough() so canonical sanitizer-emitted
fields ('evidence', 'firstAppearance', 'source', timestamps,
'missingFromProse') round-trip cleanly when the client re-saves an
existing series. The permissive 'bibleEntrySchema = z.record(...)' for
settings/objects is removed.

Because zod's .refine() returns a ZodEffects that has no .extend(), the
Writers Room setting create schema is split into an inner ZodObject
(used as the generic alias) plus a refined wrapper (preserves the
existing route-import surface). Pipeline does not re-apply the refine —
the canonical sanitizer in storyBible.js already drops settings that
lack both name and slugline, and re-applying the refine here would
tighten the route's existing permissive behavior.
…grid canvas

Phase 1 of the Story Arc Planning + Series Page Redesign. Drops the
max-w-5xl cap and splits PipelineSeries into a left bible sidebar
(sticky, internally scrollable, collapsible to a 48px rail at lg+) and
a right card-grid canvas for issues/episodes. Sidebar collapse persists
in localStorage. Mobile (<lg) stays single-column with the bible
reflowing above the canvas.

Issue cards show number, status badge, title (line-clamped), and
updated timeAgo. No schema or API change — pure UI refactor.
Phase 2 of the Story Arc Planning initiative. Series records grow two
optional fields:

  series.arc       = { logline, summary, themes[], protagonistArc, status }
  series.seasons[] = ordered list of { id, number, title, logline,
                     synopsis, episodeCountTarget, themes[], endingHook,
                     status, createdAt, updatedAt }

Issues gain optional seasonId + arcPosition pointers so an issue can
slot into a specific season at a specific ordinal.

New files:
  server/lib/storyArc.js           — canonical shapes + sanitizers
  server/services/pipeline/seasons.js — CRUD + delete-with-reassign

Routes:
  GET    /api/pipeline/series/:id/seasons
  POST   /api/pipeline/series/:id/seasons
  PATCH  /api/pipeline/series/:id/seasons/:seasonId
  DELETE /api/pipeline/series/:id/seasons/:seasonId   { reassignTo?: id | null }

Pure additive: existing series.json + issues.json files round-trip
unchanged through the sanitizers. No UI yet — Phase 4 builds the arc
canvas on top of this layer.

Tests: 18 storyArc unit + 13 seasons service + 12 new route tests.
Full server suite: 3873 passed, 1 skipped, 0 failed.
Phase 3 of the Story Arc Planning initiative. Three new LLM-driven
passes on top of the Phase 2 schema:

  POST /api/pipeline/series/:id/arc/generate
    → proposes series.arc + series.seasons[] from the series bible.
       model: heavy (most expensive single pipeline call).
       commit: true persists in one shot; default returns preview.

  POST /api/pipeline/series/:id/seasons/:seasonId/episodes/generate
    → proposes the per-episode breakdown for one season, taking the
       whole-arc protagonist trajectory + prior seasons' synopses as
       continuity context.
       commit: true creates one issue per episode with seasonId +
       arcPosition pre-filled and synopsis seeded into stages.idea.input.

  POST /api/pipeline/series/:id/arc/verify
    → cross-season continuity pass. Surfaces actionable
       { severity, location, problem, suggestion } issues for
       character contradictions, dropped subplots, episode-count vs
       arc-weight mismatch, finale-hook gaps, arc-role imbalance,
       theme drift.

New files:
  server/services/pipeline/arcPlanner.js
  data.sample/prompts/stages/pipeline-arc-overview.md
  data.sample/prompts/stages/pipeline-season-episodes.md
  data.sample/prompts/stages/pipeline-arc-verify.md
  + 3 stage entries in data.sample/prompts/stage-config.json
    (auto-merged into existing installs by scripts/setup-data.js)

All three prompts reuse _partials/bible-deference.md so character +
setting bible deference logic stays in one place. Service routes
through the shared stageRunner.js so each call is replayable from
/runs.

Issues service: createIssue() now forwards optional seasonId +
arcPosition (the season-episodes generator sets both at creation
time).

Tests: 11 arcPlanner unit + 5 new route tests. Full server suite:
3889 passed, 1 skipped, 0 failed.
Phase 4 of the Story Arc Planning redesign. Replaces the flat issue
card grid on PipelineSeries with a vertical Arc → Season → Episode
tree backed by the Phase 2 schema + Phase 3 LLM endpoints.

Top of canvas:
  - Arc card with logline, theme chips, collapsible summary +
    protagonist arc, edit-in-place fields
  - 'Generate arc' kicks off the pipeline-arc-overview LLM pass
    (two-click-arm replace when an arc already exists)
  - 'Verify arc' surfaces continuity findings inline with severity
    badges (high/medium/low) and dismissable panel

Per season:
  - Collapsible row (chevron) showing S#, title, N / target episodes
  - Inline edit for number, title, logline, synopsis, ending hook,
    status, episodeCountTarget
  - 'Generate episodes (LLM)' runs the pipeline-season-episodes pass
    when the season has context (logline or synopsis) AND no episodes
    yet; creates one issue per beat with seasonId + arcPosition set
  - Per-season delete with two-click-arm + child un-grouping

Per episode:
  - Compact row: E# (or fallback #), title, status badge, updated
    timeAgo, hover-revealed season picker, delete button, deep-link
    into the issue editor
  - Un-grouped issues (seasonId: null) land in a dedicated bottom
    panel so legacy issues stay visible during migration

New: client/src/components/pipeline/ArcCanvas.jsx
     client/src/services/apiPipeline.js — 7 new helpers wrapping the
     Phase 2 + 3 routes (seasons CRUD + arc/episodes/verify generators)
PipelineSeries.jsx slims down — bible sidebar stays put, the right
pane is now just <ArcCanvas series issues onSeriesUpdate onIssuesUpdate />.

Build + lint clean; server suite stays 3889 passed.
Phase 5 (seasonContext injection in textStages.js) deferred — macro
continuity is already implicit in the planning artifacts (arc
protagonist trajectory + upfront season synopses + the episode's
idea-stage seed from the per-season generator) and the bible system
enforces character/setting consistency. Revisit only if per-episode
prose breaks continuity with adjacent episodes in practice.
…rop dynamic import

Post-merge /simplify pass on the Story Arc Planning code:

- New hook client/src/hooks/useArmedAction.js encapsulates the
  two-click-arm confirm pattern (set true → 5s disarm → fire on
  second click). ArcCanvas.jsx had this open-coded three times
  (ArcHeader regenerate, SeasonRow delete, IssueRow delete). The
  hook clears the disarm timer on unmount, fixing a latent
  setState-after-unmount in IssueRow's hover-revealed flow.
- pipeline.js route schemas: arc/generate, season-episodes/generate,
  arc/verify all repeated { providerOverride, modelOverride }
  inline. Extracted to a shared providerOverrideShape object.
- ArcCanvas.jsx ArcContent#save was doing a dynamic
  await import('../../services/api') inside the handler. Hoisted
  updatePipelineSeries to the top-level import block.
- server/services/pipeline/seasons.js had its own isStr definition;
  storyBible.js already exports one. Import it instead.

No behavior change. Server suite 3889/3889 green; client lint + build
clean.
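
The two-click-arm pattern can be sketched without React as follows. This is a framework-free illustration, not the `useArmedAction` hook itself: `createArmedAction`, `isArmed`, and `dispose` are hypothetical names, and `dispose` stands in for the hook's unmount cleanup:

```javascript
// Hypothetical framework-free version of the two-click-arm confirm pattern.
function createArmedAction(action, { disarmMs = 5000 } = {}) {
  let armed = false;
  let timer = null;

  const disarm = () => {
    armed = false;
    clearTimeout(timer); // safe to call with null
    timer = null;
  };

  return {
    isArmed: () => armed,
    click() {
      if (!armed) {
        // First click arms the action and starts the auto-disarm timer.
        armed = true;
        timer = setTimeout(disarm, disarmMs);
        return false;
      }
      // Second click within the window fires the action.
      disarm();
      action();
      return true;
    },
    // Call on teardown so no timer outlives the owner.
    dispose: disarm,
  };
}
```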
…le expansion (#216)

* feat(pipeline): comic-page rendering + parallel codex + bible expansion

- comicPages stage: extract pages/panels from the comic-script output
  via deterministic markdown parser; new full-page render path that
  composes a multi-panel layout prompt
- mediaJobQueue: codex lane parallel-limit (1-10, runtime-configurable
  via Settings → Image Gen); jobId-keyed cancel; retry + run-now
  endpoints; canceled jobs dropped from archive
- character bible: expand to role / physicalDescription / personality /
  background to drive image-gen prompts
- day-mode CSS: remap hover:text-white variants so summaries stay
  readable in light theme

* address review: fix const reassign, archive canceled jobs, correct parser comment

- mediaJobQueue.__resetForTests: clear codexRunning in place (was reassigning a const → TypeError + broken findJob)
- mediaJobQueue: archive canceled jobs (queued- and running-cancel paths) so /api/media-jobs?status=canceled and the recent-reel UI keep working within the 24h TTL
- comicScriptParser header: '(none)/empty' values normalize to '' for caption/sfx and [] for dialogue, never null
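
The const-reassign bug in the first bullet can be reproduced in isolation. This sketch assumes `codexRunning` is a `Map`, which the message does not state; `badReset`, `resetForTests`, and `findJob` are illustrative stand-ins:

```javascript
// Assumption: the running-jobs registry is a module-level const Map.
const codexRunning = new Map();
const findJob = (id) => codexRunning.get(id);

// Reassigning a const binding throws a TypeError when the line executes.
function badReset() {
  codexRunning = new Map();
}

// Clearing in place keeps every existing reference pointing at the same Map.
function resetForTests() {
  codexRunning.clear();
}
```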

* address review: retry temp-upload guard + comicPages route tests

- mediaJobs route: retry now 409s with JOB_RETRY_TEMP_UPLOAD when the
  original job referenced uploadedTempPath/uploadedTempPaths/audioFilePath,
  since the gen modules unlink those files on completion/failure (retrying
  would fail or act on a stale path).
- Add mediaJobs.test.js covering retry happy path + 404 + JOB_NOT_TERMINAL
  + JOB_RETRY_TEMP_UPLOAD (all three temp-upload param shapes).
- Add pipeline route tests for comicPages/extract-pages (400 missing source,
  409 not-empty without force, success persists pages, force=true replaces)
  and comicPages/pages/:pageIndex/render (400 bad index, persist imageJobId
  on target page, out-of-range returns bare result).

* address review: comic-page render route + UX fixes

- ComicPagesStage: skip two-click arm on a fresh stage (no destructive
  replace to guard) — first click runs the extractor immediately.
- pipeline.js: comicPages page-render route now validates pageIndex
  exists up front and returns 404 PIPELINE_COMIC_PAGE_NOT_FOUND instead
  of letting the service's generic throw bubble through. Removes the
  dead 'persist skipped' fallback branch.
- pipeline.test.js: fix off-by-one prompt expectation (pageNumber is
  pageIndex + 1, so /pages/1/render renders page 2); align mock with
  real service contract; update out-of-range test to assert the new
  404.

* address review: reset codex parallel limit + per-page render state

- mediaJobQueue __resetForTests now also resets codexParallelLimit to
  CODEX_PARALLEL_DEFAULT so a test calling setCodexParallelLimit doesn't
  leak its value into the next test.
- ComicPagesStage tracks in-flight page renders as a Set<pageIndex>
  rather than a single index — Codex can now render multiple pages in
  parallel, so the old single-slot state would flip the spinner off the
  wrong button when one of the concurrent requests finished first.

* address review: ServerError + safer resolveMode in visualStages

- enqueueVisualComicPage / enqueueVisualImage throw ServerError with
  stable codes instead of plain Error, so the route layer returns the
  right 4xx (400/404) for bad-input / missing-page / no-panels / empty-
  prompt failures rather than a generic 500.
- resolveMode reads settings.imageGen?.mode through optional chaining
  on both the check and the true-branch, so a settings.json without
  imageGen no longer hits a "cannot read mode of undefined" path.

* address review: run-now route tests + drop dead empty-prompt check

- mediaJobs.test.js: cover /:id/run-now happy path (200 running),
  NOT_CODEX rejection (400), and NOT_FOUND (404).
- visualStages: enqueueVisualComicPage no longer throws
  PIPELINE_COMIC_PAGE_EMPTY_PROMPT — composeComicPagePrompt only
  returns '' when panels.length === 0, which the no-panels guard
  already rejects upstream. The "continuation of previous beat"
  fallback fills in for panels with no description, so the branch
  was unreachable.

* address review: guard startLaneJob splice against indexOf=-1

queue.splice(queue.indexOf(job), 1) would silently drop the LAST queued
job if indexOf returned -1. The JS event loop makes a double-promote
race impossible today (drainLoop and runJobNow can't interleave inside
the synchronous portion of startLaneJob), but the guard removes a
footgun for future refactors.
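
The footgun is easy to reproduce: `splice(-1, 1)` removes the last element. A small sketch, where `removeFromQueue` is a hypothetical helper rather than the actual startLaneJob code:

```javascript
// Hypothetical helper: removes `job` from `queue` only if it is present.
function removeFromQueue(queue, job) {
  const idx = queue.indexOf(job);
  if (idx === -1) return false; // nothing to remove, leave the queue alone
  queue.splice(idx, 1);
  return true;
}

// Unguarded version for comparison: indexOf returns -1 for a missing job,
// and splice(-1, 1) silently drops the last queued entry instead.
const unguarded = ['a', 'b', 'c'];
unguarded.splice(unguarded.indexOf('missing'), 1);
```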

* address review: template + state + queue fixes

- pipeline-{prose,comic-script,tv-script}.md: render the expanded
  character bible fields (role, physicalDescription, personality,
  background) with a Mustache inversion fallback to legacy
  {{description}} so series that haven't been re-saved still render.
- ComicPagesStage: panel textarea persists via a ref so onBlur reading
  the latest state can't miss the user's final keystroke when blur fires
  synchronously after onChange.
- mediaJobQueue startLaneJob: early-return when the job is no longer in
  queue (was promoted by a parallel path) rather than splicing -1 and
  double-starting.
- mediaJobQueue tests: add direct queue-level coverage for runJobNow —
  happy path past the parallel limit, NOT_CODEX rejection on GPU jobs,
  NOT_FOUND on unknown ids.
- comicScriptParser comment: point at data.sample/ (the authoritative
  template path) instead of the gitignored data/ copy.

* address review: merge world.negativePrompt + refresh stale comment

- enqueueImageJob now merges the user-supplied negativePrompt with the
  world's negativePrompt (token-deduplicated, comma-joined) instead of
  letting the user override silently drop the world's global avoids.
  Mirrors composeStyledPrompt's preset-negative handling.
- pipeline.js comicPages render route comment refreshed — the service
  now throws ServerError directly, so the up-front check is about
  skipping wasted enqueue work, not about translating a generic Error.
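
A rough sketch of a token-deduplicated, comma-joined merge like the one described in the first bullet. `mergeNegativePrompts` is a hypothetical name, and case-insensitive matching is an assumption; the real merge in enqueueImageJob may differ:

```javascript
// Hypothetical helper: merges comma-separated negative prompts, keeping the
// first occurrence of each term and preserving order.
function mergeNegativePrompts(...prompts) {
  const seen = new Set();
  const merged = [];
  for (const prompt of prompts) {
    for (const raw of String(prompt ?? '').split(',')) {
      const token = raw.trim();
      const key = token.toLowerCase(); // assumption: dedup ignores case
      if (token && !seen.has(key)) {
        seen.add(key);
        merged.push(token);
      }
    }
  }
  return merged.join(', ');
}
```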

* fix(media-queue): drop original row on retry to prevent duplicate stacks

Before this, hitting Retry on a failed/canceled job re-enqueued the work
correctly but left the dead row sitting in the recent-reel with its own
Retry button. Clicking that button again would enqueue another duplicate
job, and the failed list kept growing with rows for the same underlying
work.

- New mediaJobQueue.removeArchivedJob(jobId) prunes a terminal job from
  the archive and clears any lingering SSE entry.
- /:id/retry calls it after enqueueing the replacement job — the new
  job inherits the work, so the old row's only purpose was as a footgun.
- MediaJobsQueue does an optimistic local prune so the row disappears
  immediately instead of after the next 3s poll.

* feat(media-queue): per-row delete on failed/canceled history

Adds a trash icon to terminal-state rows (failed/canceled/completed) in
MediaJobsQueue so users can prune stale history one row at a time. Live
jobs (queued/running) keep the existing Cancel button — Delete is gated
to terminal rows only.

- New DELETE /api/media-jobs/:id route reuses removeArchivedJob; 404 on
  unknown id, 409 if the job is still live.
- deleteMediaJob API wrapper.
- JobRow renders the trash button next to Retry for terminal rows; click
  optimistically removes the row from local state and the server prunes
  it from the archive + SSE registry.

* feat(media-queue): show model in row header

Each row's title line now shows engine + model (e.g. "image · codex /
gpt-image-1" or "video · local / mochi-1-preview") so the failed /
canceled list tells the user which model died, not just that an image
job died. Tail-trims long HF repo paths and keeps the full id in a
title-attribute tooltip.

* address review: reword negativePrompt merge comment (avoids → terms)

* fix(pm2): disable filewatch on portos-server to stop image-gen restarts

The image gen path writes the rendered file, a sidecar JSON, and (via
atomicWrite) renames a tmp file over media-jobs.json on every job
state change. Even with watch scoped to server/ and a broad
ignore_watch list, chokidar would occasionally race on the atomic
rename target and trigger a SIGINT 5–30 s after a successful render —
killing the next in-flight job (e.g. SIGINT propagated to a live
mflux-generate child, leaving the user staring at a "Killed by signal
SIGINT" Python traceback).

Watch was always a best-effort dev convenience here. Code edits already
require a manual `pm2 restart ecosystem.config.cjs` to take effect
(per CLAUDE.md — pm2 restart doesn't rebuild the client, so a manual
restart is in the loop anyway). Flipping watch off is a zero-cost
guarantee that no file write inside the repo can ever kick the
server.

* feat(media-queue): edit-then-retry on failed/canceled rows + SIGINT diagnostics

- Inline ✎ pencil button on each terminal row reveals an edit form
  with prompt / negative / model / width / height / steps fields
  pre-filled from the original params. "Retry with changes" submits
  only the keys the user touched.
- POST /:id/retry now accepts optional body.params overrides. A Zod
  schema whitelists user-facing fields (prompt, negativePrompt, model,
  modelId, width, height, steps, guidance/cfgScale, seed, numFrames,
  fps); anything else gets stripped, so a user can't slip pythonPath
  or other internal config through the retry path. Whitelisted
  overrides merge over the original params, non-listed internal
  fields (codexPath, session ids, etc.) ride through unchanged.
- Server shutdown log now records pid / ppid / tty / pm_id /
  pm_exec_path so next time a SIGINT lands we can tell whether PM2,
  a TTY, or something else fired it.
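
The whitelist-then-merge behavior can be sketched in plain JavaScript. The actual route uses a Zod schema; `ALLOWED_OVERRIDES` and `applyRetryOverrides` are hypothetical names, with the field list copied from the bullet above:

```javascript
// Field list taken from the commit message; the Set itself is illustrative.
const ALLOWED_OVERRIDES = new Set([
  'prompt', 'negativePrompt', 'model', 'modelId', 'width', 'height',
  'steps', 'guidance', 'cfgScale', 'seed', 'numFrames', 'fps',
]);

// Hypothetical helper: strips non-whitelisted keys, then merges the rest
// over the original params so internal fields ride through unchanged.
function applyRetryOverrides(originalParams, overrides = {}) {
  const safe = {};
  for (const [key, value] of Object.entries(overrides)) {
    if (ALLOWED_OVERRIDES.has(key)) safe[key] = value;
  }
  return { ...originalParams, ...safe };
}
```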

* address review: dialogue formatter, prune warning, server-sourced bounds, explicit cancelAll

- visualStages comic-page prompt: trim character/line BEFORE the
  truthy guard so whitespace-only fields don't emit malformed `: ""`
  dialogue entries; drop rows whose line is empty (nothing to render).
  The old `includes(':')` filter was a no-op.
- mediaJobs /retry: log a warning when removeArchivedJob returns
  false instead of silently masking a duplicate-history bug.
- ImageGen settings parallel limit: server now returns
  imageGen.codex.parallelLimitBounds { min, max, default } from
  /api/settings; client reads those and uses local constants only
  as a first-paint fallback. No more drift if the cap changes.
- codex.cancel(jobId) now requires a jobId — with the parallel lane
  it was previously possible to nuke every in-flight render with a
  bare cancel(). New explicit codex.cancelAll() takes the bulk
  "stop everything" path that imageGen.cancel() (no-arg dispatcher
  used by /api/image-gen/cancel) routes through.
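
The trim-before-guard ordering from the first bullet, as a standalone sketch. `formatDialogue` and the exact output format (`Name: "line"`) are assumptions; only the ordering of trim, filter, and format reflects the message:

```javascript
// Hypothetical formatter: trim first, then drop rows with nothing to render.
function formatDialogue(rows) {
  return rows
    .map(({ character, line }) => ({
      character: String(character ?? '').trim(),
      line: String(line ?? '').trim(),
    }))
    .filter(({ line }) => line) // whitespace-only lines are gone by now
    .map(({ character, line }) =>
      character ? `${character}: "${line}"` : `"${line}"`
    );
}
```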

* fix(media-queue): parallel-codex timeouts + listener cap

Two related bugs surfaced when running multiple Codex renders in parallel:

1. CODEX_TIMEOUT_MS (child) and WATCHDOG_IMAGE_MS (queue) were both 5
   minutes — sized for local MLX which emits regular tqdm progress to
   reset the idle timer. Codex emits NOTHING until completion, so its
   idle window IS its full render time. With parallel codex renders
   sharing OpenAI throughput a single generation can blow past 5 minutes
   and the watchdog kills it ("watchdog timeout: no runner output for
   300s"). Now:
   - codex.js uses a dedicated 20 min cap, CODEX_TIMEOUT_MS env override.
   - mediaJobQueue adds WATCHDOG_CODEX_MS (20 min default,
     MEDIA_JOB_WATCHDOG_CODEX_MS override) and routes codex jobs to it.

2. imageGenEvents emitter hit Node's default MaxListeners cap of 10
   ("MaxListenersExceededWarning: 11 progress listeners added"). Each
   in-flight job attaches ~6 listeners; CODEX_PARALLEL_MAX = 10 means
   60+ live during steady-state. Listener pairs already detach
   deterministically in runJob's terminate() and final detach(), so this
   isn't a real leak — just need to raise the cap. 200 leaves headroom
   for short overlap during job churn. videoGenEvents bumped to 50 for
   the same robustness even though video runs serially.
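
The listener arithmetic in item 2 can be checked with Node's `events` module. The figure of 6 listeners per job is the approximate one quoted above, and the single `progress` event name is a simplification:

```javascript
const { EventEmitter } = require('node:events');

const imageGenEvents = new EventEmitter();
imageGenEvents.setMaxListeners(200); // default cap is 10

const LISTENERS_PER_JOB = 6; // approximate, per the commit message
const PARALLEL_JOBS = 10;

// Attach listeners the way concurrent jobs would, keeping a detach handle
// for each so cleanup stays deterministic.
const detachers = [];
for (let job = 0; job < PARALLEL_JOBS; job++) {
  for (let i = 0; i < LISTENERS_PER_JOB; i++) {
    const onProgress = () => {};
    imageGenEvents.on('progress', onProgress);
    detachers.push(() => imageGenEvents.off('progress', onProgress));
  }
}

const peakListeners = imageGenEvents.listenerCount('progress');
detachers.forEach((detach) => detach());
```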

* address review: drain SSE clients on archive prune + drop empty model overrides

- mediaJobQueue.removeArchivedJob now ends any SSE clients still attached
  within the SSE_CLEANUP_DELAY_MS grace window before deleting the
  sseJobs entry. The previous bare delete left pending closeJobAfterDelay
  timers with nothing to drain, leaking the HTTP response.
- /retry override schema: model/modelId go through emptyToUndef so a
  cleared field collapses to undefined; the route then filters undefined
  override values before merging, so a user clearing the modelId input
  re-enqueues with the original modelId instead of "" (which would
  later fail with "Unknown or unsupported model").
- New test pins the keep-original-on-empty-override semantics.
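
A sketch of the keep-original-on-empty-override semantics. The `emptyToUndef` name appears in the message, but this body and the `mergeOverrides` helper are guesses at the behavior rather than the real schema transform:

```javascript
// Assumed behavior: blank or whitespace-only strings collapse to undefined.
const emptyToUndef = (value) =>
  typeof value === 'string' && value.trim() === '' ? undefined : value;

// Hypothetical merge: undefined overrides are filtered out before spreading,
// so a cleared field falls back to the original value.
function mergeOverrides(original, overrides) {
  const cleaned = Object.fromEntries(
    Object.entries(overrides)
      .map(([key, value]) => [key, emptyToUndef(value)])
      .filter(([, value]) => value !== undefined)
  );
  return { ...original, ...cleaned };
}
```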

* address review: surface codex model in queue UI + de-double comic-page punct

- /api/media-jobs PARAM_ALLOWLIST now includes `model` so the
  MediaJobsQueue row's modelLabel() can show the codex model name
  (e.g. "codex / gpt-image-1") instead of falling back to a generic
  "codex" badge. The codex provider uses params.model where local /
  video providers use params.modelId; modelLabel() already handled
  both, but the server was stripping the codex one out.
- composeComicPagePrompt no longer hard-appends `.` to fields that
  already end in sentence punctuation. Prose extracted from a script
  often carries its own period, and double-punctuating like
  "...sunstreaming in.." was noisy in prompts. New endPunct() helper
  uses /[.!?]"?$/ so quoted dialogue/caption text terminated inside
  the quotes ("…away.") also doesn't pick up an extra outer period.
- Two new tests pin the no-double-punctuation behavior + the still-
  appends-when-missing case.
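
A possible shape for the endPunct() helper, built around the regex quoted above. The trimming and empty-string handling are assumptions; only the regex comes from the message:

```javascript
// Sketch: append a period only when the text does not already end in
// sentence punctuation, optionally followed by a closing double quote.
function endPunct(text) {
  const trimmed = String(text ?? '').trim();
  if (!trimmed) return ''; // assumption: blank fields stay blank
  return /[.!?]"?$/.test(trimmed) ? trimmed : `${trimmed}.`;
}
```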

* address review: clean up audioFilePath alongside other staged uploads

Queued-cancel, boot-recovery, and the runJob spawn-failure path all
called safeUnlinkUpload on uploadedTempPath / uploadedTempPaths but
left audioFilePath in place. For a2v / voice-driven video jobs that's
the same kind of multipart-staged file the route handed us, so a
cancel-before-start (or a restart that interrupts a running job) would
leak the audio file under data/uploads. Mirror the cleanup in all
three sites.
* feat(media): Refine Prompt modal + parallel Codex render lane

Lightbox now offers a Refine Prompt action that takes plain-English
feedback and rewrites both prompts via any enabled AI provider (API
or CLI), then queues the refined render with the original's settings.

The Codex image lane is no longer single-slot: it runs up to four jobs
concurrently (MEDIA_CODEX_CONCURRENCY env override) since Codex calls
OpenAI out-of-process. Queue UI labels Codex jobs as `codex` so the
lane split is visible.

* address review: integer cap, disabled-provider guard, models[] fallback, renderConfig size limit, guidanceScale=0 round-trip

- MAX_CODEX_CONCURRENT now Math.trunc-clamped so '2.7' caps at 2, not 3
- refiner rejects providers with enabled === false (PROVIDER_DISABLED)
- model resolution falls through to provider.models[0] like stageRunner
- renderConfig schema enforces 4 KB serialized cap to bound LLM token cost
- PromptRefineModal uses ?? for guidanceScale so deliberate 0 survives
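
Two of the fixes above are easy to show in isolation: truncating before clamping, and using `??` so a deliberate 0 survives. `clampConcurrency` and its bounds (including the max of 8) are hypothetical; only the default of four and the '2.7' → 2 behavior come from the messages:

```javascript
// Hypothetical parser for a concurrency override such as an env string.
function clampConcurrency(raw, { min = 1, max = 8, fallback = 4 } = {}) {
  const n = Math.trunc(Number(raw)); // '2.7' becomes 2, never rounds up
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

// `??` only falls back on null/undefined, so an explicit 0 is kept.
// `||` would treat 0 as missing and replace it.
const original = { guidanceScale: 0 };
const withNullish = original.guidanceScale ?? 7.5;
const withOr = original.guidanceScale || 7.5;
```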

* address review: byte-accurate size cap, ARIA dialog semantics, env-pinned concurrency test

- renderConfig cap now uses Buffer.byteLength(s, 'utf8') so non-ASCII
  values count toward the limit accurately
- PromptRefineModal section gains role=dialog + aria-modal + aria-labelledby;
  backdrop marked aria-hidden so screen readers see only the dialog
- parallel codex test pins MEDIA_CODEX_CONCURRENCY=4 + re-imports so a
  CI/shell override of 1 can't make the assertion flaky
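
The difference between string length and byte length matters for the size cap in the first bullet. A small sketch, where `withinByteCap` is a hypothetical name and the 4 KB figure comes from the earlier commit:

```javascript
const MAX_BYTES = 4 * 1024;

// Hypothetical check: measures the serialized payload in UTF-8 bytes.
function withinByteCap(value, maxBytes = MAX_BYTES) {
  return Buffer.byteLength(JSON.stringify(value), 'utf8') <= maxBytes;
}

// 3000 two-byte characters: under 4096 as a character count,
// but roughly 6 KB once encoded as UTF-8.
const sample = { note: '\u00e9'.repeat(3000) };
const charCount = JSON.stringify(sample).length;
const byteCount = Buffer.byteLength(JSON.stringify(sample), 'utf8');
```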

* address review: drop wrapper aria-hidden, restore Pipeline Arc bullet in changelog

- Backdrop wrapper had aria-hidden=true which hid the entire dialog
  subtree from assistive tech; the dialog's own aria-modal=true is
  sufficient. Revert to role=presentation on the wrapper.
- NEXT.md collided when I added new entries above the in-progress
  PipelineSeries Arc entry — restored the bullet prefix and moved the
  new Codex/MediaCard entries into the existing Changed section so
  there's no orphaned paragraph and no duplicate section heading.

* address review: scope CLI model override to codex, redact raw response from parse-failure log

- runRefinePrompt previously cloned every CLI provider with the user-selected
  model as defaultModel, but runner.js#buildCliArgs only honors that for codex
  (claude-code/gemini-cli ignore it). Override only when it'll take effect;
  PLAN.md tracks fixing buildCliArgs to handle all three CLIs.
- Parse-failure log no longer includes 400 chars of raw LLM response — that
  body can contain user prompts. Log only error reason + response size; the
  full response is already persisted to data/runs/<runId>/output.txt.

* address review: guard renderConfig size check against JSON.stringify failures

z.record(z.any()) accepts BigInt and circular refs that JSON.stringify
rejects. Wrap the size computation in try/catch so a bad payload surfaces
as VALIDATION_ERROR (400) instead of an unhandled 500.
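
Both failure modes can be reproduced directly. In this sketch `safeSerializedSize` is a hypothetical helper; the real route maps the failure to VALIDATION_ERROR, which is not modeled here:

```javascript
// Hypothetical guard: returns a result object instead of letting
// JSON.stringify throw on BigInt values or circular references.
function safeSerializedSize(value) {
  try {
    const json = JSON.stringify(value);
    if (json === undefined) return { ok: false, reason: 'not serializable' };
    return { ok: true, bytes: Buffer.byteLength(json, 'utf8') };
  } catch (err) {
    return { ok: false, reason: err.message };
  }
}

const circular = {};
circular.self = circular;
```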

* address review: drop duplicate toasts, let provider lookup errors bubble

- PromptRefineModal previously caught and re-toasted errors that
  services/apiCore#request() had already surfaced. Removed those redundant
  toasts; left the video-path toast since generateVideo uses raw fetch and
  doesn't auto-toast.
- Removed .catch(() => null) on getProviderById so genuine failures (e.g.
  providers.json read error, toolkit-not-initialized) propagate as 5xx
  through the centralized error handler instead of masquerading as 404
  PROVIDER_NOT_FOUND. getProviderById's own null-return is what 404 maps to.

* address review: return effective model for non-codex CLI; correct changelog

- refineMediaPrompt now resolves selectedModel to the runner's actually-
  effective model: per-call override for API providers and codex, but
  provider.defaultModel || provider.models[0] for claude-code / gemini-cli
  (where buildCliArgs ignores the per-call model). The API response,
  createRun metadata, and log line no longer claim a model that didn't
  run. Added a test covering claude-code's behavior.
- Changelog Codex parallelism bullet was written against my env-based
  implementation that the merge with main superseded. Dropped it — main
  shipped settings-driven parallelism (imageGen.codex.parallelLimit) and
  already documented it. Queue UI also shows kind=image with a separate
  codex/<model> badge, not lane label.

* address review: stop backdrop bubble, trim strings, user-facing changelog

- PromptRefineModal backdrop click now calls stopPropagation before onClose.
  Without it, the click bubbled to MediaLightbox's own backdrop handler and
  closed the lightbox underneath.
- refinePromptSchema strings now trim before min(1) so a whitespace-only
  prompt/feedback fails validation with a clear 400 instead of slipping
  through and surfacing the cryptic 'LLM returned an empty prompt' error.
- Refine Prompt changelog entry rewritten in user-facing terms per
  .changelog/README.md — dropped routes, file paths, and internal error
  codes; kept what the user can do and what to expect.

* address review: trim+normalize model, label the feedback textarea

- refinePromptSchema.model now trims and normalizes empty/whitespace to
  undefined so a '   ' value can't slip past MODEL_REQUIRED and reach the
  provider as a bogus model string. Mirrors the emptyToUndef transform
  the retry schema already uses.
- Added a visible <label htmlFor='prompt-refine-feedback'> tied to the
  feedback textarea (and matching id) so screen readers have a stable
  accessible name independent of the placeholder text.

* address review: surface runId in parse-failure log, accurate toast for sync image-gen, consolidate changelog sections

- runRefinePrompt now returns { text, runId } and the parse-failure log
  includes the runId + a direct pointer to data/runs/<runId>/output.txt
  so operators can locate the full response on disk.
- queueRender toast distinguishes external (synchronous) image-gen from
  queued local/codex/video — external returns the filename immediately
  with no queue position, so 'Render complete' is more accurate than
  'Render queued' for that path.
- NEXT.md had duplicate ## Added and ## Changed headers accumulated from
  prior PRs. Merged each into a single canonical section while preserving
  bullet order — release notes now have one place for each category.

* address review: drop hardcoded artifact path, size-check with pretty-printed JSON

- mediaPromptRefiner parse-failure log no longer prints a hardcoded
  data/runs/<runId>/output.txt path. The runner's data dir is configurable
  so the path can be wrong; the runId alone lets the Runs UI / tooling
  locate the artifact.
- refinePromptSchema's renderConfig size check now uses
  JSON.stringify(obj, null, 2) — the same pretty-printed format the
  refiner inlines into the LLM prompt. Minified measurement under-counted,
  so a payload that passed the 4 KB cap could still expand past it once
  indented and inflate the prompt.
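
To illustrate why the measurement format matters, here is a small sketch comparing minified and pretty-printed sizes. The constant name and the byte-based measurement are assumptions; the commit only states a 4 KB cap and the use of `JSON.stringify(obj, null, 2)`.

```javascript
// Assumed name for the 4 KB cap mentioned above.
const RENDER_CONFIG_MAX_BYTES = 4 * 1024;

// Size as measured on the minified form.
function minifiedSize(obj) {
  return Buffer.byteLength(JSON.stringify(obj), 'utf8');
}

// Size as measured on the 2-space pretty-printed form.
function prettySize(obj) {
  return Buffer.byteLength(JSON.stringify(obj, null, 2), 'utf8');
}

// A config dominated by a long array: every element gains indentation and a
// newline once pretty-printed, so the two measurements diverge sharply.
const config = { steps: Array.from({ length: 600 }, (_, i) => i) };

const passesMinified = minifiedSize(config) <= RENDER_CONFIG_MAX_BYTES;
const passesPretty = prettySize(config) <= RENDER_CONFIG_MAX_BYTES;
```

For this sample, `passesMinified` is true while `passesPretty` is false, which is the gap the size check now closes.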
Tailwind v4 dropped the default cursor: pointer on <button>, so every
clickable in PortOS would hover with the text I-beam unless the
component explicitly set cursor-pointer (96 places did, most didn't).

Adds a single global rule in client/src/index.css covering button,
a[href], summary, select, clickable input types, <label> wrapping a
checkbox/radio, and ARIA roles (button, link, tab, menuitem, option,
checkbox, radio, switch). Placed before the existing
[disabled] { cursor: not-allowed } block so disabled elements keep
the correct override.
- Refine prompts modal — rewrite starter/style/negative together from a plain-English nudge, with inline rationale, editable result, and auto-save back to a saved world.
- Per-element Render buttons — one inline button per composite board, per category item, and per category header. All three reuse the page render-options panel and the same job-queue + collection-tagging path as the full batch render.
atomantic and others added 22 commits May 12, 2026 08:32
…ghtboxes

World Builder batch render now reuses a single `World: <name>` collection
per world instead of minting a date-suffixed bucket per run. New
findOrCreateCollectionByName() helper does the case-insensitive trimmed
lookup. A one-shot migration in data/migrations/001 merges existing
date-suffixed collections (dedupes items, preserves earliest createdAt,
rewrites world-builder.json runs[].collectionId).
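
A minimal in-memory sketch of the case-insensitive trimmed lookup. The real findOrCreateCollectionByName() signature and storage are not shown in this log, so the array argument, the id format, and the record shape below are assumptions.

```javascript
// Hypothetical in-memory version; the real helper presumably persists to disk.
function findOrCreateCollectionByName(collections, name) {
  const wanted = name.trim().toLowerCase();
  const existing = collections.find(
    (c) => c.name.trim().toLowerCase() === wanted,
  );
  if (existing) return existing;

  // Assumed record shape and id scheme, for illustration only.
  const created = {
    id: `col-${collections.length + 1}`,
    name: name.trim(),
    items: [],
  };
  collections.push(created);
  return created;
}
```

Calling it twice with `'World: Atlas'` and `'  world: atlas '` returns the same record, so repeated batch renders keep landing in one collection.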

Clean (Light / Aggressive) buttons now appear in Media History and Media
Collection Detail lightboxes, not just Image Gen. Cleaning from inside a
collection auto-adds the cleaned file to that collection. Video Gen's
lightbox only renders videos, so it intentionally doesn't expose Clean.

Gitignore: allow data/migrations/*.js (source-controlled scripts) while
keeping runtime data and the generated .applied.json ignored. Overrides
the global ~/.gitignore `data/` rule so git can descend into data/ and
evaluate the migrations re-include.
Star toggle on every MediaCard plus an editable note in the lightbox (max
2000 chars, 500 ms debounced save) — shared across Media History, Image
Gen, Video Gen, and Media Collection Detail. Each page gets a Favorites
filter chip that scopes the grid to starred items; press `s` in the
lightbox to toggle.

Annotations live in a new sidecar data/media-annotations.json keyed by
`<kind>:<ref>` (same scheme as collections) so they survive when the
underlying job records get pruned. Empty entries (!starred && !note) are
removed on write to keep the file lean. Server-side: new mediaAnnotations
service + route (GET / PATCH with prune semantics) plus a shared
mediaItemKey.js lib that centralizes ITEM_KIND / REF_MAX_LENGTH /
parseKey and is now consumed by both mediaCollections and the new
annotations service. Client-side: useMediaAnnotations hook with
optimistic update + revert on error, FavoritesFilterChip shared
component, MediaCard / MediaLightbox prop extensions for star + note.
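
The sidecar keying and prune-on-write rule can be sketched as follows. `patchAnnotation` and the in-memory `store` object are hypothetical; only the `<kind>:<ref>` key scheme and the `!starred && !note` prune condition come from the description above.

```javascript
// Hypothetical PATCH handler core operating on a plain object store.
function patchAnnotation(store, kind, ref, patch) {
  const key = `${kind}:${ref}`; // same key scheme as collections
  const next = { starred: false, note: '', ...store[key], ...patch };

  // Empty entries are removed on write to keep the file lean.
  if (!next.starred && !next.note) {
    delete store[key];
    return null;
  }
  store[key] = next;
  return next;
}
```

Unstarring an item that has no note deletes its entry entirely, while an item with a note survives with `starred: false`.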
Add the existing folder+ AddToCollectionMenu trigger to the lightbox so
users can file an image or video into a collection from the larger
preview without closing it. Card and lightbox both share one popover
component via a new size prop ('sm' default for the MediaCard action
row, 'md' for the lightbox footer) — no styling duplication.

Also: gitignore browser/data/ so the CDP profile cache (created by the
browser service at runtime) isn't tracked.
…ong names

MediaCollectionDetail cards were suppressing the folder+ menu (the same
trigger the rest of the app exposes), so once an item was filed there
was no way to move or copy it elsewhere from the collection view. Drop
the override and let the shared MediaCard default render the menu —
users can now toggle membership across any collection from inside any
collection.

Also: replace truncate with break-words on the picker rows so long
collection names wrap to multiple lines instead of getting clipped
with ellipsis.
…ion + per-card file delete

A Select button in the collection-detail header flips the grid into
selection mode. Click cards to multi-select, then act on the lot via a
sticky toolbar: Star/Unstar, Move…/Copy… to a target (popover anchored
to the button, with inline create-new), or Remove from the current
collection (unfile without deleting). Move/Copy tracks which items
actually placed in the target before removing from the source, so a
failed add doesn't lose data.

A synthetic 'Unsorted' collection now pins to the top of
/media/collections — every image or video not in at least one real
collection. The detail view at /media/collections/unsorted shares the
grid + lightbox + bulk-action bar; Move… is relabeled File… (no
source-side remove), and Copy/Remove/cover/rename are hidden since they
don't apply to a synthetic source. Fully client-side via
buildUnsortedCollection(collections, images, videos) — no new server
route.
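
A rough sketch of how a client-side synthetic collection like this could be derived. The argument order matches the signature named above, but the item shape (`{ kind, ref }`) and the `filename` / `id` fields on images and videos are assumptions.

```javascript
// Illustrative only: field names on images/videos/collections are assumed.
function buildUnsortedCollection(collections, images, videos) {
  // Every item that belongs to at least one real collection.
  const filed = new Set();
  for (const c of collections) {
    for (const item of c.items) filed.add(`${item.kind}:${item.ref}`);
  }

  const candidates = [
    ...images.map((img) => ({ kind: 'image', ref: img.filename })),
    ...videos.map((vid) => ({ kind: 'video', ref: vid.id })),
  ];

  return {
    id: 'unsorted',
    name: 'Unsorted',
    synthetic: true, // never persisted; rebuilt from current state
    items: candidates.filter((i) => !filed.has(`${i.kind}:${i.ref}`)),
  };
}
```

Because the result is computed from data the client already holds, no new server route is needed, consistent with the note above.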

Per-card trash in a collection now deletes the underlying file (matches
MediaHistory). To unfile without deleting, uncheck the current
collection via the folder+ menu or use bulk Remove in select mode.

Also folded in from the earlier session: a portal Add-to-collection button
in MediaLightbox so users can file from the larger preview; size prop
on AddToCollectionMenu so card + lightbox share one popover; wrap long
collection names in the picker; browser/data/ gitignored.
…hared helpers (#219)

`scripts/flux2_macos.py` and `scripts/z_image_turbo.py` carried ~200 LOC
of byte-for-byte-duplicated helpers: `pick_device`, `make_generator`,
`apply_memory_optimizations`, `write_sidecar`, `make_stepwise_callback`,
`_emit_user_error`, `_repo_from_hf_error`, and the entire bottom-of-file
HF cause-chain walker.

Pulled into a new `scripts/_runner_common.py` (mirrors the existing
`lora_utils.py` extraction). `make_stepwise_callback` takes an optional
`unpack_latents` callable so the flux2 packed-transformer-latent path
stays runner-specific while the rest of the decode is shared.
`install_hf_error_handler` is a `@decorator` over `main()` that owns
the gated/notfound/401 dispatch — the trailing `if __name__ ==
"__main__":` block in each runner collapses to a single `main()` call.

Smoke: both runners' `--help` parses cleanly; server test suite
4012/4012 green.
…, spawn/completion error recovery (#218)

* test(cos,agentLifecycle): coverage for evaluateTasks, dequeueNextTask, spawn/completion error recovery

Add `server/services/cos.test.js` (15 tests) pinning `evaluateTasks`
priority ordering and `dequeueNextTask` capacity guards. Extend
`agentLifecycle.test.js` (+17 tests, 4 skipped) covering
`spawnAgentForTask` cleanupOnError paths and `handleAgentCompletion`
error-recovery regression pins. Tests follow the inline-function-copy
pattern from `subAgentSpawner.test.js`. Source-level regression guards
pin priority order, the `availableSlots <= 0` short-circuit, and the
`runnerAgents.delete(agentId)` tail placement.

4 skipped tests document the gap covered by the PLAN.md backlog item
"Widen spawningTasks try/finally in agentLifecycle.js#spawnAgentForTask"
— they flip green once that fix lands.

* test(cos): brace-depth function extraction for source-level guards

Replace fixed 8000-char slices in cos.test.js's source-level regression
guards with an extractFnBody() helper that walks the function body via
brace-depth counting (skipping string literals + comments). The old
slice could silently drop the Priority 4 marker past the window if
dequeueNextTask grew, making the priority-order assertion pass on an
empty/short match instead of catching the regression.

Per Copilot PR #218 thread on line 409.
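
A simplified sketch of brace-depth extraction, for illustration. It is not the project's extractFnBody(): it assumes a `function name(` declaration, takes the first `{` after the name as the body start, and skips only quoted strings and comments (template and regex literals are not handled here).

```javascript
// Simplified brace-depth walker; assumptions are listed in the lead-in.
function extractFnBody(source, fnName) {
  const start = source.indexOf(`function ${fnName}(`);
  if (start === -1) return null;
  const open = source.indexOf('{', start);
  if (open === -1) return null;

  let depth = 0;
  let i = open;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '/' && next === '/') {            // line comment
      const eol = source.indexOf('\n', i);
      i = eol === -1 ? source.length : eol + 1;
      continue;
    }
    if (ch === '/' && next === '*') {            // block comment
      const end = source.indexOf('*/', i + 2);
      i = end === -1 ? source.length : end + 2;
      continue;
    }
    if (ch === '"' || ch === "'") {              // quoted string
      i += 1;
      while (i < source.length && source[i] !== ch) {
        if (source[i] === '\\') i += 1;          // skip escaped char
        i += 1;
      }
      i += 1;
      continue;
    }
    if (ch === '{') depth += 1;
    if (ch === '}') {
      depth -= 1;
      if (depth === 0) return source.slice(open, i + 1);
    }
    i += 1;
  }
  return null; // unbalanced braces
}
```

Unlike a fixed-length slice, the returned body grows with the function, so a marker near the end cannot fall outside the window.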

* test(cos): pin strict idle gate + handle template/regex literals

Address two Copilot PR #218 findings:

1. priorityDequeue replica was looser than production — it allowed idle to
   spawn after autoSystem/mission had already produced spawns. Production
   gates idle behind `spawned === 0` (cos.js:2480) and
   `tasksToSpawn.length === 0` (cos.js:862). Fix the replica, correct the
   "mission + idle fire when no user tasks" test expectation, add two new
   tests pinning the strict-idle behavior, and add a source-level guard
   that asserts the production fences are still in place.

2. extractFnBody helper now properly skips template literals (with nested
   ${...} interpolation tracking via a stack) and regex literals — both
   appear in evaluateTasks/dequeueNextTask and previously their braces
   could unbalance the depth scanner. Update the doc comment to match.

* test(cos,agentLifecycle): loosen style-only regex, anchor on code markers, sharpen comment

Address three Copilot PR #218 findings:

1. cos.test.js:585 — early-return source guard's regex required a literal
   `return;` with semicolon, which would trip on a formatting-only edit
   (no semicolon, braced if). Loosen the regex to tolerate optional
   semicolons and optional braces so this stays a behavioral check, not
   a style check.

2. cos.test.js:607 — priority-order source guard keyed off the comment
   text `Priority 0`..`Priority 4`. A comment rewording would fail
   the test even with unchanged behavior. Anchor on actual code markers
   instead: `onDemandRequests`, `pendingUserTasks`, `autoApproved`,
   `generateMissionTasks(`, `generateIdleReviewTask(`.

3. agentLifecycle.test.js:36 — header comment said a leaked runnerAgents
   entry was "blocking any later completion event". That overstates
   the impact; the direct observable consequence is an unbounded
   in-memory leak plus potential mis-routing if the runner re-emits.
   Reworded to describe the actual symptom.

* docs(changelog,cos): fix test counts in NEXT.md + lowercase typo in cos.test.js header

- NEXT.md: cos.test.js is 18 tests (not 15); agentLifecycle.test.js
  is +15 (11 active + 4 skipped), not +17. Also note the new
  spawned===0 / tasksToSpawn.length===0 idle fence in the description.
- cos.test.js header: "exercise IT" → "exercise it" (the uppercase
  pronoun was being read as a typo).
… voice stage nav + dynamic sidebar (#224)

* feat(pipeline): scene-video render + AI prompt refine + live thumbs + voice stage nav + dynamic sidebar

Knock out five of the deferred pipeline follow-ups in one pass:

- Live per-panel/scene image progress: new useMediaJobProgress hook subscribes
  to socket image-gen:* / video-gen:* events filtered by jobId and hydrates
  from GET /api/media-jobs/:id on mount. New MediaJobThumb component renders
  the live currentImage preview during diffusion and the final image (or
  <video> with poster for video kind) once complete. ComicPagesStage and
  StoryboardsStage now show per-render thumbnails inline.

- Storyboards scene-video path: enqueueStoryboardSceneVideo + POST
  /api/pipeline/issues/:id/stages/storyboards/scenes/:index/video render a
  single scene as a t2v clip via mediaJobQueue. Server persists
  sceneVideoJobId on the scene. UI gets a "Scene video" button + video-kind
  MediaJobThumb tied to the new field. Aspect-ratio dims pull from the
  existing ASPECT_PRESETS so the picker matches EpisodeVideoStage.

- Voice stage advancement: pipeline_next_stage / pipeline_prev_stage /
  pipeline_open_stage tools parse the current /pipeline/issues/:id/:stage
  path from ctx.state.ui.path and push a navigate side effect. New
  `pipeline` group in TOOL_GROUPS + GROUP_INTENT regex so phrasings like
  "next stage" and "open storyboards" route correctly. Stage list imported
  from issues.js#STAGE_IDS to avoid drift; aliases table handles spoken
  variants (teleplay, pages, scenes, video, etc).

- Dynamic Pipeline sidebar children: Layout fetches GET
  /api/pipeline/issues/recent?limit=10 on mount + on focus (30s debounce,
  signature-guard against no-op re-renders) and renders recent issues as
  one-level grandchildren under Create > Pipeline. activePathPrefix keeps
  the row highlighted across stage tabs.

- AI-assisted panel/scene prompt refine: two new templates
  (pipeline-comic-panel-image-prompt.md, pipeline-storyboard-image-prompt.md)
  registered in stage-config.json. refineComicPanelPrompt and
  refineStoryboardScenePrompt route through runStagedLLM (returnsJson: true)
  and replace the persisted description. Shared runPromptRefine helper +
  loadRefineContext slim loader keep the two paths DRY. "AI: refine" button
  on every comic panel + storyboard scene.

Tests: 3969 server tests + client build all green.

* address review: regex coverage, modelId guard, recent-issues cap, tests

- Voice GROUP_INTENT.pipeline regex: shared stage-name alternation across
  open/go-to/back-to so "open prose", "back to storyboards", "go to comic
  pages" route correctly without requiring the word "stage". Generic "take
  me to pipeline" still falls through to ui_navigate.

- enqueueStoryboardSceneVideo: validate modelId against getVideoModels()
  before enqueue. Unknown ids now 400 PIPELINE_UNKNOWN_VIDEO_MODEL instead
  of leaving a doomed entry in mediaJobQueue. Matches the /api/video-gen
  fail-fast pattern.

- /api/pipeline/issues/recent: new issuesSvc.listRecentIssues() sorts the
  FULL issue set by updatedAt desc before slicing to limit, so the sidebar
  doesn't silently miss the latest items once the dataset grows past
  ISSUES_PER_RESPONSE_MAX (1000). listIssues continues to sort by
  seriesId/number for the per-series endpoints.

- Tests: 15 new voice-tools tests cover parsePipelineIssuePath (query/hash
  stripping + missing UI state) and the three pipeline_*_stage tools
  (alias resolution, first/last-stage guards, classifyIntent regex
  coverage). 17 new visualStages tests cover enqueueStoryboardSceneVideo
  (bad-index, missing-scene, empty-description, unknown-model, missing-
  pythonpath, happy path) and refineComicPanelPrompt /
  refineStoryboardScenePrompt (bad-index, missing-page/panel/scene,
  empty-description, empty-LLM-response, happy paths).

Server tests: 4001 passing (was 3969, +32).

* address review (round 2): regex alias coverage, route + service tests

- GROUP_INTENT.pipeline broadened to include the spoken aliases that
  PIPELINE_STAGE_ALIASES already resolves: teleplay, comics, comicpages,
  pages (bare), scenes, episodevideo, episode, video, story, etc. Without
  these in the regex, the pipeline tool group was never exposed to the
  LLM for those utterances, so the alias resolution in pipeline_open_stage
  was unreachable.

- listRecentIssues clamping: rewrote the limit coercion in two passes
  (`Number(...)` finiteness check first) so `limit: 0` now clamps to 1
  instead of falling through `0 || 10` and returning 10.

- New tests:
  - issues.test.js: 3 describes covering listRecentIssues ordering
    (descending by updatedAt across series), limit clamping (0 → 1,
    negative → 1, 999 → 50, non-numeric → default 10), and upper-bound
    enforcement.
  - tools.test.js: alias-intent expectations (teleplay, comics, pages,
    scenes, episode, video, story all trigger the pipeline group).
  - routes/pipeline.test.js: route happy-paths + validation for the
    four new endpoints (GET /issues/recent, POST .../scenes/:i/video,
    POST .../panels/:n/refine-prompt, POST .../scenes/:i/refine-prompt).

- pipeline.test.js mock for visualStages.js now stubs the new exports
  (enqueueStoryboardSceneVideo, refineComicPanelPrompt,
  refineStoryboardScenePrompt) so the new route tests can dispatch
  through the mocked service.
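
The `0 || 10` pitfall and the two-pass fix can be sketched like this. `clampLimit` and the constants are hypothetical names, and this `listRecentIssues` takes the issue array directly for illustration; the default of 10, the upper bound of 50, and the sort-before-slice order follow the text above. ISO-8601 `updatedAt` strings are assumed.

```javascript
const DEFAULT_LIMIT = 10; // assumed constant names
const MAX_LIMIT = 50;

// Buggy form: 0 is falsy, so an explicit limit of 0 silently becomes 10.
function buggyLimit(raw) {
  return Number(raw) || DEFAULT_LIMIT;
}

// Two passes: finiteness check first, then clamp into [1, MAX_LIMIT].
function clampLimit(raw) {
  const n = Number(raw);
  if (!Number.isFinite(n)) return DEFAULT_LIMIT;
  return Math.min(MAX_LIMIT, Math.max(1, Math.trunc(n)));
}

// Sort the FULL set by updatedAt descending before slicing to the limit.
function listRecentIssues(issues, rawLimit) {
  return [...issues]
    .sort((a, b) => b.updatedAt.localeCompare(a.updatedAt))
    .slice(0, clampLimit(rawLimit));
}
```

Keeping the coercion in one function is what lets the route forward the raw query value without re-implementing it.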

Server tests: 4010 passing (was 4001, +9 from round 1; +41 from baseline).

* address review (round 3): route/service limit alignment, "comic page" alias, header comments

- Route /issues/recent: drop the `Number(req.query.limit) || 10` coercion
  in the route — that turned explicit limit=0 into 10 because 0 is falsy,
  diverging from listRecentIssues which clamps 0 → 1. Route now forwards
  the raw query value; the service owns coercion (single source of truth).

- PIPELINE_STAGE_ALIASES: add "comic page" (singular) and bare "page" so
  the regex-matched utterances "open comic page" / "open page" actually
  resolve in pipeline_open_stage instead of bottoming out at "Unknown
  stage". The regex (`comic ?pages?`) accepts both forms but the alias
  table previously only listed the plural — out-of-sync.

- visualStages.js module header: rewrote to reflect current scope. The
  file is no longer "MVP scope: image jobs only" — it now also owns
  single-scene video enqueue and LLM-driven prompt refinement.

- StoryboardsStage.jsx header: replaced the "per-scene video is deferred"
  note with the actual four per-row actions (AI refine, Storyboard,
  Scene video, Trash) so future readers don't get misled.

- Tests: route limit=0 → 1-item response, comic-page / page alias
  resolution to comicPages.

Server tests: 4012 passing (+2 from round 2; +43 from baseline).

* address review (round 4): page-alias regex + canceled status UI

- GROUP_INTENT.pipeline: extend `pages` → `pages?` so bare "open page"
  triggers the pipeline group. Round 3 added `'page'` to
  PIPELINE_STAGE_ALIASES but missed mirroring it in the regex, so the
  utterance was still routed to ui_navigate before the alias resolver
  ran. classifyIntent test added.

- MediaJobThumb: add a `canceled` status branch (Ban icon + caption,
  port-text-muted border) so user-canceled jobs render an obvious
  terminal state instead of falling through to the running/queued
  spinner and looking stuck forever.

Server tests: 4012 passing.

* address review (round 5): canceled status route, signature breadth, dialogue filter

- useMediaJobProgress: cancellation flows through the mediaJobQueue as a
  *:failed socket event (the underlying SIGTERM looks like a failure to
  the gen module) while job.status persists as 'canceled'. The hook now
  re-fetches /api/media-jobs/:id inside onFailed and maps a final
  'canceled' status back to state, so MediaJobThumb's canceled UI is
  reachable in live sessions, not just on reload.

- Layout sigOf: extend the change-detection fingerprint to include the
  rendered display fields (number, title, seriesName), not just
  id@updatedAt. A series rename doesn't bump the child issue's
  updatedAt, so the prior signature could silently suppress sidebar
  re-renders and leave stale labels.

- refineComicPanelPrompt: filter dialogue rows whose `line` is empty
  before joining, matching the filter composeComicPagePrompt already
  applies. Stops the refine template from getting noisy `CHAR: ""`
  fragments when an empty dialogue row is present.
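
A small sketch of the broadened change-detection fingerprint. This `sigOf` is an illustrative reconstruction, not the Layout code; the field names are the ones listed above.

```javascript
// Fingerprint covers every field the sidebar renders, not just id + updatedAt.
function sigOf(issues) {
  return JSON.stringify(
    issues.map((i) => [i.id, i.updatedAt, i.number, i.title, i.seriesName]),
  );
}
```

Two fetches that differ only in `seriesName` now produce different signatures, so a series rename triggers a re-render even though the issue's `updatedAt` is unchanged.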

Server tests: 4012 passing.

* fix(media-jobs): silent 404 on /media-jobs/:id speculative lookups

MediaJobThumb's useMediaJobProgress hook fires `GET /api/media-jobs/:id`
on every mount to hydrate state. For panel/scene jobIds older than the
queue's 24h archive TTL the route returns 404, which previously routed
through the global error broadcast:

  - server: console.error('❌ Route error: Not found') + emitErrorEvent
  - client: useErrorNotifications fires `[NOT_FOUND] Not found` console
    error + global toast.error on every panel mount

Two-pronged fix:

- server/routes/mediaJobs.js: throw the 404 with `severity: 'warning'`.
  useErrorNotifications's warning branch logs via console.warn and skips
  the toast — same response body, no global UI noise. Other 404 paths
  (cancel-nonexistent, retry-nonexistent, etc.) still surface normally.

- client/services/apiMediaJobs.js: getMediaJob now passes `silent: true`.
  Only useMediaJobProgress consumes it — that hook owns its own state
  semantics (404 = expired, stay at status:'unknown') so the API toast
  was redundant. Other status codes still surface via the centralized
  error handler.

Server tests: 4012 passing.

* address review (round 6): case-insensitive stage lookup + serialize refine clicks

- pipeline_open_stage canonical-id fallback: previously did
  `PIPELINE_STAGE_IDS.includes(stage)` on the RAW user value, so "Prose"
  with capital P returned "Unknown stage" despite being a valid canonical
  id. Use a case-insensitive lookup against PIPELINE_STAGE_IDS that
  compares against the already-normalized lowercase `key`, mirroring the
  alias-table lookup. Test added.

- ComicPagesStage + StoryboardsStage refine buttons: gate the disabled
  state on `refiningKey !== null` / `refiningIdx !== null` instead of
  matching the specific row. The single-scalar tracker doesn't prevent
  concurrent refines on different panels/scenes, which causes:
    1. Spinner moves to the latest click (visual confusion)
    2. Server-side updateStage races (read-modify-write on issues.json
       under concurrent in-flight refines can drop the first refine's
       update). Disabling all refine buttons during any in-flight one
       eliminates both classes without needing a mutex or per-row Set.
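
The case-insensitive stage lookup from the first bullet can be sketched as below. The stage list is an illustrative subset and `resolveStage` is a hypothetical name; only the `comic page` / `page` → `comicPages` aliases and the `prose` / `comicPages` ids are taken from this log.

```javascript
// Illustrative subset; the real list is imported from issues.js#STAGE_IDS.
const PIPELINE_STAGE_IDS = ['prose', 'comicPages', 'storyboards'];
const PIPELINE_STAGE_ALIASES = new Map([
  ['comic page', 'comicPages'],
  ['page', 'comicPages'],
]);

function resolveStage(input) {
  const key = String(input).trim().toLowerCase(); // normalized once
  const alias = PIPELINE_STAGE_ALIASES.get(key);
  if (alias) return alias;
  // Compare canonical ids against the normalized key, not the raw input.
  const canonical = PIPELINE_STAGE_IDS.find((id) => id.toLowerCase() === key);
  return canonical ?? null;
}
```

With this shape, `resolveStage('Prose')` returns the canonical `'prose'` instead of falling through to an unknown-stage error.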

Server tests: 4013 passing.

* address review (round 7): serialize scene-video button clicks too

Same single-scalar pattern as the refine buttons in round 6 —
`renderingVideoIdx === i` only disables the matching row, letting a
second scene-video enqueue overwrite the tracker mid-flight. Gate on
`renderingVideoIdx !== null` so all Scene video buttons share the lock
during any in-flight request. Spinner still keys on the matching row.
…e, assertSafeFilename, listLoras rename, mediaModels drift, CLI defaultModel (#222)

* chore(server): Civitai/Z-Image follow-ups — RUNNER_FAMILIES, deepMerge, assertSafeFilename, listLoras rename, mediaModels drift, CLI defaultModel

Six items off the "Civitai LoRA / Z-Image-Turbo follow-ups" backlog:

1. mediaModels.js drift warning — loadMediaModels() now logs a loud
   `⚠️ media-models drift: built-in "<id>" was shipped but is missing
   from <kind>[]` warning at boot for any id in _shippedDefaults +
   DEFAULT_REGISTRY but missing from the user's live image/video.macos/
   video.windows lists. Catches the 2026-05-09 silent-drift class.
2. RUNNER_FAMILIES constants module — new server/lib/runners.js +
   client/src/lib/runnerFamilies.js mirror. All bare-string 'mflux' /
   'flux2' / 'z-image' / 'ernie' references in civitai.js, mediaModels.js,
   ImageGen.jsx, and Loras.jsx route through the frozen constant.
3. listLoras exports collision — services/imageGen/local.js#listLoras
   renamed to listLoraFilenames so the minimal {filename, name} shape
   can't be confused with services/loras.js#listLoras (rich Civitai
   shape).
4. Generic deepMerge utility — new server/lib/objects.js#deepMerge
   consolidates three hand-rolled call sites (voice/config.js,
   meatspacePost.js, routes/loras.js).
5. assertSafeFilename helper — new server/lib/fileUtils.js#assertSafeFilename
   consolidates assertSafeLoraFilename (.safetensors) and
   assertGalleryFilename (.png) into one extension-allowlist helper.
6. CLI defaultModel for all CLI providers — services/runner.js#buildCliArgs
   now appends `--model <id>` to claude-code and `-m <id>` to gemini-cli,
   gated on the provider's stored args not already pinning a model
   (both space- and =-joined forms). Codex behavior unchanged.

4007 server tests + 32 new tests across runners/objects/fileUtils/
mediaModels/runner test suites.

* chore(server): address Copilot review on PR #222

- assertSafeFilename: enforce each extension is a non-empty string starting
  with "." so a bare suffix like 'png' can't match 'not-an-imagepng' and
  weaken validation; treat misuse as a programmer error.
- assertSafeFilename: add optional `requiredMessage` override so wrappers can
  preserve their original missing-input error message without affecting the
  invalid-input wording.
- assertGalleryFilename: pass requiredMessage:'Invalid filename' to restore
  the historical missing-input message (matches pre-refactor behavior).
- assertSafeLoraFilename: pass requiredMessage:'Filename required' to restore
  the historical missing-input message (instead of 'LoRA filename required').
- mediaModels warnDrift: clarify the warning to acknowledge that intentional
  deletions are indistinguishable from drift, and tell the user how to
  silence the message (remove the id from _shippedDefaults.<where>) when
  the absence is intentional.
- Tests: extend fileUtils.test.js with cases for the leading-dot enforcement
  and the requiredMessage override.
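
A minimal sketch of an extension-allowlist helper with the leading-dot guard and `requiredMessage` override described above. The option names, default message, and path checks are assumptions; the real assertSafeFilename in server/lib/fileUtils.js may differ.

```javascript
// Illustrative helper; option names and messages are assumed.
function assertSafeFilename(
  name,
  { extensions, requiredMessage = 'Filename required' },
) {
  // Programmer error: a bare suffix like 'png' would match 'not-an-imagepng'.
  for (const ext of extensions) {
    if (typeof ext !== 'string' || !ext.startsWith('.')) {
      throw new TypeError(`extension must start with ".": ${String(ext)}`);
    }
  }
  if (typeof name !== 'string' || name === '') {
    throw new Error(requiredMessage); // missing-input wording is overridable
  }
  const lower = name.toLowerCase();
  const hasAllowedExt = extensions.some((e) => lower.endsWith(e.toLowerCase()));
  const hasPathParts = /[\\/]/.test(name) || name.includes('..');
  if (!hasAllowedExt || hasPathParts) throw new Error('Invalid filename');
  return name;
}
```

A gallery wrapper would pass `requiredMessage: 'Invalid filename'` and a LoRA wrapper `'Filename required'`, which keeps each wrapper's historical missing-input message.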

* chore(server,client): address Copilot re-review on PR #222

- mediaModels warnDrift: fix the suggested key path (image lives at
  _shippedDefaults.image.list, video.macos / video.windows are arrays
  directly); also drop the false "silence without restore" claim since
  removing an id from _shippedDefaults flips it back to "newly shipped"
  and normalizeRegistry will re-add it on next boot. The warning now
  honestly states the drift will keep firing for intentional deletions
  and points users at the only real options (re-add manually or
  re-bootstrap the whole section).
- mediaPromptRefiner / worldBuilderRefine: stop Codex-only-gating the
  per-call model override. runner.js#buildCliArgs now honors
  provider.defaultModel for codex / claude-code / gemini-cli alike, so
  clone the provider with the user-selected model for any CLI that
  hasn't already baked a --model / -m flag into provider.args. Update
  the "honors override" computation to detect the baked-flag case via
  the same logic the runner uses, so the reported model still matches
  what'll actually run when the user-saved args win.
- mediaPromptRefiner.test.js: flip the legacy "ignores per-call override"
  case to assert the new override-honored behavior, plus add a new test
  that pins defaultModel when --model is baked into provider.args.
- client imageGenResolutions / ImageGenControls: convert remaining
  bare-string 'flux2' / 'z-image' / 'ernie' usages to RUNNER_FAMILIES
  constants so the client matches the server contract.

* chore(server): extract args-baked model id so reported model matches reality

Copilot follow-up: when a CLI provider has --model / -m baked into args,
buildCliArgs skips injection and the saved arg wins — defaultModel can
diverge, so reporting it on the run record misrepresents what actually
ran.

- runner.js: export hasModelFlag + add new extractBakedModel helper that
  parses the pinned model id out of args (separated and joined forms, both
  long and short flags). Use it from both refiners.
- mediaPromptRefiner / worldBuilderRefine: when cliHasBakedModelFlag is
  true, fall back to extractBakedModel(args) → defaultModel → models[0]
  for selectedModel so the response reflects the model the CLI will
  actually invoke.
- runner.test.js: unit tests for hasModelFlag (all 4 flag forms, non-array
  inputs) and extractBakedModel (extraction, null cases, first-wins).
- mediaPromptRefiner.test.js: extend the runner.js vi.mock with pure-ish
  copies of the new helpers so the refiner can run under the mock, plus
  flip the args-baked test to assert the extracted-from-args model is
  reported (not defaultModel) and add a joined-form (-m=X) variant.

* chore(server): address Copilot re-review on PR #222 (round 3)

- runner.hasModelFlag: only return true when a separated flag is followed
  by a real non-flag value, and skip empty joined forms (--model= / -m=).
  A bare '--model' at end of argv or followed by another flag would have
  made buildCliArgs skip injection AND made extractBakedModel return null,
  leaving the CLI to choke on an invalid argv. Now we leave injection on
  in that case so the broken argv gets a usable --model X appended.
- runner.test.js: extra coverage for the three "looks-like-flag-but-not-
  really" cases (end-of-argv, next-token-is-flag, empty joined).
- mediaPromptRefiner.test.js: keep the mock hasModelFlag in lockstep with
  the real impl so the new value-presence rule is reflected under the mock.
- fileUtils JSDoc: correct the requiredMessage example — gallery wrapper
  passes 'Invalid filename' (its pre-refactor impl threw that for every
  case), LoRA wrapper passes 'Filename required'.

* chore(server): address Copilot re-review on PR #222 (round 4)

- runner.buildCliArgs: sanitize provider.args before injecting our own
  --model flag. hasModelFlag intentionally returns false for dangling
  --model (no value) / --model= cases so the injection path fires; but
  if we kept the broken token in baseArgs, the spawned argv would end up
  with two --model occurrences (one broken, one valid) and the CLI would
  reject the invocation. New stripBrokenModelFlags helper drops only the
  dangling/empty forms; well-formed pins are preserved untouched.
- runner.test.js: 5 new cases covering the sanitizer (claude-code with
  bare --model, gemini-cli with -m followed by a flag, empty joined
  --model=, codex with bare --model, and the "well-formed pin survives"
  regression).
- objects.deepMerge: tolerate a missing/null/non-object base by treating
  it as `{}`. The function's own `base?.[k]` recursion guard already
  assumed missing bases were fine; the top-level spread didn't. Now
  `deepMerge(undefined, { a: 1 })` returns `{ a: 1 }` instead of
  throwing on spread.
- objects.test.js: regression test for null/undefined/non-object base
  values.
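
A sketch of a deepMerge that tolerates a missing base, as described in the deepMerge bullet. This is not the code in server/lib/objects.js; in particular, replacing arrays wholesale is an assumption.

```javascript
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

// A missing, null, or non-object base is treated as {} instead of throwing.
function deepMerge(base, override) {
  const out = isPlainObject(base) ? { ...base } : {};
  if (!isPlainObject(override)) return out;
  for (const [k, v] of Object.entries(override)) {
    // Nested objects merge recursively; other values (arrays included, by
    // assumption) replace whatever the base held.
    out[k] = isPlainObject(v) ? deepMerge(out[k], v) : v;
  }
  return out;
}
```

Here `deepMerge(undefined, { a: 1 })` yields `{ a: 1 }`, and neither input is mutated.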

* chore(server): address Copilot re-review on PR #222 (round 5)

- runner.extractBakedModel: reject a value that looks like another flag
  (starts with '-') in the separated form, matching hasModelFlag exactly.
  Without this, the two helpers would disagree on `['--model', '--other']`:
  hasModelFlag would say "no baked model" while extractBakedModel would
  extract '--other' as the model id and refiners would mis-report it.
- runner.test.js: new case asserting extractBakedModel returns null when
  the next token is another flag.
- mediaPromptRefiner.test.js: switch from a hand-maintained re-impl of
  hasModelFlag / extractBakedModel inside vi.mock to a partial mock via
  importOriginal — the refiner tests now exercise the REAL canonical
  helpers and can't drift when those helpers evolve.
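
One way to keep the two helpers from disagreeing is to derive one from the other. The sketch below does that; it is a design illustration under assumptions, not the code in runner.js, and it treats "first-wins" as "first well-formed pin".

```javascript
const MODEL_FLAGS = ['--model', '-m'];

// Returns the pinned model id, or null when no well-formed pin exists.
function extractBakedModel(args) {
  if (!Array.isArray(args)) return null;
  for (let i = 0; i < args.length; i += 1) {
    const arg = args[i];
    if (typeof arg !== 'string') continue;

    if (MODEL_FLAGS.includes(arg)) {
      // Separated form: the next token must be a real, non-flag value.
      const value = args[i + 1];
      if (typeof value === 'string' && value !== '' && !value.startsWith('-')) {
        return value;
      }
      continue;
    }
    for (const flag of MODEL_FLAGS) {
      // Joined form: --model=X / -m=X, skipping empty values.
      if (arg.startsWith(`${flag}=`)) {
        const value = arg.slice(flag.length + 1);
        if (value !== '') return value;
      }
    }
  }
  return null;
}

// Defined in terms of extractBakedModel, so the two cannot diverge.
function hasModelFlag(args) {
  return extractBakedModel(args) !== null;
}
```

For `['--model', '--other']` both helpers report no baked model, which is the agreement the round 5 fix is after.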
…unner sites + 3 JSON extractors (#221)

* refactor(server): unify promptRunner + jsonExtract — collapse 4 LLM-runner sites + 3 JSON extractors

- New server/lib/promptRunner.js#runPromptThroughProvider collapses
  worldBuilderExpand#callLLM, stageRunner#awaitRunnerCall,
  mediaPromptRefiner#runRefinePrompt, and messageEvaluator#runPrompt.
  Uses the strictest discriminator: rejects on success === false OR
  truthy error for BOTH CLI and API (was previously per-site drift —
  API soft-failures silently flowed through as empty text).

- New server/lib/jsonExtract.js promotes findBalancedBlocks +
  tryParseWithRepair + extractJson out of worldBuilderExpand. Three
  near-identical extractors collapse onto the shared lib with
  optional shapePredicate. Repairs covered: trailing commas, Codex
  '}}]' orphan-brace corruption, '[...]' placeholder elisions.
  mediaPromptRefiner keeps its tri-state placeholder-vs-real walker
  using findBalancedBlocks + tryParseWithRepair directly so the
  'schema placeholder' error message stays intact.

- 43 new unit tests (15 promptRunner + 28 jsonExtract). Full server
  pack stays green.

* address review: fix tryParseWithRepair null-vs-fail ambiguity + propagate concrete parse error

Copilot review on PR #221 flagged two issues:

1. extractJson treated parsed JSON null as a parse failure
   (parsed !== null && parsed !== undefined) — impossible to extract
   top-level null responses. tryParseWithRepair now returns a
   discriminated { value } | { error } object so null flows through
   as a real value.

2. extractJson surfaced a generic 'No matching JSON block found'
   error when JSON.parse actually failed — losing the concrete
   reason that callers like worldBuilderExpand surface in their
   ServerError context. The loop now captures the JSON.parse
   exception from tryParseWithRepair's wrapper and propagates it
   as lastError.

Updated three call sites that read the tryParseWithRepair return:
extractJson itself + mediaPromptRefiner.extractRefinementJson; also
updated 7 test assertions and added 2 regression tests
(top-level null + concrete-parse-error propagation). 4014 server
tests pass.
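
The discriminated return shape can be sketched as follows. This is an illustration only: `tryParse` and `extractFirst` are hypothetical stand-ins, and the real tryParseWithRepair also applies repairs before giving up:

```javascript
// Sketch: return { value } on success and { error } on failure, so a parsed
// JSON null is distinguishable from a parse failure.
function tryParse(text) {
  try {
    return { value: JSON.parse(text) };
  } catch (err) {
    return { error: err };
  }
}

// Hypothetical caller: walks candidate blocks and keeps the concrete parse error.
function extractFirst(candidates) {
  let lastError = new Error('No matching JSON block found');
  for (const candidate of candidates) {
    const result = tryParse(candidate);
    if ('value' in result) return result.value; // null is a legitimate value
    lastError = result.error; // preserve the JSON.parse reason for the caller
  }
  throw lastError;
}
```

Checking `'value' in result` rather than comparing against null is what lets a top-level `null` response flow through.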

* address review: gate CLI defaultModel clone to codex + fix test wording

Second Copilot review on PR #221 flagged:

1. promptRunner.js CLI branch always cloned provider with
   defaultModel: model, which lied to executeCliRun's onRunStarted
   hook for non-codex CLIs (claude-code, gemini-cli). runner.js#buildCliArgs
   only translates defaultModel into --model for codex; other CLIs
   ignore it. Now the clone is gated on provider.id === 'codex' so
   the run record stays truthful for all three CLI providers. (PLAN.md
   tracks extending buildCliArgs to honor per-call model for all CLIs.)

2. jsonExtract.test.js had test description '`}}]` → `}]`' but the
   actual repair is '`}}]` → `}]}`' (brace-count-preserving swap, not
   a drop). Updated description to match implementation.

Added a regression test asserting non-codex CLI providers are NOT
cloned with the model override. 4015 server tests pass.

* address review: string-aware JSON repair + clean effectiveModel resolution

jsonExtract: trailing-comma, Codex `}}]`, and `[...]` placeholder repairs
now only fire OUTSIDE quoted JSON string values. A string containing
`,}` or `}}]` or `[...]` as content used to be silently corrupted by
the regex repairs. Walks input with the same string/escape awareness as
findBalancedBlocks and splits into code/string segments before applying
the replace. 3 new tests cover the string-aware behavior.
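
A rough sketch of the segment-splitting idea, limited to the trailing-comma repair. The function names `splitSegments` and `repairTrailingCommas` are illustrative assumptions, not the names used in jsonExtract.js:

```javascript
// Split JSON-ish text into alternating code / string segments so regex
// repairs can be applied to the code segments only.
function splitSegments(text) {
  const segments = [];
  let start = 0;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') { i++; continue; } // skip the escaped character
      if (ch === '"') {
        segments.push({ kind: 'string', text: text.slice(start, i + 1) });
        start = i + 1;
        inString = false;
      }
    } else if (ch === '"') {
      if (i > start) segments.push({ kind: 'code', text: text.slice(start, i) });
      start = i;
      inString = true;
    }
  }
  if (start < text.length) {
    segments.push({ kind: inString ? 'string' : 'code', text: text.slice(start) });
  }
  return segments;
}

// Drop a comma that directly precedes a closing brace or bracket,
// but never touch the contents of a quoted string.
function repairTrailingCommas(text) {
  return splitSegments(text)
    .map((seg) => (seg.kind === 'code' ? seg.text.replace(/,(\s*[}\]])/g, '$1') : seg.text))
    .join('');
}
```

For input `{"a": ",}", "b": [1, 2,],}` the string value `",}"` is preserved while both structural trailing commas are removed.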

promptRunner: resolve `effectiveModel` once up-front so the run record
reflects which model actually executed. Adds `providerHonorsModelOverride`
predicate (API providers always honor; CLI providers honor only when
buildCliArgs supports per-call override, currently codex only). The
function now resolves with { text, runId, model } so the resolved model
is visible to the caller. Drops the dead 'clone defaultModel for non-codex
CLI' branch — for non-honoring providers effectiveModel === defaultModel
so the clone was always a no-op.

stageRunner / mediaPromptRefiner / messageEvaluator / worldBuilderExpand
updated for the new resolved-shape from promptRunner.

4018 server tests green.

* address review: align promptRunner docstring + drop redundant pre-gate in messageEvaluator

- promptRunner.js file-level doc now mentions the returned 'model' field
  alongside 'text' and 'runId', matching the actual JSDoc return type.
- messageEvaluator.runPrompt drops the redundant inline gating loop and
  the stale 'log line below' comment (there is no log line in this
  function). promptRunner already applies providerHonorsModelOverride
  internally, so passing 'model' through as-is is correct and the
  pre-gate was both dead and misleading. Removed the now-unused import.

* address review: propagate effective model + drop dead gemini-cli fallback + fix DONE.md

- messageEvaluator.runPrompt now returns {text, model} so triage/reply
  log lines report the actual effective model (post-gate) instead of
  echoing back a per-call model that promptRunner may have dropped for
  non-honoring CLI providers. Logs split into 'kicking off' (provider
  only) + 'ran on' (provider/effectiveModel) so the post-call line is
  honest about what executed.
- stageRunner: remove the gemini-cli lightModel fallback. Until
  buildCliArgs honors per-call model for all CLIs, the special case was
  immediately overridden by the providerHonorsModelOverride gate — pure
  dead code. Comment updated to note the fallback can come back once
  the gate widens (tracked in PLAN.md).
- DONE.md: correct the promptRunner entry to say it returns
  {text, runId, model} and bump test count 15 -> 16 to match the
  actual cases in promptRunner.test.js.

* address review: pick blockType by first JSON delimiter in stageRunner.extractJson

Previously extractJson always tried object-mode first, falling back to
array-mode. For an array-of-objects response like `[{"a":1},{"a":2}]`,
findBalancedBlocks with startChar='{' would lock onto the first inner
object `{"a":1}`, silently dropping the array wrapper — the array-mode
fallback never fired.

Now peeks at the first JSON delimiter (after stripping fences) and runs
the matching shape first, falling through to the other only if the
preferred walk yields no parseable block.

Added 3 regression tests: array-of-objects bare, array-of-objects inside
fence, and the still-works case where an object appears before an array
in prose.

* address review: pick earliest parseable block in stageRunner.extractJson

The previous fix used 'first JSON delimiter' to choose object-vs-array
mode, but a Codex CLI banner like '[workdir, /tmp]' precedes the real
JSON object and makes 'indexOf("[") < indexOf("{")' lie. Worse, when
arr-mode was chosen by that lying peek, jsonExtract could happily
return an inner array field of the object (e.g. {"a":[1,2]} → [1,2]).

Replaced the peek with an interleaved walk: collect balanced candidates
for BOTH '{' and '[' shapes, sort by source-text start position, parse
each in order, return the first that succeeds. Banner contents fail
to parse so they're skipped; the wrapping shape wins on earliest-start.

Added 2 regression tests:
  - 'OpenAI Codex CLI v2.1.0\\n[workdir, /tmp]\\n\\n{"a":1,"b":[2,3]}'
    parses as {a:1, b:[2,3]} (not [2,3] from the inner array field).
  - {"items":[1,2,3]} parses as the wrapping object.
Existing array-of-objects + leading-object-with-trailing-array cases
still pass under the new strategy.
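
The interleaved walk can be sketched like this. It is a simplified illustration under stated assumptions: `findBalanced` and `extractEarliestJson` are hypothetical names, no repair step is applied, and a stray unbalanced `"` in surrounding prose would confuse the string tracking:

```javascript
// Collect top-level balanced blocks for one bracket shape, ignoring
// brackets that appear inside double-quoted strings.
function findBalanced(text, open, close) {
  const blocks = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') i++; // skip the escaped character
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') { inString = true; continue; }
    if (ch === open) {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === close && depth > 0 && --depth === 0) {
      blocks.push({ start, text: text.slice(start, i + 1) });
    }
  }
  return blocks;
}

// Gather candidates for BOTH shapes, sort by start offset, and return the
// first one that parses. Unparseable banners are skipped.
function extractEarliestJson(text) {
  const candidates = [...findBalanced(text, '{', '}'), ...findBalanced(text, '[', ']')]
    .sort((a, b) => a.start - b.start);
  let lastError = new Error('No JSON block found');
  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate.text);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Because the wrapping shape always starts before anything nested inside it, sorting by start offset lets the wrapper win without a delimiter peek.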

* address review: validate promptRunner inputs + drop fragile first-fence grab + comment cleanups

- promptRunner: explicit argument validation at the top of
  runPromptThroughProvider — null/undefined provider, missing
  provider.id, bad provider.type, empty prompt/source now reject with
  clear messages instead of throwing a downstream TypeError on
  provider.id inside createRun. Existing 'Unsupported provider type:
  rpc' rejection wording preserved for backward-compat with the
  existing test. +3 new unit tests covering null provider, missing/
  empty provider.id, and missing/empty prompt/source (19 tests now).
- stageRunner.extractJson: drop the 'first fenced block' grab inside
  this helper. The previous heuristic locked onto fenced prompt-echo
  content from CLI runs (e.g. Codex) where the prompt itself may
  contain fenced JSON schema examples that appear BEFORE the model's
  response in the stream. Now strips only leading/trailing fences via
  stripCodeFences and lets findBalancedBlocks walk the full text —
  echoed schema examples either parse-and-match by source order or get
  skipped by tryParseWithRepair, but the real response is no longer
  silently bypassed. +1 regression test guarding the behavior.
- stageRunner.test.js: refresh the now-stale comment on the
  array-of-objects regression test so it describes the current
  earliest-parseable-block strategy instead of the obsolete
  'first-delimiter peek'.
- jsonExtract.js: fix the misleading 'greedy' wording on the inner
  fence match (the regex is non-greedy, '*?'); call out the
  prompt-echo failure mode explicitly so future callers know to
  prefer stageRunner.extractJson when echoes are likely.
- DONE.md: bump promptRunner test count 16 -> 19 to match the new
  validation tests.

* address review: tighten the stripCodeFences comment in stageRunner.extractJson

Previous comment said 'strip only when ENTIRE response is fenced', but
aiProvider.stripCodeFences strips leading and trailing fences
INDEPENDENTLY — a leading ```json without a closing fence still has
its opener removed. Comment now matches that actual behavior and
keeps the focus on the real reason this helper avoids the first-fence
heuristic (echoed prompt content on Codex CLI runs).

* address review: clarify providerForCli clone comment for the models[0]-fallback case

Previous comment said 'For non-codex CLIs effectiveModel === provider.defaultModel … the clone is a no-op'. That's true only when defaultModel is actually set. When defaultModel is unset but provider.models[0] is set, effectiveModel falls back to models[0], differs from the missing defaultModel, and the clone DOES fire (which is the desired behavior — it lets the run-started hook log a real model id instead of undefined). Comment now enumerates all three cases explicitly.

* address review: drop the contradictory extractJsonShared fallback in stageRunner.extractJson + fix jsonExtract test count

The previous version had a 'last resort' branch that called the shared
extractJson(text) — which is exactly the helper that grabs the first
inner ```…``` fenced block, the same prompt-echo failure mode this
implementation was rewritten to avoid. Removed the fallback (and the
now-unused extractJsonShared import) so the helper genuinely never
sees an echoed schema fence; surface the last candidate's parse
error so callers still get a concrete diagnostic instead of a
generic 'no JSON block found'.

Also bumped DONE.md's jsonExtract test count 28 -> 33 to match the
actual count of `it(...)` cases in server/lib/jsonExtract.test.js.

* address review: strip echoed prompt in stageRunner.extractJson to skip Codex CLI prompt-echo schema blocks

Codex CLI echoes stdin to stdout, so when a stage prompt contains a
fenced JSON schema example (e.g. data.sample/prompts/stages/pipeline-arc-overview.md),
both the schema and the model's actual response end up in the captured
text. The earliest-parseable-block strategy returned the schema —
propagating placeholder strings like 'string (the whole-series pitch…)'
into persisted data.

Fix: extractJson now accepts an optional { promptToStrip } that
removes every verbatim occurrence of the prompt from the input
before walking. runStagedLLM passes the prompt it just built so the
echo is silently dropped. Split-join is safer than regex (no escaping).

Updated the existing regression test to assert the new strip-first
behavior, and added a 'without promptToStrip' test that documents the
failure mode so future contributors who bypass the stripping path
know they have to provide it.
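
The split-join strip is small enough to sketch directly. `stripEchoedPrompt` is a hypothetical standalone name; in the commit the behavior sits behind extractJson's `{ promptToStrip }` option:

```javascript
// Remove every verbatim occurrence of the prompt from the captured output.
// split/join treats the prompt as a literal string, so braces, brackets, and
// parentheses in a schema example need no regex escaping.
function stripEchoedPrompt(text, promptToStrip) {
  if (!promptToStrip) return text;
  return text.split(promptToStrip).join('');
}
```

Usage, with an invented prompt and response for illustration: if `prompt` is `'Return JSON like {"pitch": "string (the pitch)"}'` and the captured output is `prompt + '\n{"pitch":"A heist in orbit"}'`, the stripped text contains only the model's real object.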
…odal chrome + sidecar field reads (#223)

* refactor(client): extract <Modal> + getRenderConfigForItem — dedupe modal chrome + sidecar field reads

Two more "Civitai LoRA / Z-Image-Turbo follow-ups" off PLAN.md:

(a) New client/src/components/ui/Modal.jsx (~110 LOC) owns the
fixed inset-0 z-50 bg-black/70 backdrop, Esc handler,
target-checked click-outside, panel-level stopPropagation (so a
Modal inside another Modal — refine inside lightbox — doesn't
dismiss the outer layer), ARIA, portal, size, and align flags.
Converted 9 hand-rolled call sites: Flux2InstallModal, EditAppModal,
MemoryEditModal, ResumeAgentModal, LayoutEditor, KeyboardHelp,
RapidReaderModal, DeployPanel, CivitaiAuthModal in Loras.jsx.
Per-site closeOnBackdrop / closeOnEsc / align / usePortal flags
preserve each modal's original dismiss semantics 1:1 — no visual
regressions. MediaLightbox stays standalone (prev/next viewport-edge
buttons + layered Esc cascade fight Modal's panel wrapper); inline
comment documents why.

(b) getRenderConfigForItem(item) added to
client/src/components/media/normalize.js. Lifts the sidecar
snake_case/camelCase fallback chain (cfgScale/cfg_scale,
loraFilenames/lora_filenames, loraScales/lora_scales,
guidanceScale/guidance_scale/guidance, disableAudio/disable_audio)
out of PromptRefineModal#getRenderConfig and into normalize so the
modal reads from item.* only. 11 unit tests in normalize.test.js
cover both naming conventions, the deliberate-0 guidanceScale
carve-out, and the camelCase-wins drift safety. Server's
vitest.config.js extended to include ../client/src/**/*.test.js
so client-side pure helpers stay covered.

Build green; full server suite still passes.

* fix(modal): stacked-Esc top-of-stack dispatch, backdrop propagation, ARIA, perf

Resolves Copilot review feedback on PR #223:

- Esc on stacked modals now fires only the top-most modal's onClose. Replaced
  per-instance window listeners with a single module-scope stack + global
  capture-phase keydown that stopImmediatePropagation()s after dispatching
  to the top. Fixes the "one Esc closes Flux2InstallModal AND LayoutEditor"
  bug + "PromptRefineModal Esc also dismisses MediaLightbox" bug.
- Backdrop click on a non-dismissible modal (closeOnBackdrop=false) now
  always stopPropagation()s so the click can't bubble up to an ancestor
  overlay and dismiss that.
- DeployPanel: gate <Modal> behind {isOpen && ...} so the streaming
  output.map() child tree doesn't re-render while the modal is dismissed.
- MemoryEditModal / ResumeAgentModal: wired ariaLabelledBy to the title
  heading id so screen readers announce a named dialog.
- Flux2InstallModal: pass closeOnEsc={false} so a stray Esc during a
  multi-GB torch download can't SIGTERM pip mid-stream — matches the
  file-level "X / backdrop only" contract.
- ALIGN_CLASSES comment: clarified that Tailwind utility precedence is
  decided by CSS source order, not class-attribute order; reordered the
  className join so caller backdropClassName appears last (intent
  documentation) and updated the surrounding comment to spell out the
  override mechanics (arbitrary values / !important / different align).

* fix(modal): onEsc prop + always-register-on-stack for non-closing top layers

Round 2 of Copilot review on PR #223:

- Add `onEsc` prop to Modal. When provided, Esc on the top-most modal
  invokes `onEsc` instead of `onClose`. Lets LayoutEditor route Esc to
  setMode('idle') during an inline rename/delete/switch mode while still
  closing the editor at the outer level.
- Always register an open Modal on the Esc stack, even with closeOnEsc=
  false and no onEsc. The top-most layer now always absorbs Esc so the
  keystroke can't fall through and dismiss an underlying modal.
- KeyboardHelp: drop closeOnEsc={false}. Modal's capture-phase dispatcher
  is the canonical handler now; useKeyboardHelp's document bubble-phase
  Esc listener is preempted by Modal anyway.
- RapidReader comment: fix the "bubble-phase Esc" claim — Modal's
  listener is capture-phase. Both Modal and RapidReader register
  capture-phase window listeners; Modal's wins by registration order
  (mounted first), and the inner reader's Esc branch is dormant when
  wrapped in RapidReaderModal.

* fix(modal): bubble-phase Esc + honour defaultPrevented; DeployPanel onEsc

Round 3 of Copilot review on PR #223:

- Modal: switch the global Esc keydown listener from capture- to bubble-
  phase, and skip dispatch when event.defaultPrevented is true. Lets an
  inner widget that legitimately owns Esc (native <select> closing its
  dropdown, custom popovers/menus) consume the keystroke first by
  calling preventDefault(); only un-consumed Esc closes the modal.
- DeployPanel: wire onEsc={() => setDismissed(true)} to preserve the
  pre-refactor "Esc hides but never clears" semantics. The shared
  Modal's default Esc -> onClose path would have routed through
  handleClose -> clearDeploy() after a finished deploy and lost the
  output stream; explicit onEsc keeps state alive.

* fix(modal): swallow Esc even when defaultPrevented; fix stale phase comments

Round 4 of Copilot review on PR #223:

- Modal: hoist `stopImmediatePropagation()` above the `defaultPrevented`
  check. Previously a child widget calling preventDefault() on Esc would
  let the keystroke reach window-level handlers below us (an underlying
  MediaLightbox / voice widget). Top-most modal now always absorbs Esc;
  defaultPrevented only suppresses the close dispatch, not the propagation
  block.
- LayoutEditor / KeyboardHelp / RapidReader: comments updated to reflect
  Modal's bubble-phase listener (was capture in round 2, switched to
  bubble in round 3). RapidReader's capture-phase listener now fires
  first; KeyboardHelp's document-bubble listener still runs alongside
  Modal's window-bubble listener (both call setOpen(false), harmless).
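
The dispatch order that rounds 2 through 4 converge on can be sketched as a plain function. This is an approximation, not the Modal.jsx source: the stack entry shape is assumed, and it is also an assumption that `onEsc` runs regardless of `closeOnEsc`:

```javascript
// Module-scope stack; the top-most open modal is the last entry.
const modalStack = [];

function handleGlobalEsc(event) {
  if (event.key !== 'Escape' || modalStack.length === 0) return;
  // Always absorb the keystroke so later-registered listeners never see it.
  event.stopImmediatePropagation();
  // A child widget that already consumed Esc suppresses only the close dispatch.
  if (event.defaultPrevented) return;
  const top = modalStack[modalStack.length - 1];
  if (top.onEsc) top.onEsc();
  else if (top.closeOnEsc) top.onClose();
}
```

In a browser this handler would be attached with `addEventListener('keydown', handleGlobalEsc)`; keeping it a pure function of the event and the stack makes the ordering rules easy to unit test.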

* docs(rapidreader): clarify RapidReader owns Esc inside RapidReaderModal

Round 5 of Copilot review on PR #223:

Round 4's comment claimed Modal's listener still fires and "calls onClose
again", but RapidReader's capture-phase handler calls
stopImmediatePropagation() — Modal's bubble-phase listener never even
receives the event. Even if it somehow did, defaultPrevented (set by the
reader) would suppress the close dispatch. Update the comment to reflect
that RapidReader unilaterally owns Esc inside the wrapper.

* fix(modal): listener at module init + ref-based handlers + align='none'

Round 6 of Copilot review on PR #223:

- Modal: register the global window keydown listener at module init
  (not lazily on first push). `stopImmediatePropagation` only blocks
  listeners registered AFTER this one on the same target, so install
  order matters. Modal is imported very early through the layout shell,
  so this gives reliable precedence over MediaLightbox / VoiceWidget /
  CityFilterBar window listeners that mount only when their pages open.
- Modal: split stack push/pop from handler-identity tracking. Handlers
  now live behind refs (onCloseRef / onEscRef / closeOnEscRef), updated
  on every render. The stack-registration effect depends only on `open`
  toggling. Prevents the bug where DeployPanel's per-render onClose
  identity churned the effect, popping + re-pushing the modal and
  shuffling stack order when multiple modals were open.
- Modal: new `align='none'` variant — same flex centring as 'center'
  but without the default `p-4`. For callers whose pre-refactor overlay
  was edge-to-edge.
- ResumeAgentModal: switch to `align='none'`. Pre-refactor overlay was
  `fixed inset-0 ... flex items-center justify-center` with no `p-*`;
  preserves that so the panel reaches viewport edges on small screens.

* fix(media): pickLoraFilenames falls back to legacy loraPaths

Round 7 of Copilot review on PR #223:

Legacy image sidecars (pre-refactor) persisted absolute `loraPaths`
instead of basenames in `loraFilenames`. normalizeImage already handles
this for the card display, but getRenderConfigForItem only read the new
field — so re-queueing a legacy render through <PromptRefineModal>
would drop the LoRA config silently.

Add `pickLoraFilenames(raw)` that tries (in order):
  1. raw.loraFilenames
  2. raw.lora_filenames
  3. raw.loraPaths    → reduce to basenames
  4. raw.lora_paths   → same

Two new tests in normalize.test.js cover the legacy `loraPaths` and
`lora_paths` paths. 3982 server tests green (was 3980).
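
A sketch of the fallback chain listed above. The local `basename` helper, the Windows separator handling, and the empty-array default are assumptions made for illustration:

```javascript
// Reduce a POSIX or Windows style path to its final segment.
function basename(path) {
  return String(path).split(/[\\/]/).pop();
}

// Try the four sidecar spellings in order; legacy path fields are reduced
// to basenames so they match the newer loraFilenames format.
function pickLoraFilenames(raw) {
  if (Array.isArray(raw.loraFilenames)) return raw.loraFilenames;
  if (Array.isArray(raw.lora_filenames)) return raw.lora_filenames;
  if (Array.isArray(raw.loraPaths)) return raw.loraPaths.map(basename);
  if (Array.isArray(raw.lora_paths)) return raw.lora_paths.map(basename);
  return []; // assumption: no LoRA config means an empty list
}
```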

* fix(modal): HMR-safe Esc listener; docs cleanups

Round 8 of Copilot review on PR #223:

- Modal: guard the module-load addEventListener against duplicate
  install on Vite HMR. Uses a `globalThis.__portos_modal_esc_installed`
  flag and an `import.meta.hot.dispose` teardown so hot-edits to this
  file (or its importers) re-install cleanly against a fresh
  modalStack/escHandlers instead of stacking stale listeners.
- Modal: drop the inaccurate "imported early by App.jsx" claim. App.jsx
  doesn't import Modal directly — components that use it import it
  themselves. Acknowledge that install-order is consistent in practice
  because Modal is transitively pulled in by enough always-mounted
  shells, not by a deliberate early-import.
- normalize.js: reflow the `pickLoraFilenames` + `getRenderConfigForItem`
  comment block — the previous insertion left a mid-sentence orphan
  ("callers may override it...") and split a function-level comment
  across two unrelated blocks. Now reads as two clearly scoped
  comments: one above the helper, one above the exported function.

* fix(modal): install Esc listener on document too to beat earlier-bound handlers

Round 9 of Copilot review on PR #223:

Bubble-phase event order on a keystroke is: target → ... → document →
window. A document-level keydown handler installed by another component
(useKeyboardHelp, AddToCollectionMenu's click-away effect, etc.) would
fire BEFORE our window-level listener and could react to Esc while a
modal is open, dismissing the wrong layer.

Install handleGlobalEsc at both window AND document. Calling
stopImmediatePropagation at either target halts all subsequent listeners
on that keystroke, so we still only dispatch onClose once. The HMR dispose
hook tears down both listeners.

* address review: useId for Modal id + drop redundant `open` hard-code

- Modal.jsx: replace module-scope `modalIdSeq++` counter (mutated at
  render time as a side effect) with React's `useId()`. StrictMode-safe
  and avoids burning ids on extra renders that never reach an effect.
- PromptRefineModal.jsx: pass `<Modal open={open} ...>` instead of
  hard-coded `<Modal open ...>`. The component already guards
  `if (!open || !item) return null;` so the prop forwarding is the
  honest representation — and keeps the guard's intent obvious if a
  future refactor moves it.
- KeyboardHelp.jsx comment thread is now stale (current comment
  already says useKeyboardHelp's listener is suppressed via Modal's
  `stopImmediatePropagation`); resolving without code change.

Client build + 4121 server tests green.
#220)

* feat(voice): ui_read + destructive-confirm gate + proactive CoS speech

- ui_read tool returns visible page text (8 KB cap, word-boundary trim)
  by piggy-backing on the existing voice:ui:index push.
- Destructive-action confirmation gate: ui_click on Delete/Remove/
  Discard/Reset/Clear stashes a pending record; next utterance resolves
  via affirmative/negative/passthrough state machine in confirmGate.js.
  Pending entries expire after 60s.
- Proactive CoS speech: server-pushed voice:speak event over Socket.IO
  with quiet-hours suppression (local-timezone HH:MM window) and an
  enabled-flag. POST /api/voice/speak exposes the path to internal
  callers. Quiet hours + barge-in are pure-function decidable for tests.
- Settings → Voice gains proactive enable + quiet-hours window controls.
- +91 server tests across confirmGate.test.js, proactiveSpeech.test.js,
  tools.test.js (ui_read + ui_click gate), voice.test.js (/speak route).
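
A quiet-hours check over an HH:MM window can be written as a pure function, which is what makes it easy to test. The sketch below is an assumption about the shape of that logic: the names are hypothetical, and whether the real implementation lets a window wrap past midnight is not stated in this log:

```javascript
// Convert "HH:MM" to minutes since local midnight.
function parseHHMM(value) {
  const [h, m] = value.split(':').map(Number);
  return h * 60 + m;
}

// nowMinutes is minutes since local midnight, supplied by the caller so the
// function stays pure.
function isWithinQuietHours(nowMinutes, start, end) {
  const s = parseHHMM(start);
  const e = parseHHMM(end);
  if (s === e) return false; // assumption: a zero-length window never suppresses
  return s < e
    ? nowMinutes >= s && nowMinutes < e // same-day window
    : nowMinutes >= s || nowMinutes < e; // window wraps past midnight
}
```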

* fix(voice): address Copilot review on PR #220

- pipeline.js: speakSyntheticReply now calls rememberTtsSentence() before
  emitting voice:tts:audio so "Confirmed — …" / "Cancelled." get added to
  the echo-suppression buffer. Previously the synthetic reply path could
  round-trip through the mic and be misclassified as user intent.

- confirmGate.js: buildPending() now accepts an injected createdAt for
  deterministic tests; Date.now() remains the call-site default. Module
  header softened — clock is injectable rather than absent. Added a unit
  test for the injected-clock path.

- sockets/voice.js: voice:ui:index text cap aligned to 8000 chars
  (was 8192) so it matches the client-side MAX_TEXT_CHARS in domIndex.js
  end-to-end. Added word-boundary truncation (extracted as
  truncateOnWordBoundary, exported for testing) so the ~8 KB cap on
  ui_read is enforced identically whether truncation happens client-
  side (well-behaved widget) or server-side (runaway/malicious client).
  Added focused unit tests for the helper.

- domIndex.js: updated the MAX_TEXT_CHARS comment to point at the
  matching server-side cap instead of the stale ui_read.execute
  reference.
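
One way to write a word-boundary truncation helper is sketched below. It borrows the `truncateOnWordBoundary` name from the commit, but the body is an illustration rather than the code in sockets/voice.js, and the hard-cut fallback for text with no whitespace is an assumption:

```javascript
function truncateOnWordBoundary(text, maxChars) {
  if (text.length <= maxChars) return text;
  const head = text.slice(0, maxChars);
  // If the cut lands exactly between words, nothing is split.
  if (/\s/.test(text[maxChars])) return head.trimEnd();
  // Otherwise back up to the last whitespace character inside the head.
  const lastBreak = head.search(/\s\S*$/);
  // No whitespace at all: fall back to a hard cut.
  return lastBreak === -1 ? head : head.slice(0, lastBreak).trimEnd();
}
```

Running the same helper on both client and server is what keeps the 8000-character cap consistent end to end.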

* fix(voice): address second-round Copilot review

- pipeline.js confirmation-gate execute path: drop the stale ref from the
  re-issued voice:ui:click and emit label-only. The original ref was
  captured one turn earlier and may now point at a different element
  after a DOM re-index; uiInteract.resolve() prefers ref over label so
  the post-confirmation click could fire on the wrong control.

- sockets/voice.js: clear state.pendingDestructive on voice:interrupt
  and voice:reset so the user's next "yes" can't consume a stale,
  abandoned destructive confirmation gate after the conversation has
  been thrown away.

- confirmGate.js: tighten resolvePending so utterances like "okay cancel"
  / "ok no" / "okay never mind" are classified as cancellations instead
  of being eaten by AFFIRM_RE's bare "ok/okay" branch. NEGATIVE_RE now
  tolerates a leading "ok/okay" filler, and the resolvePending order is
  flipped (negative beats affirmative) as a defense-in-depth guard.
  Added regression tests for the filler-prefixed cancel inputs.

- sockets/voice.js: rename MAX (elements cap) → MAX_ELEMENTS and rewrite
  the surrounding comment so the 200-element cap and the 8 KB
  MAX_UI_TEXT_CHARS text cap are no longer described in the same
  paragraph (the prior comment was misleading post-rebase).

* fix(voice): address third-round Copilot review

- confirmGate.js: anchor AFFIRM_RE / NEGATIVE_RE at both ^ AND $ so only
  short stand-alone yes/no utterances trigger the gate. Sentences that
  START with a yes/no token but keep going ("cancel the meeting", "stop
  the music", "yes I want to go to lunch") now passthrough instead of
  being short-circuited with a synthetic "Cancelled."/"Confirmed." reply
  that drops the rest of the user request. Matches the documented
  passthrough-on-ambiguous contract. Added regression tests.

- routes/voice.js: rename MAX_TEST_TEXT_LEN → MAX_VOICE_TEXT_LEN. The
  constant is shared by both /test and /speak; the old name implied it
  was /test-specific, which made it easy to forget the coupling and
  silently change one endpoint's contract by editing the other's
  limit. Updated the comment to spell out the shared-cap intent.
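
The fully anchored matching from the confirmGate.js change can be sketched as follows. The vocabularies are hypothetical placeholders, since the real AFFIRM_RE / NEGATIVE_RE word lists are not shown in this log:

```javascript
// Hypothetical vocabularies; anchored at both ^ and $ so only short
// stand-alone utterances resolve the pending confirmation.
const AFFIRM_RE = /^(yes|yeah|yep|ok|okay|confirm|do it)$/;
// Tolerates a leading "ok"/"okay" filler before the cancel word.
const NEGATIVE_RE = /^(?:(?:ok|okay)\s+)?(no|nope|cancel|stop|never mind)$/;

function resolvePending(utterance) {
  const text = utterance.trim().toLowerCase();
  if (NEGATIVE_RE.test(text)) return 'cancel'; // negative beats affirmative
  if (AFFIRM_RE.test(text)) return 'confirm';
  return 'passthrough'; // longer sentences go to the LLM untouched
}
```

Under these assumed word lists, "okay cancel" resolves to a cancellation, while "cancel the meeting" passes through because the trailing words break the `$` anchor.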

* fix(voice): enforce destructive-confirm gate server-side; clarify confirmGate doc

Address Copilot review on PR #220:

- pipeline.js: when a tool result returns confirmation_required:true,
  break out of the tool-call loop AND the LLM-iteration loop in the same
  turn, then speak the deterministic summary via speakSyntheticReply.
  Previously the gate relied on system-prompt instructions to keep the
  model from issuing further tool calls — brittle, because the model
  could overwrite state.pendingDestructive or fire unrelated side effects
  in the same iteration. Now enforced server-side.
- confirmGate.js: the header comment claimed negative input falls
  through to the LLM; in fact pipeline emits a synthetic 'Cancelled.'
  reply and ends the turn. Comment updated to match the implementation
  contract (negative → cancel acknowledgement; ambiguous → passthrough).
- Add pipelineConfirmShortCircuit.test.js asserting: (a) streamChat
  invoked exactly once, (b) sibling tool calls in the same iteration are
  skipped, (c) deterministic prompt is spoken, plus a regression guard
  for the non-destructive path.

* fix(voice): handle quote/punctuation either order; don't gate proactive speech on rejectingTts

Address Copilot review on PR #220:

- confirmGate.js normalize(): single-pass strip removed surrounding quotes
  before trailing punctuation, so '"confirm".' (punctuation outside the
  closing quote) became 'confirm"' and failed AFFIRM_RE — silently
  discarding a pending destructive confirmation. Now strips in both
  orders by running the pair twice. Added regression tests for both
  quote/punctuation orderings on affirmative and negative inputs.
- voiceClient.js voice:speak handler: removed the rejectingTts gate.
  rejectingTts is sticky after a stopPlayback()/interrupt until the next
  voice:transcript, but a proactive alert is its own event and shouldn't
  be dropped because the user recently interrupted a turn — that's
  exactly when proactive nudges (alerts/reminders/briefings) are most
  useful. Also reworded the surrounding comment to reflect that
  VoiceWidget DOES append proactive speech to its history (the prior
  comment claimed the opposite).

* fix(voice): wire data-voice-widget marker; trim /speak text at HTTP boundary

Address Copilot review on PR #220:

- VoiceWidget.jsx: add the data-voice-widget attribute to the widget
  root. domIndex.js TEXT_EXCLUDE_SELECTORS already referenced
  '[data-voice-widget]' to keep the widget's own conversation transcript
  out of ui_read snapshots, but no element actually carried the
  attribute — the voice agent would have ended up reading its own
  dialog back to the user. Added an inline comment explaining the
  contract so the marker isn't accidentally removed.
- routes/voice.js POST /api/voice/speak: change schema from
  z.string().min(1) to z.string().trim().min(1) so whitespace-only
  payloads ('   ', '\t', '\n') fail validation with a 400 at the HTTP
  boundary instead of falling through to speakProactive's empty-text
  branch that returns 200 { ok:false, reason:'empty' }. Matches the
  behavior of /api/voice/test. Added regression tests covering plain
  whitespace, tabs, newlines, and mixed whitespace.

* fix(voice): strip commas in confirm gate; cap proactive length; clarify ui_read description

Address Copilot review on PR #220:

- confirmGate.js normalize(): strip commas/colons/semicolons in addition
  to . ! ? — Whisper STT frequently emits a trailing comma after a
  yes/no when the user takes a beat ('yes, …'), which previously fell
  through to passthrough and silently discarded the pending destructive
  action. Added regression tests for 'yes,', 'no,', 'cancel;', etc on
  both isAffirmative and isNegative.
- proactiveSpeech.js: add MAX_PROACTIVE_TEXT_LEN (4000) cap at the
  function boundary. The /api/voice/speak Zod schema already enforces
  this at the HTTP layer, but speakProactive is also called directly by
  internal subsystems (CoS, scheduler) that bypass route validation.
  Defense in depth: a runaway caller can't trigger multi-minute
  synthesis or a multi-megabyte socket payload. Returns
  { ok:false, reason:'too-long', chars, maxChars }.
- tools.js ui_read description: previously said 'read verbatim — do NOT
  summarize' unconditionally while the summarize=true parameter
  explicitly allowed summarization. Reworded to condition the verbatim
  rule on summarize=false so the LLM-facing prompt is internally
  consistent.
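
Taken together with the earlier quote-ordering fix, the normalization step might look like the sketch below. The exact character sets and the lowercase step are assumptions; only the two-pass idea and the widened punctuation set come from the commit messages:

```javascript
// Strip surrounding quotes and trailing punctuation. Running the pair twice
// handles both '"confirm."' and '"confirm".' orderings.
function normalize(utterance) {
  let text = utterance.trim().toLowerCase();
  for (let pass = 0; pass < 2; pass++) {
    text = text.replace(/^["'\u201C\u201D\u2018\u2019]+|["'\u201C\u201D\u2018\u2019]+$/g, '');
    text = text.replace(/[.!?,;:]+$/, '').trim();
  }
  return text;
}
```

On the first pass `"confirm".` loses its leading quote and trailing period, leaving `confirm"`; the second pass removes the remaining quote.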

* fix(voice): exclude [hidden] from ui_read; share MAX cap; lazy local time

Address Copilot review on PR #220:

- domIndex.js TEXT_EXCLUDE_SELECTORS: add [hidden] alongside
  [aria-hidden=true]. The previous list relied on TEXT_BLOCK_SELECTORS
  (main/dialog) visibility check, but tab-panel libraries commonly hide
  inactive panels with the HTML 'hidden' attribute on a child element
  inside main — so ui_read would dump inactive-tab content into the
  visible-text snapshot and the agent would 'read the page' off into a
  panel the user couldn't see.
- routes/voice.js: import MAX_PROACTIVE_TEXT_LEN from proactiveSpeech.js
  and alias it as MAX_VOICE_TEXT_LEN. Previously both were independently
  defined as 4000 and could silently drift. Now there's one source of
  truth; the route still owns its own variable name for clarity but the
  value moves in lockstep.
- proactiveSpeech.js speakProactive: only call getLocalMinutes() (which
  reads settings via getUserTimezone() + runs an Intl call) when
  cfg.llm.proactive.quietHours.enabled. When quiet hours are off, the
  decision can't depend on local time, so the async work was pure
  overhead and an avoidable failure surface. Added tests asserting
  getUserTimezone/getLocalParts are NOT called when quiet hours are
  off and ARE called when enabled.

* fix(voice): address round-4 Copilot review

- domIndex.js: combine TEXT_EXCLUDE_SELECTORS into a single querySelectorAll
  pass (was N passes per snapshot), and truncate on any whitespace (\s) so
  '\n\n' block joins don't force mid-token hard-cuts.
- server/sockets/voice.js: same whitespace-boundary fix for the end-to-end
  cap, keeping client + server truncation behavior identical.
- routes/voice.js: empty/whitespace 'source' now falls through to
  speakProactive's 'cos' default (silent drop) instead of overriding it
  with ''; missing 'io' now surfaces as 500 VOICE_IO_UNAVAILABLE so
  monitoring catches the misconfiguration rather than masking it behind
  a 200 { ok:false, reason:'no-io' }.
- routes/voice.test.js: attach a noop errorEvents listener so Node's
  EventEmitter doesn't throw when asyncHandler emits during the
  validation-error tests now that the stub io exposes that path.
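
A whitespace-boundary truncation of the kind described above might look like the sketch below. The helper name `truncateAtWhitespace` is hypothetical; the actual functions in domIndex.js and server/sockets/voice.js are not shown in this PR text.

```javascript
// Hypothetical helper: cut at the last whitespace of any kind (space, tab,
// newline) so '\n\n' block joins count as valid cut points.
function truncateAtWhitespace(text, maxLen) {
  if (text.length <= maxLen) return text;
  const head = text.slice(0, maxLen);
  // The cut falls exactly between tokens: the head is already clean.
  if (/\s/.test(text[maxLen])) return head.trimEnd();
  // Otherwise drop the trailing partial token and the whitespace before it.
  const match = /\s+\S*$/.exec(head);
  // No whitespace anywhere in the head: a hard cut is the only option.
  return match ? head.slice(0, match.index) : head;
}
```

With a space-only boundary, `'alpha\n\nbeta'` capped at 8 characters would be hard-cut inside `beta`; matching on `\s` lets the cut land after `alpha` instead.
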
- bottom-sheet drawer on mobile fullscreen (max-h-[55vh]), keeps image + chevrons reachable
- chevrons anchor to bottom-4 in fullscreen so they land in the letterbox of landscape images
- visible chevron pills (bg-black/40 + backdrop-blur) replace text-white/40 invisible-on-bright-content glyphs
- swipe threshold loosened: dx > dy×1.2 (was ×1.5), SWIPE_MIN_PX=50 (was 60)
- image-area touch handlers skip events originating in a button (fixes max/min tap leaking into drawer-toggle)
- drawerOpen auto-clears when fullScreen exits, so the stuck-state UI flash can't recur
- new MediaPreview wrapper owns close/nav/annotation boilerplate for all four gallery pages (Image Gen, Media History, Media Collection Detail, Video Gen)
- MediaCollectionDetail gains prev/next nav + swipe (it was missing previewNavProps entirely)
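
The loosened swipe rule above (dx > dy×1.2 with SWIPE_MIN_PX=50) can be illustrated with a small classifier. This is a sketch under assumptions: the function name `classifySwipe`, the `SWIPE_AXIS_RATIO` constant name, the inclusive minimum-distance comparison, and the left-means-next mapping are not taken from the source.

```javascript
const SWIPE_MIN_PX = 50;      // value from the commit message (was 60)
const SWIPE_AXIS_RATIO = 1.2; // value from the commit message (was 1.5); name assumed

// Returns 'next', 'prev', or null. The direction mapping is an assumption.
function classifySwipe(startX, startY, endX, endY) {
  const dx = endX - startX;
  const dy = endY - startY;
  const absX = Math.abs(dx);
  const absY = Math.abs(dy);
  // Too short to count as a deliberate swipe.
  if (absX < SWIPE_MIN_PX) return null;
  // Horizontal travel must dominate vertical travel by the ratio.
  if (absX <= absY * SWIPE_AXIS_RATIO) return null;
  return dx < 0 ? 'next' : 'prev';
}
```

A gesture of 65px across and 50px down passes at ×1.2 (65 > 60) but would have been rejected at ×1.5 (65 < 75), which is the kind of slightly diagonal swipe the looser threshold admits.
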
In fullscreen on mobile, tapping the image opened a bottom-sheet
settings drawer that covered the image area. The drawer was absolutely
positioned (`bottom-0 max-h-[55vh]`) while the image stayed centered
in its flex-1 container, so the image appeared to vanish with no
clear way back.

Remove the tap branch and the now-unreachable drawerOpen state, its
two cleanup effects, the Esc-cascade step, the chevron-position
drawer branch, the SettingsPane fullscreen aside-class branch, the
conditional aria-label, and the "tap for settings" hint. Fullscreen
on mobile is now just the image — minimize via the top-right pill to
reach settings. Swipe-to-navigate preserved.

Simplify pass: collapse SettingsPane's onClose+onPrimaryClose to a
single onClose, move `cleaning` state into SettingsPane (its only
consumer), and replace inline body-overflow stash with the shared
useScrollLock hook (ref-counts, so lightbox+refine-modal don't
clobber each other's saved value).
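
The ref-counting idea behind the shared scroll lock can be sketched without React. This is not the useScrollLock hook itself, which is not shown here; `createScrollLock` and its `acquire` / `release` methods are hypothetical names, and a plain object stands in for `document.body.style`.

```javascript
// Hypothetical sketch of a ref-counted overflow lock.
function createScrollLock(style) {
  let holders = 0;
  let savedOverflow = '';
  return {
    acquire() {
      // Only the first holder stashes the original value.
      if (holders === 0) {
        savedOverflow = style.overflow;
        style.overflow = 'hidden';
      }
      holders += 1;
    },
    release() {
      if (holders === 0) return; // unbalanced release is ignored
      holders -= 1;
      // Only the last holder restores it.
      if (holders === 0) style.overflow = savedOverflow;
    },
  };
}
```

Because the stash happens once, a lightbox and a refine modal opened on top of it never overwrite the saved value with `'hidden'`.
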
Snapshots now write to <destPath>/snapshots/<hostname>/<snapshotId>
so a shared iCloud destination can host backups from multiple
machines without snapshot-ID collisions. listSnapshots and
restoreSnapshot scope to the current machine, with the path-
traversal guard re-rooted to match.
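
A rough sketch of the per-machine layout and the re-rooted guard, assuming Node's built-in `os` and `path` modules. The function name `snapshotDir` and the exact rejection rules are illustrative, not the backup service's actual code.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: resolves <destPath>/snapshots/<hostname>/<snapshotId>
// and rejects ids that would escape the per-machine root.
function snapshotDir(destPath, snapshotId, hostname = os.hostname()) {
  const machineRoot = path.resolve(destPath, 'snapshots', hostname);
  const target = path.resolve(machineRoot, snapshotId);
  const rel = path.relative(machineRoot, target);
  const escapes =
    rel === '' ||                        // id resolves to the root itself
    rel === '..' ||
    rel.startsWith('..' + path.sep) ||   // climbs out of the machine root
    path.isAbsolute(rel);                // different drive on Windows
  if (escapes) throw new Error(`Invalid snapshot id: ${snapshotId}`);
  return target;
}
```

Rooting the check at the hostname directory, rather than at `snapshots/`, means an id such as `../other-host/x` is rejected even though it stays inside the shared destination.
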
…yleNotes)

Worlds now carry logline / premise / styleNotes alongside their image-prompt
template. The LLM Expand pass produces them in the same call, the World Builder
editor surfaces a "Story bible" section, and the Pipeline → New Series form
gains a world dropdown that auto-fills those three fields from the picked world
(only into empty form fields, so user typing isn't clobbered). worldId +
premise + styleNotes ride the create-series payload alongside logline.

Caps shared via WORLD_LOGLINE_MAX / WORLD_PREMISE_MAX / WORLD_STYLE_NOTES_MAX
exports from apiWorldBuilder.js so client maxLength inputs stay in lockstep
with the server sanitizer.
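
The "only into empty form fields" rule can be sketched as a pure function. The name `applyWorldToForm`, the `BIBLE_FIELDS` constant, and the assumption that a world exposes its id as `world.id` are illustrative; only the three field names and `worldId` come from the description above.

```javascript
const BIBLE_FIELDS = ['logline', 'premise', 'styleNotes'];

// Hypothetical sketch: copy bible fields from the picked world, but never
// overwrite something the user has already typed.
function applyWorldToForm(form, world) {
  const next = { ...form, worldId: world.id };
  for (const field of BIBLE_FIELDS) {
    const current = (form[field] ?? '').trim();
    if (current === '' && typeof world[field] === 'string') {
      next[field] = world[field];
    }
  }
  return next;
}
```

Treating whitespace-only input as empty is a choice made for this sketch; the real form may compare differently.
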
normalize.js: normalizeImage/normalizeVideo now reuse pickLoraFilenames
so loraNames surfaces for snake_case-only sidecars (lora_filenames /
lora_paths). Chips and search were silently empty for Python-written
records even though the requeue path through getRenderConfigForItem
resolved them. +3 tests.

MemoryEditModal / ResumeAgentModal: aria-label on the icon-only X
close buttons (icons marked aria-hidden so the button has one name).

KeyboardHelp: refresh stale Esc-stack comment now that Modal listens
on both document and window.

data/migrations/002-cd-evaluate-image-strength.js: backfill the
imageStrength surfacing block in cd-evaluate.md for installs that
predate the data.sample/ change. setup-data.js only seeds new files,
so existing data/ copies stayed stale and three creativeDirectorPrompts
tests stayed red. The migration surgically inserts both pieces only
when surrounding anchors match (hand-edited templates skipped with a
notice), and is idempotent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…— args-baked CLI models now win over defaultModel (#225)

* refactor(server): factor model resolution into resolveEffectiveModel — args-baked CLI models now win over defaultModel

Four callers (stageRunner, worldBuilderExpand, worldBuilderRefine, and
promptRunner itself) fell through to provider.defaultModel || models[0]
on the non-honoring CLI branch — but the actual model that runs for a
CLI provider with a baked --model/-m flag in provider.args is the
ARGS-PINNED id, which can diverge from defaultModel. Result: the run
record, the pre-call log line, and the returned llm.model field could
all lie when args win.

Factored mediaPromptRefiner's existing baked-model resolution into a
shared resolveEffectiveModel(provider, callerModel) in
server/lib/promptRunner.js, exported alongside
providerHonorsModelOverride. It consults hasModelFlag +
extractBakedModel from services/runner.js for the args-baked CLI case
and returns the args-pinned id; otherwise it follows the existing
override→defaultModel→models[0] chain.
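
The resolution order described above can be sketched as follows. The three helper bodies are stand-ins written for this example: the real `hasModelFlag` / `extractBakedModel` in services/runner.js and `providerHonorsModelOverride` are not shown here, and the `provider.type` field is an assumption.

```javascript
// Stand-ins for helpers whose real implementations are not shown.
function hasModelFlag(args = []) {
  return args.some((a) => a === '--model' || a === '-m' || a.startsWith('--model='));
}
function extractBakedModel(args = []) {
  for (let i = 0; i < args.length; i++) {
    if (args[i].startsWith('--model=')) return args[i].slice('--model='.length) || null;
    if ((args[i] === '--model' || args[i] === '-m') && args[i + 1]) return args[i + 1];
  }
  return null;
}
// Placeholder rule (assumed): only API providers honor a caller override.
const providerHonorsModelOverride = (provider) => provider.type === 'api';

function resolveEffectiveModel(provider, callerModel) {
  // Args-baked CLI case: the pinned id is what actually runs, so report it.
  if (provider.type === 'cli' && hasModelFlag(provider.args)) {
    const baked = extractBakedModel(provider.args);
    if (baked) return baked;
  }
  // Otherwise: override -> defaultModel -> models[0].
  const override = providerHonorsModelOverride(provider) ? callerModel : undefined;
  return override || provider.defaultModel || provider.models?.[0] || null;
}
```

The point of the ordering is that the run record and log line name the model that will actually execute, even when it differs from `defaultModel`.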

Replaced the inline logic at all five sites:
  - promptRunner.runPromptThroughProvider
  - stageRunner.runStagedLLM
  - worldBuilderExpand.expandWorldTemplate
  - worldBuilderRefine.refineWorldPrompts
  - mediaPromptRefiner.refineMediaPrompt (was already the reference impl)

mediaPromptRefiner and worldBuilderRefine no longer need to import
hasModelFlag / extractBakedModel directly — they import
resolveEffectiveModel and get the same precision behind a smaller
surface.

Adds a regression test asserting that when extractBakedModel succeeds
and hasModelFlag is true, the run record + returned model reflect the
baked id (not provider.defaultModel).

Defers worldBuilderRefine.runRefine → runPromptThroughProvider full
runner-wrapper migration to PLAN.md (the model-resolution piece is
done; the createRun/executeApiRun/executeCliRun consolidation is a
separate cleanup that mirrors what PR #221 did for the other four
sites).

* address review: assert createRun records baked-in CLI model

Copilot review asked the args-baked CLI regression test to also assert
the run-record side of the bugfix. Add an explicit
runner.createRun expectation so the test pins the persisted run record
to the args-baked model id (not the silently-dropped caller override
or the provider.defaultModel fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fixes

Pre-release /do:review surfaced one runtime bug and three DRY drift fixes
against newly-shared modules. All caught before opening the release PR.

- **Proactive TTS bypassed echo-suppression.** `speakProactive` broadcasts
  via `io.emit('voice:speak')` but had no per-socket context to write into
  `state.recentTts`, so a proactive line bleeding from speakers back into
  the mic would round-trip through the LLM as user input. Adds a module-
  scope echo-buffer registry in `services/voice/echo.js`. Sockets
  register/unregister their `recentTts` array in
  `registerVoiceHandlers` / `disconnect`; `speakProactive` writes through
  `rememberTtsForAllSockets` just before emitting. Single-instance app,
  so a process-wide registry is fine.

- **`civitaiSuggestions.js` bypassed the canonical `RUNNER_FAMILIES`.**
  Hand-rolled `['mflux', 'flux2', 'z-image', 'ernie']` shadowed the new
  `Object.freeze` constant in `lib/runners.js` that was introduced
  specifically to prevent typo-drift across runner-family sites. Now
  uses `Object.values(RUNNER_FAMILIES)`.

- **`messageEvaluator.js` still used greedy `/\[[\s\S]*\]/` JSON
  extraction.** The shared `lib/jsonExtract.js` consolidates banner
  stripping, trailing-comma repair, and `[...]` placeholder elision for
  three sibling extractors — `messageEvaluator` was the fourth, missed
  in the original consolidation. Now routes through `extractJson` with
  `blockType: 'array'` so triage runs against Codex don't trip on the
  CLI banner.

- **`deepMerge` prototype-pollution defense.** The shared helper now
  skips `__proto__` / `constructor` / `prototype` keys. Every current
  caller gates through Zod so today's call sites are safe, but the
  helper backs three routes and a future caller handing it `req.body`
  or an LLM tool-call payload directly would pollute `Object.prototype`.
  Two-line guard; no behavior change for valid input.
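
  As an illustration of the `deepMerge` guard in the last item, here is a
  minimal non-mutating merge that skips the three unsafe keys. It is a sketch,
  not the shared helper itself; `UNSAFE_KEYS` and `isPlainObject` are names
  chosen for the example.

```javascript
const UNSAFE_KEYS = new Set(['__proto__', 'constructor', 'prototype']);
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

// Hypothetical sketch: returns a new object, recursing into plain objects.
function deepMerge(target, source) {
  const out = { ...target };
  for (const key of Object.keys(source)) {
    // The guard: never copy keys that can reach a prototype.
    if (UNSAFE_KEYS.has(key)) continue;
    const value = source[key];
    out[key] = isPlainObject(value) && isPlainObject(out[key])
      ? deepMerge(out[key], value)
      : value;
  }
  return out;
}
```

  `JSON.parse` creates `__proto__` as an ordinary own property, so a payload
  such as `{"__proto__": {"polluted": true}}` does show up in `Object.keys`
  and has to be filtered explicitly.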

Tests: 4344 passing (+4 new — echo-registry fan-out + proactive-speech
echo-buffer integration). Skipping the larger `worldBuilderRefine`
runner-consolidation (already deferred in PLAN.md) and several cosmetic
findings (UI_KINDS shared module, ui_click `confirmed` dead flag,
mediaModels.test.js dead dynamic-import) — not release-blocking.
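
The echo-registry fan-out covered by the new tests can be sketched like this. Only `rememberTtsForAllSockets` and the per-socket `recentTts` array are named above; `registerEchoBuffer`, `unregisterEchoBuffer`, the entry shape, and the `MAX_RECENT_TTS` cap are assumptions for the example.

```javascript
// Module-scope registry of every connected socket's recentTts array.
const echoBuffers = new Set();
const MAX_RECENT_TTS = 20; // assumed cap so buffers stay bounded

function registerEchoBuffer(recentTts) {
  echoBuffers.add(recentTts);
}
function unregisterEchoBuffer(recentTts) {
  echoBuffers.delete(recentTts);
}

// Called just before a broadcast so each socket can recognize the line if
// it bleeds from the speakers back into the mic.
function rememberTtsForAllSockets(text, now = Date.now()) {
  for (const buf of echoBuffers) {
    buf.push({ text, at: now }); // entry shape is an assumption
    if (buf.length > MAX_RECENT_TTS) buf.shift();
  }
}
```

A `Set` of array references keeps unregister on disconnect cheap and makes double registration harmless.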

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- messageEvaluator: drop `shapePredicate: Array.isArray` from the extractJson
  call — `blockType: 'array'` already constrains the walker to `[...]`
  blocks, so a successful parse is always an Array.
- proactiveSpeech: collapse the 6-line "why register in echo buffer" block
  to one line. The function name carries the intent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-PR coherence pass caught that four of PR #224's five features had no
changelog line: live per-panel thumbs, single-scene storyboard video,
voice stage navigation tools, and the dynamic pipeline sidebar. Added
entries for each under "Added"; the AI prompt-refine entry already
existed.
@atomantic requested a review from Copilot May 13, 2026 06:18

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.