This test run verified the Dataset Reimplementation feature (PR #2341), which re-enables the "Test Suites" sidebar link, adds new dataset run endpoints, and replaces old chat-API-based execution with scheduled trigger infrastructure. The core routing, API endpoints, and backend infrastructure are working correctly. However, several UI bugs were identified: the run detail page incorrectly displays progress and invocation data when all agent executions fail, the cross-product display groups by dataset item instead of showing per-invocation rows, and unauthenticated users can access protected UI routes.
✅ Passed (20)
Test Case
Summary
Timestamp
Screenshot
ROUTE-1
Verified Test Suites link appears under Monitor section in sidebar with Database icon. Clicking it navigates to /default/projects/activities-planner/datasets showing dataset listing.
1:54
ROUTE-2
Created run config 'Test Run E2E' with 1 agent, auto-trigger fired creating 3 invocations (3 items x 1 agent), run appeared in runs list
6:16
ROUTE-3
Run detail page loads at correct URL pattern, displays run name 'Test Run E2E', shows 'Run in progress' indicator with spinner, progress bar and test cases table with 3 items.
14:35
ROUTE-5
All filter mechanisms work correctly: search input filters by text, Show Filters expands filter panel, Agent and Output Status dropdowns work, Clear Filters resets all.
28:07
ROUTE-6
GET dataset-runs/by-dataset endpoint returns 200 with run objects containing id, datasetId, status, totalItems, completedItems, failedItems.
32:09
ROUTE-7
GET dataset-runs/{runId}/items returns 200 with invocation objects containing id, agentId, datasetRunId, datasetItemId, status, attemptNumber, conversationId.
32:50
ROUTE-8
Trigger endpoint created 3 invocations (3 dataset items x 1 agent). API response shows totalItems=3, status=completed.
11:07
ROUTE-9
Created evaluator and run config with evaluator attached. Run detail page shows View Evaluation Job button. Clicking opens new tab with correct URL format.
48:55
ROUTE-10
Clicking 'Back to test suite' button on run detail page navigates to dataset detail page with the Runs tab active.
19:46
EDGE-1
Created run config on empty dataset (0 items). API confirmed totalItems:0. Run detail page correctly shows Test Cases (0) with No items found message.
51:12
EDGE-4
API response shows a run with all items failed (failedItems=3, completedItems=0) reporting status=completed, confirming that deriveRunStatus returns completed when pending+running=0.
39:44
EDGE-6
Timestamps display in local timezone format using browser's locale settings via Intl.DateTimeFormat.
14:44
EDGE-7
API run status metadata has only totalItems/completedItems/failedItems fields. Cancelled invocations are counted under failedItems.
40:24
EDGE-8
Created run config via API without triggering. Config exists in system but does not appear in runs list. Confirms partial failure handling.
44:37
ADV-1
POST to trigger with non-existent runConfigId returns HTTP 404 with body {"res":{},"status":404}.
33:22
ADV-3
Clicked Create Run with empty Name field, validation message 'Name is required' appeared preventing submission.
7:04
ADV-4
Rapidly triple-clicked Create Run button. Only 1 run config was created. Button disabled after first click prevents duplicates.
12:37
ADV-5
Navigated to non-existent runId. Page shows error state with Error title and HTTP 404: Not Found message.
35:33
ADV-6
Created dataset item with script tag content. Verified HTML/script content rendered as plain text in JSON code view. No script execution detected.
37:29
LOGIC-1
POST /evals/run-dataset-items returns HTTP 404 Not Found. The old route has been successfully removed.
❌ Failed (5)
Test Case
Summary
Timestamp
Screenshot
ROUTE-4
Run detail Test Cases table shows Agent '-', a stuck 'Processing...' spinner, and a count of (0) when all invocations have failed.
25:55
EDGE-2
API shows totalItems:0 but UI incorrectly shows 4 items with Agent dash and Processing status stuck.
54:28
EDGE-3
API confirms 8 invocations (4 items x 2 agents) but UI only shows 4 rows grouped by dataset item.
59:59
EDGE-5
Auto-refresh never stops. UI stuck showing 'Run in progress' even though API reports run completed with all items failed.
14:59
ADV-2
API correctly returns 401 for unauthenticated requests, but UI loads datasets page without redirecting to login.
1:04:21
ROUTE-4: Dataset run detail shows test cases table with correct data – Failed
Where: Run detail page at /{tenantId}/projects/{projectId}/datasets/{datasetId}/runs/{runId}
Steps to reproduce:
Navigate to a run detail page where all agent invocations have failed
Observe the Test Cases table
What failed: The table displays incorrect information: (1) Agent column shows '-' instead of the agent ID, (2) Output column shows 'Processing...' with spinner for items where the API confirms status='failed', (3) Test Cases section header shows count '(0)' despite 3 rows being displayed in the table.
Code analysis: The UI derives display state from the conversations array on each item. When an invocation fails before creating a conversation, the conversations array is empty. The code at lines 394-434 renders a placeholder row with Agent='-' and "Processing..." whenever no conversations exist, regardless of the actual invocation status.
const conversations = item.conversations || [];
if (conversations.length === 0) {
  // No conversations yet - show placeholder row with loading state if run is in progress
  return (
    <TableRow key={item.id}>
      <TableCell>{/* ... */}</TableCell>
      <TableCell>
        <span className="text-sm text-muted-foreground">-</span>
      </TableCell>
      <TableCell>
        {conversationProgress.isRunning ? (
          <span className="flex items-center gap-2 text-sm text-muted-foreground">
            <Loader2 className="h-3 w-3 animate-spin" />
            Processing...
          </span>
        ) : (
          <span className="text-sm text-muted-foreground">No output</span>
        )}
      </TableCell>
      {/* ... */}
    </TableRow>
  );
}
<CardTitle>
  Test Cases (
  {filteredItems.reduce((acc, item) => acc + (item.conversations?.length || 0), 0)}{' '}
  {/* This counts conversations, not invocations - shows 0 when all fail */}
  )
</CardTitle>
Why this is likely a bug: The UI should display invocation status (from scheduled trigger invocations) rather than relying solely on conversations. When invocations fail, no conversation is created, but the UI should still show the failure status and agent ID from the invocation data.
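One possible direction, sketched below, assumes the page can read the invocation records that the items endpoint already returns (id, agentId, datasetItemId, status, conversationId, as verified in ROUTE-7); the Invocation and RowView shapes and the buildRows helper are illustrative, not code from the PR.

// Hypothetical sketch: build one display row per invocation instead of per
// conversation, so a failed invocation still surfaces its agent and a Failed status.
type Invocation = {
  id: string;
  agentId: string;
  datasetItemId: string;
  status: 'pending' | 'running' | 'completed' | 'failed';
  conversationId?: string;
};

type RowView = { key: string; agent: string; output: string };

function buildRows(invocations: Invocation[]): RowView[] {
  return invocations.map((inv) => ({
    key: inv.id,
    agent: inv.agentId, // available even when no conversation was created
    output:
      inv.status === 'failed'
        ? 'Failed'
        : inv.conversationId
          ? 'View conversation' // real output rendering happens when a conversation exists
          : 'Processing...',
  }));
}

// The Test Cases header count would then be invocations.length rather than the number
// of conversations, so it no longer reads "(0)" when every invocation fails.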
Introduced by this PR: Yes – this PR modified the relevant code. The run detail page was part of the dataset reimplementation.
Timestamp: 25:55
EDGE-2: Run config with no agent relations produces zero invocations – Failed
Where: Run detail page for a run config created with no agents selected
Steps to reproduce:
Create a run config with no agents selected on a populated dataset (4 items)
Navigate to the run detail page
What failed: API correctly shows totalItems:0 for the run. However, the UI run detail page incorrectly shows 4 items with Agent column as dash, all showing "Processing..." with "Run in progress" status stuck. The UI displays dataset items even when there are no invocations.
Code analysis: The backend at datasetRuns.ts lines 213-217 fetches all dataset items via listDatasetItems(db) regardless of how many invocations exist. The UI displays these items, creating a disconnect between what the API reports (0 invocations) and what the UI shows (4 dataset items with placeholder rows).
Why this is likely a bug: When no agents are selected, the run has zero invocations (totalItems=0). The UI should show "No test cases" or display based on actual invocation count from the API's status metadata, not based on the number of dataset items.
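A minimal sketch of that check, assuming the status metadata fields the API already reports (totalItems, completedItems, failedItems); the helper name is illustrative.

// Hypothetical sketch: decide whether to render an empty state from the run's
// invocation metadata rather than from the dataset's item list.
type RunStatusMeta = { totalItems: number; completedItems: number; failedItems: number };

function shouldShowEmptyState(meta: RunStatusMeta): boolean {
  // totalItems counts invocations; a run created with no agents has zero, so the table
  // should show "No test cases" even when the underlying dataset has items.
  return meta.totalItems === 0;
}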
Introduced by this PR: Yes – this PR modified the relevant code in both the API endpoint and UI page.
Timestamp: 54:28
EDGE-3: Run with multiple agents and items creates correct cross-product of invocations – Failed
Where: Run detail page for a run with 4 items x 2 agents
Steps to reproduce:
Create a run config selecting 2 agents on a dataset with 4 items
Trigger the run and navigate to the run detail page
What failed: The API correctly reports totalItems=8 (4 items x 2 agents = 8 invocations). However, the UI run detail page only displays 4 rows (grouped by dataset item) instead of the expected 8 rows (one per agent-item combination). The progress bar shows '0 of 4 completed' instead of '0 of 8'.
Code analysis: The UI iterates over filteredItems (dataset items) and then maps over item.conversations. Since no conversations were created (all failed), each dataset item shows a single placeholder row. The UI structure is designed to show one row per conversation, not one row per invocation.
{filteredItems.flatMap((item) => {
  // ...
  const conversations = item.conversations || [];
  if (conversations.length === 0) {
    // Shows ONE placeholder row per dataset item, not per invocation
    return (
      <TableRow key={item.id}>

return conversations.map((conversation) => (
  <TableRow key={`${item.id}-${conversation.conversationId}`}>
    {/* This correctly shows one row per conversation when they exist */}
  </TableRow>
));
Why this is likely a bug: The UI should display one row per scheduled trigger invocation (from the API's invocations data), not per dataset item. When multiple agents are selected, the cross-product creates N x M invocations, and the UI should reflect this.
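A sketch of that expansion, under the same assumption that invocation records with datasetItemId and agentId are available to the page; the types and helper below are illustrative.

// Hypothetical sketch: expand each dataset item into one row per invocation so the
// items x agents cross-product is visible (8 rows for 4 items x 2 agents).
type Invocation = { id: string; agentId: string; datasetItemId: string };
type Row = { itemId: string; invocationId: string; agentId: string };

function rowsForItems(itemIds: string[], invocations: Invocation[]): Row[] {
  return itemIds.flatMap((itemId) =>
    invocations
      .filter((inv) => inv.datasetItemId === itemId)
      .map((inv) => ({ itemId, invocationId: inv.id, agentId: inv.agentId }))
  );
}

// The progress denominator would likewise be invocations.length (8), not the dataset
// item count (4).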
Introduced by this PR: Yes – this PR introduced the dataset run detail page as part of the reimplementation.
Timestamp: 59:59
EDGE-5: Auto-refresh stops when run completes – Failed
Where: Run detail page during and after run completion
Steps to reproduce:
Navigate to a run detail page where all invocations have failed
Observe the auto-refresh behavior and UI state
What failed: Auto-refresh does NOT stop when the run completes. The API reports status=completed with 3 failed items, but the UI remains stuck showing 'Run in progress' with '0 of 3 completed'. The polling continues indefinitely via repeated requests every ~3 seconds.
Code analysis: The isRunInProgress flag at lines 116-117 depends on conversationProgress.isRunning, and conversation progress is calculated from the conversations that have been created. When all invocations fail, no conversations are created, so completed is always 0 while total stays above 0 (the dataset item count), making isRunning perpetually true.
const conversationProgress = useMemo(() => {
  if (!run?.items) return { total: 0, completed: 0, isRunning: false };
  const total = run.items.length;
  const completed = run.items.filter(
    (item) => item.conversations && item.conversations.length > 0
  ).length;
  return { total, completed, isRunning: completed < total && total > 0 };
}, [run]);

// Overall progress - run is complete only when both conversations AND evaluations are done
const isRunInProgress =
  conversationProgress.isRunning || (evaluationProgress?.isRunning ?? false);

useEffect(() => {
  if (!isRunInProgress) return;
  const interval = setInterval(() => {
    loadRun(false); // Don't show loading state for refresh
  }, 3000); // Refresh every 3 seconds
  return () => clearInterval(interval);
}, [isRunInProgress, loadRun]);
Why this is likely a bug: The UI should use the API's reported status (from deriveRunStatus which returns 'completed' when pending+running=0) to determine if the run is complete, not rely on conversation count. This causes infinite polling and a permanently stuck "in progress" state for any run where invocations fail.
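A sketch of a status-driven check, assuming the page can read the run's status field returned by the by-dataset endpoint (verified in ROUTE-6); the helper is illustrative.

// Hypothetical sketch: gate polling on the API-derived run status (completed when
// pending + running === 0) instead of on how many conversations exist.
type RunStatus = 'pending' | 'running' | 'completed' | 'failed';

function shouldKeepPolling(runStatus: RunStatus | undefined, evaluationsRunning: boolean): boolean {
  // Failed invocations never create conversations, so a conversation count can stay
  // below the total forever; the reported run status does not have that problem.
  const runFinished = runStatus === 'completed' || runStatus === 'failed';
  return !runFinished || evaluationsRunning;
}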
Introduced by this PR: Yes – this PR introduced the run detail page with auto-refresh functionality.
Timestamp: 14:59
ADV-2: Accessing dataset routes without authentication returns 401/403 – Failed
Where: UI datasets page at /default/projects/activities-planner/datasets
Steps to reproduce:
Clear all cookies (unauthenticated state)
Navigate directly to the datasets page URL
What failed: The API (port 3002) correctly returns 401 Unauthorized for unauthenticated requests. However, the UI (port 3000) does NOT redirect unauthenticated users to a login page. After clearing all cookies, navigating to the datasets page renders the full page with data visible.
Code analysis: The agents-manage-ui app does not have a Next.js middleware.ts file for route protection. The tenant layout ([tenantId]/layout.tsx) renders content without checking authentication status. Authentication is handled client-side via the AuthClientProvider context, but there's no server-side redirect for unauthenticated users.
// API requests include bypass secret for server-side calls
const headers: HeadersInit = {
  'Content-Type': 'application/json',
  ...(isServer && process.env.INKEEP_AGENTS_MANAGE_API_BYPASS_SECRET
    ? {
        Authorization: `Bearer ${process.env.INKEEP_AGENTS_MANAGE_API_BYPASS_SECRET}`,
      }
    : {}),
};
Why this is likely a bug: Protected routes should redirect unauthenticated users to the login page. The current implementation allows direct access to UI pages that display protected data because server-side rendering uses a bypass secret, but then renders the page to an unauthenticated user.
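A minimal middleware.ts sketch of server-side protection; the session cookie name, login path, and matcher pattern are assumptions about this app, not code from the repository.

// middleware.ts - hypothetical sketch of server-side route protection
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  // Assumption: an authenticated session is represented by a cookie named 'session'.
  const hasSession = request.cookies.has('session');
  if (!hasSession) {
    const loginUrl = new URL('/login', request.url); // assumed login route
    loginUrl.searchParams.set('next', request.nextUrl.pathname);
    return NextResponse.redirect(loginUrl);
  }
  return NextResponse.next();
}

// Keep static assets and auth endpoints public; protect everything else.
export const config = {
  matcher: ['/((?!_next|login|api/auth).*)'],
};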
Introduced by this PR: No – pre-existing bug (authentication code not changed in this PR). However, this PR re-enabled the datasets routes which expose this issue.
Testing verified the Dataset (Test Suite) reimplementation in PR #2341. The core functionality works well: sidebar navigation, dataset CRUD operations, tab switching, item creation, run config creation, run progress tracking, auto-refresh, filtering, XSS prevention, and error handling all passed. One validation bug was confirmed in the run config form where submitting without agents selected does not show an error.
✅ Passed (31)
Test Case
Summary
Timestamp
Screenshot
ROUTE-1
Verified sidebar has Monitor section with Test Suites link positioned between Traces and Evaluations
3:38
ROUTE-2
Datasets page shows empty state with 'No test suites yet.' heading, description text, and 'Create test suite' link
5:12
ROUTE-3
Created dataset 'Playwright Test Suite' via the create form
6:14
ROUTE-4
Verified default tab is Items, clicking Runs tab shows runs content with URL ?tab=runs
7:54
ROUTE-5
Created a dataset item with role 'user' and content 'What is the weather in San Francisco?'
9:21
ROUTE-6
Successfully created run config 'Test Run Alpha' with Activities Planner agent selected
20:03
ROUTE-7
Run detail page shows progress tracking with 'Run in progress' banner, progress bar, and test cases table
20:29
ROUTE-8
Observed auto-refresh on run detail page with timestamp progressing from 'just now' to '2m ago'
22:38
ROUTE-9
Verified search filter, Show/Hide Filters toggle, Output Status filter, and Clear Filters button
25:54
ROUTE-10
DatasetItemViewDialog opened showing full Input messages with role and content
27:12
ROUTE-12
View Evaluation Job button appears on run detail page when evaluators are attached
42:08
ROUTE-13
Run detail page shows dual progress tracking for Test cases and Evaluations
42:08
ROUTE-14
Runs list shows 'Test Run Alpha' with relative creation timestamp and chevron icon
20:04
ROUTE-15
Run At column shows local timezone format, Created shows relative timestamp with clock icon
20:32
ROUTE-16
Runs tab empty state showing 'No runs yet' text and 'Add first run' button
48:32
ROUTE-17
Back to test suite button navigates to dataset page with Runs tab selected
32:32
ROUTE-18
Run config form showed 'Loading agents...' and 'Loading evaluators...' during data load
48:53
EDGE-1
Triggered run on empty dataset, graceful handling with 'No items found' message
52:58
EDGE-3
Validation error 'Name is required' displayed when submitting empty name
49:39
EDGE-4
Run detail page shows pending items correctly with 'Processing...' spinner and 'Pending...' text
20:30
EDGE-6
Created Run B and Run C in quick succession, both appear as separate entries
60:04
EDGE-7
Tab state persists via URL query parameter ?tab=runs
10:18
EDGE-8
Complex message content formats all display correctly in run detail table
60:43
EDGE-9
Search for non-matching term shows 'No test cases match the current filters' message
33:09
EDGE-10
Long input text truncated at ~100 chars with ellipsis, dialog shows full content
60:46
EDGE-11
Runs list shows skeleton loading placeholders during data fetch
65:32
ADV-1
XSS payload rendered as plain escaped text, no script execution
68:50
ADV-2
Non-existent run ID shows Error card with HTTP 404 Not Found
69:57
ADV-2_69-57.png
ADV-3
Invalid tab query parameter falls back gracefully, tab switching works normally
10:56
ADV-4
Dev mode auto-authenticates, no redirect to login page
0:00
ADV-5
Rapid double-click on Create Run button prevented duplicate creation
19:26
❌ Failed (1)
Test Case
Summary
Timestamp
Screenshot
EDGE-2
Form submitted successfully with 0 agents selected - expected validation error but got success
50:26
EDGE-2: Run config form with no agents selected validation – Failed
Where: Dataset run config creation form dialog
Steps to reproduce:
Navigate to a dataset's Runs tab
Click 'Add first run' or 'New run' button
Enter a name in the Name field (e.g., 'Validation Test Run')
Do NOT select any agents from the Agents multi-selector
Click 'Create Run' button
What failed: Expected a validation error preventing form submission when no agents are selected. Instead, the form submitted successfully, creating a run with 0 agents. The success toast 'Run config created successfully' appeared.
Code analysis: Examined the form validation schema and found the root cause. The UI shows the Agents field with an isRequired indicator (asterisk), but the Zod validation schema does not enforce a minimum of one agent.
Why this is likely a bug: The UI displays an asterisk (isRequired) on the Agents label indicating it's a required field, but the Zod schema only uses .default([]) without .min(1, ...). This creates a mismatch where users see a required indicator but can submit without selecting any agents. The fix is to change line 6 to: agentIds: z.array(z.string()).min(1, 'At least one agent is required').
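A sketch of the suggested schema change, assuming a Zod object schema backs the run config form; the surrounding field names are illustrative.

import { z } from 'zod';

// Hypothetical sketch of the run config form schema with the suggested change applied.
const runConfigSchema = z.object({
  name: z.string().min(1, 'Name is required'),
  // Before: agentIds: z.array(z.string()).default([]) - allows submitting with no agents.
  agentIds: z.array(z.string()).min(1, 'At least one agent is required'),
});

type RunConfigInput = z.infer<typeof runConfigSchema>;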
Introduced by this PR: Yes – this PR modified the relevant code. This PR re-enabled the dataset run configs routes and modified the dataset run config actions. While the validation file itself may not be new, the feature re-enablement means this validation gap is now exposed to users.
The run validated core feedback, branch, and dataset flows that were executable in this environment. One user-facing defect was confirmed through code inspection: invalid feedback query parameters are forwarded without bounds sanitization, which can trigger a hard load error instead of graceful coercion.
✅ Passed (10)
Test Case
Summary
Timestamp
Screenshot
ROUTE-1
Feedback page loaded at /default/projects/default/feedback without runtime crash and displayed a valid empty state.
0:00
ROUTE-2
Created positive message-scoped feedback via localhost API fallback and verified the positive row with messageId renders in Feedback UI.
10:45
ROUTE-5
UI delete removed the feedback row and repeat delete via API returned not found, confirming non-false-success behavior after deletion.
10:45
ROUTE-8
Clean branch merge API returned success and no conflicts.
14:07
ROUTE-10
Non-main branch deletion succeeded and protected main-branch deletion was correctly rejected.
14:07
ROUTE-11
Created a dataset run config with an agent relation and verified automatic run creation in UI; API trigger endpoint returned 202 with datasetRunId.
38:01
ROUTE-12
Run detail showed consistent status and counters, and dataset-runs items API returned 200 with matching datasetRunId, status, and attempt fields for all items.
38:16
EDGE-1
Branches page rendered a valid empty state with 'No branches' messaging and no broken table artifacts.
14:07
ADV-2
Rapid repeated clicks on merge and delete confirmations produced a single effective mutation per confirmation; the UI prevented duplicate destructive requests and the flow ended with a single branch deletion outcome.
25:53
ADV-3
Unauthorized feedback create and branch merge mutation calls were both denied with 401 responses, confirming mutation boundaries were enforced.
42:23
❌ Failed (1)
Test Case
Summary
EDGE-2
Invalid feedback query parameters rendered a failed-load state instead of being safely coerced to valid pagination bounds.
Steps to reproduce: Open the feedback page with out-of-range pagination params (for example ?page=999999&limit=100000).
What failed: The page passes unbounded numeric query values directly to the API, receives a validation error for oversized limit, and falls into the full-page error state instead of coercing inputs to safe bounds.
Code analysis: The page parser only checks that the query values parse as finite numbers, not that they fall within bounds; the API route validates query params with a strict pagination schema, so oversized values are rejected and the error bubbles up to the full-page error UI.
Why this is likely a bug: The UI path explicitly intends query-param handling for feedback pagination, but out-of-range values are not sanitized before strict API validation, producing a user-visible load failure.
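A sketch of bounds coercion before the API call; the default values and MAX_LIMIT are assumptions, not constants from the codebase.

// Hypothetical sketch: clamp pagination query params to safe bounds instead of
// forwarding raw values to the strict API schema.
const MAX_LIMIT = 100; // assumed upper bound accepted by the API

function coercePagination(pageParam: string | null, limitParam: string | null) {
  const parsedPage = Number.parseInt(pageParam ?? '1', 10);
  const parsedLimit = Number.parseInt(limitParam ?? '20', 10);

  const page = Number.isFinite(parsedPage) ? Math.max(1, parsedPage) : 1;
  const limit = Number.isFinite(parsedLimit) ? Math.min(Math.max(1, parsedLimit), MAX_LIMIT) : 20;

  return { page, limit };
}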
Introduced by this PR: Yes - this PR modified the relevant code.