feat(providers): support sandbox provider attach lifecycle #1242

Open
johntmyers wants to merge 5 commits into main from
feat/1171-provider-attach-detach/johntmyers

@johntmyers (Collaborator) commented May 7, 2026

Summary

  • Add provider attachment lifecycle APIs for sandboxes: list, attach, and detach.
  • Add CLI support via openshell sandbox provider list|attach|detach.
  • Keep effective policy and provider credential environment resolution derived from the current persisted sandbox provider attachments.
  • Refresh provider credentials in running sandboxes for future SSH/exec launches using generation-scoped credential snapshots.

Related Issue

Closes #1171

Changes

  • Added gRPC/proto methods for listing, attaching, and detaching sandbox providers.
  • Implemented gateway handlers that mutate SandboxSpec.providers, validate provider existence on attach, and make attach/detach idempotent.
  • Added auth scope mappings for the new sandbox provider RPCs.
  • Added CLI commands for provider attachment lifecycle management.
  • Added provider environment revisions to config/env responses so the sandbox poll loop can detect provider attach, detach, and credential updates.
  • Added sandbox-side provider credential snapshots with revision-scoped placeholders. Entrypoint processes keep the startup snapshot; future SSH/exec/SFTP launches read the latest snapshot.
  • Updated the proxy resolver path to retain recent credential generations so already-running processes can still resolve old placeholders.
  • Blocked provider deletion while the provider is attached to any sandbox, to avoid stale provider references.
  • Added a temporary single-gateway sandbox object mutation lock for full-object sandbox writes touched by this PR. This serializes attach/detach, policy status, and policy backfill writes within one gateway process, but it is not HA-safe. General DB-backed CAS/resource-version object mutations are tracked separately in #1255 (feat(gateway): add DB-backed resource_version CAS for stored objects).
  • Updated test fake OpenShell services for the expanded generated trait.
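
The attach/detach handler semantics described above (validate first, then mutate, and treat repeats as no-ops) can be sketched as follows. This is a minimal illustration, not the gateway code: `SandboxSpec` here is a stand-in holding only a provider name list, and `attach_provider`, `detach_provider`, and the outcome/error enums are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome types; the real RPC responses may be shaped differently.
#[derive(Debug, PartialEq)]
pub enum AttachOutcome {
    Attached,
    AlreadyAttached,
}

#[derive(Debug, PartialEq)]
pub enum DetachOutcome {
    Detached,
    NotAttached,
}

#[derive(Debug, PartialEq)]
pub enum AttachError {
    ProviderNotFound(String),
}

/// Stand-in for the persisted sandbox spec; only the provider list matters here.
#[derive(Debug, Default)]
pub struct SandboxSpec {
    pub providers: Vec<String>,
}

/// Attach `name` to the sandbox. The existence check runs before any mutation,
/// so a rejected request leaves the spec untouched.
pub fn attach_provider(
    spec: &mut SandboxSpec,
    known_providers: &BTreeSet<String>,
    name: &str,
) -> Result<AttachOutcome, AttachError> {
    if !known_providers.contains(name) {
        return Err(AttachError::ProviderNotFound(name.to_string()));
    }
    if spec.providers.iter().any(|p| p == name) {
        // Idempotent: a repeated attach reports state instead of failing.
        return Ok(AttachOutcome::AlreadyAttached);
    }
    spec.providers.push(name.to_string());
    Ok(AttachOutcome::Attached)
}

/// Detach `name` from the sandbox. Detaching something that is not attached
/// is reported, not treated as an error.
pub fn detach_provider(spec: &mut SandboxSpec, name: &str) -> DetachOutcome {
    let before = spec.providers.len();
    spec.providers.retain(|p| p != name);
    if spec.providers.len() == before {
        DetachOutcome::NotAttached
    } else {
        DetachOutcome::Detached
    }
}
```

Returning a distinct "already attached" or "not attached" outcome lets a CLI print an informative message while still exiting successfully.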

UX Changes

New sandbox provider commands

  • openshell sandbox provider list [sandbox]
    • Lists provider records attached to a sandbox.
    • If sandbox is omitted, the CLI uses the last active sandbox, matching other sandbox read commands.
    • Output is a provider attachment table with provider name, type, credential key count, and config key count.
    • If nothing is attached, the CLI prints a no-provider message instead of an empty table.
  • openshell sandbox provider attach <sandbox> <provider>
    • Attaches an existing provider record to the sandbox.
    • Re-running the same command is idempotent and reports that the provider is already attached.
    • If the provider does not exist, the gateway rejects the request before mutating the sandbox.
  • openshell sandbox provider detach <sandbox> <provider>
    • Detaches a provider from the sandbox.
    • Re-running the same command is idempotent and reports that the provider was not attached.
    • Detach also removes that provider from the sandbox's future effective policy and provider environment resolution because both are derived from the current persisted attachment list.
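
As a rough illustration of the `list` output described above, the sketch below renders an attachment table with the four listed columns and falls back to a message when nothing is attached. The struct, function name, column headers, and message wording are all assumptions made for the example; the actual CLI output may differ.

```rust
/// Hypothetical row type for one attached provider.
pub struct ProviderAttachment {
    pub name: String,
    pub provider_type: String,
    pub credential_keys: usize,
    pub config_keys: usize,
}

/// Render the attachment table, or a short message when the list is empty.
pub fn render_attachments(sandbox: &str, rows: &[ProviderAttachment]) -> String {
    if rows.is_empty() {
        // Assumed wording; the point is that no empty table is printed.
        return format!("No providers attached to sandbox '{}'.", sandbox);
    }
    // Size the text columns to the widest value (or the header, if wider).
    let name_w = rows.iter().map(|r| r.name.len()).max().unwrap_or(0).max("NAME".len());
    let type_w = rows
        .iter()
        .map(|r| r.provider_type.len())
        .max()
        .unwrap_or(0)
        .max("TYPE".len());

    let mut out = format!(
        "{:<nw$}  {:<tw$}  {:>11}  {:>6}",
        "NAME", "TYPE", "CREDENTIALS", "CONFIG",
        nw = name_w, tw = type_w
    );
    for r in rows {
        out.push('\n');
        out.push_str(&format!(
            "{:<nw$}  {:<tw$}  {:>11}  {:>6}",
            r.name, r.provider_type, r.credential_keys, r.config_keys,
            nw = name_w, tw = type_w
        ));
    }
    out
}
```

Only key counts are shown, never credential values, which matches the description of the table above.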

Running sandbox behavior

  • Attach/detach affects future SSH/exec/SFTP launches after the sandbox poll loop observes the updated provider environment revision.
  • Already-running processes keep their existing process environment. This PR does not try to mutate the environment of a live process, because operating systems provide no supported primitive for doing so.
  • Existing placeholder generations are retained briefly so already-running processes can still resolve placeholders they received before a provider detach or credential rotation.
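
The interaction between "future launches see the latest snapshot" and "old placeholders still resolve for a while" can be sketched as below. The placeholder shape follows the `openshell:resolve:env:v<revision>_<KEY>` form seen in the smoke tests; everything else is assumed for illustration, including the `GenerationStore` and `Snapshot` names, the numeric revision, and count-based retention (the PR only says old generations are retained briefly, which may well be time-based).

```rust
use std::collections::{HashMap, VecDeque};

const PLACEHOLDER_PREFIX: &str = "openshell:resolve:env:v";

/// One credential generation: a revision plus the secrets valid for it.
pub struct Snapshot {
    pub revision: u64, // assumed numeric for this sketch
    pub secrets: HashMap<String, String>,
}

/// Keeps the latest snapshot plus a bounded number of older ones.
pub struct GenerationStore {
    retain: usize,
    generations: VecDeque<Snapshot>,
}

impl GenerationStore {
    pub fn new(retain: usize) -> Self {
        Self { retain: retain.max(1), generations: VecDeque::new() }
    }

    /// Install a new latest snapshot and evict generations beyond the window.
    pub fn publish(&mut self, snapshot: Snapshot) {
        self.generations.push_back(snapshot);
        while self.generations.len() > self.retain {
            self.generations.pop_front();
        }
    }

    /// Environment handed to a newly launched process: placeholders only,
    /// always scoped to the latest revision.
    pub fn launch_env(&self) -> HashMap<String, String> {
        match self.generations.back() {
            Some(s) => s
                .secrets
                .keys()
                .map(|k| (k.clone(), format!("{}{}_{}", PLACEHOLDER_PREFIX, s.revision, k)))
                .collect(),
            None => HashMap::new(),
        }
    }

    /// Resolve a placeholder against any retained generation, so processes
    /// started before a detach or rotation keep working until eviction.
    pub fn resolve(&self, placeholder: &str) -> Option<&str> {
        let rest = placeholder.strip_prefix(PLACEHOLDER_PREFIX)?;
        let (rev, key) = rest.split_once('_')?;
        let rev: u64 = rev.parse().ok()?;
        self.generations
            .iter()
            .find(|s| s.revision == rev)?
            .secrets
            .get(key)
            .map(String::as_str)
    }
}
```

Because the revision is embedded in the placeholder, the resolver never has to guess which generation a running process was started with.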

Effective policy behavior

  • When providers_v2_enabled is false, behavior is unchanged from the existing provider functionality.
  • When providers_v2_enabled is true, the sandbox's effective policy is composed just-in-time from the current sandbox policy plus currently attached provider profiles.
  • Attach adds the provider profile policy layer to the next effective policy read.
  • Detach removes that provider profile policy layer from the next effective policy read.
  • Custom imported provider profiles participate in the same attach/detach composition path as built-in profiles.
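
A minimal sketch of just-in-time composition, under several stated assumptions: policies are modelled as a map from rule name to a simplified `NetworkRule`, profiles are looked up by provider name, and the rule name is derived as `_provider_` plus the provider name with hyphens replaced by underscores (inferred from the `_provider_smoke_github` rule observed for `smoke-github`, not confirmed by the source). With the flag off, the sketch simply returns the base policy, which is a simplification of "existing provider functionality".

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for one policy rule.
#[derive(Clone, Debug, PartialEq)]
pub struct NetworkRule {
    pub endpoints: Vec<String>,
    pub binaries: Vec<String>,
}

/// Rule name -> rule. The real policy type is richer than this.
pub type Policy = BTreeMap<String, NetworkRule>;

/// Assumed naming scheme for provider-contributed rules.
pub fn provider_rule_name(provider: &str) -> String {
    format!("_provider_{}", provider.replace('-', "_"))
}

/// Compose the effective policy at read time from the sandbox's own policy
/// plus one layer per currently attached provider that has a profile.
pub fn effective_policy(
    base: &Policy,
    attached: &[String],
    profiles: &BTreeMap<String, NetworkRule>,
    providers_v2_enabled: bool,
) -> Policy {
    let mut out = base.clone();
    if !providers_v2_enabled {
        return out; // simplification: no provider layers when the flag is off
    }
    for name in attached {
        // Providers without a profile (e.g. a generic one) add no rules.
        if let Some(rule) = profiles.get(name) {
            out.insert(provider_rule_name(name), rule.clone());
        }
    }
    out
}
```

Since nothing is cached, detaching a provider needs no cleanup step: the next read simply composes without that layer.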

Provider lifecycle guardrail

  • openshell provider delete <provider> now fails while that provider is attached to any sandbox.
  • Users must detach the provider from all sandboxes before deleting it.
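
The guardrail amounts to a reverse lookup over sandbox attachments before a delete is allowed. The sketch below assumes a map from sandbox name to attached provider names and a hypothetical `DeleteError::FailedPrecondition` variant that carries the blocking sandboxes; the real gateway returns a gRPC status instead.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type mirroring a FailedPrecondition status.
#[derive(Debug, PartialEq)]
pub enum DeleteError {
    FailedPrecondition { provider: String, blocking_sandboxes: Vec<String> },
}

/// Allow deletion only when no sandbox still references the provider.
/// `sandboxes` maps sandbox name -> attached provider names.
pub fn check_provider_deletable(
    provider: &str,
    sandboxes: &BTreeMap<String, Vec<String>>,
) -> Result<(), DeleteError> {
    let blocking: Vec<String> = sandboxes
        .iter()
        .filter(|(_, attached)| attached.iter().any(|p| p == provider))
        .map(|(name, _)| name.clone())
        .collect();

    if blocking.is_empty() {
        Ok(())
    } else {
        // Naming every blocking sandbox tells the user exactly what to detach.
        Err(DeleteError::FailedPrecondition {
            provider: provider.to_string(),
            blocking_sandboxes: blocking,
        })
    }
}
```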

Implementation Note: Object Write Locking

  • Sandbox records are currently stored as full protobuf payloads in the generic objects table, and several gateway paths perform read-modify-write updates against those sandbox objects.
  • This PR adds a process-local sandbox sync guard around the new attach/detach writes and related policy status/backfill writes so single-gateway deployments do not lose same-process sandbox object updates.
  • This lock is intentionally a short-term, single-gateway mitigation. It does not protect multi-gateway HA writers because separate gateway processes do not share the lock.
  • The broader fix is DB-backed compare-and-swap/resource-version support for generic object mutations, tracked in #1255 (feat(gateway): add DB-backed resource_version CAS for stored objects).
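
To make the lost-update hazard concrete, the sketch below wraps a full-object read-modify-write in a process-local guard. It is illustrative only: `ObjectStore`, `Gateway`, and `mutate_sandbox` are hypothetical names, the payload is reduced to a list of provider names, and `std::sync::Mutex` stands in for whatever synchronization primitive the gateway actually uses (likely an async one).

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Stand-in for the generic objects table: one full payload per object id.
#[derive(Default)]
pub struct ObjectStore {
    rows: Mutex<HashMap<String, Vec<String>>>,
}

impl ObjectStore {
    pub fn get(&self, id: &str) -> Option<Vec<String>> {
        self.rows.lock().unwrap().get(id).cloned()
    }

    pub fn put(&self, id: &str, payload: Vec<String>) {
        self.rows.lock().unwrap().insert(id.to_string(), payload);
    }
}

pub struct Gateway {
    store: ObjectStore,
    /// Process-local guard: serializes writers in THIS process only.
    sandbox_write_guard: Mutex<()>,
}

impl Gateway {
    pub fn new() -> Self {
        Self { store: ObjectStore::default(), sandbox_write_guard: Mutex::new(()) }
    }

    /// Read the whole object, apply `mutate`, and write the whole object back
    /// while holding the guard, so same-process writers cannot interleave.
    pub fn mutate_sandbox<F>(&self, id: &str, mutate: F)
    where
        F: FnOnce(&mut Vec<String>),
    {
        let _guard = self.sandbox_write_guard.lock().unwrap();
        let mut payload = self.store.get(id).unwrap_or_default();
        mutate(&mut payload);
        self.store.put(id, payload);
    }

    pub fn read(&self, id: &str) -> Vec<String> {
        self.store.get(id).unwrap_or_default()
    }
}
```

Without the guard, two writers could both read the same payload and the later `put` would silently discard the earlier change. A second gateway process has its own guard, which is why this approach is not HA-safe and a DB-level compare-and-swap is the broader fix.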

Testing

  • RUSTC_WRAPPER= cargo check -p openshell-server -p openshell-sandbox -p openshell-cli
  • RUSTC_WRAPPER= cargo test -p openshell-sandbox provider_credentials --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server provider_env_revision_changes_when_attached_provider_record_changes --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server delete_provider_rejects_attached_provider --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server sandbox_config_and_provider_env_follow_attached_provider_lifecycle --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server custom_imported_profile_policy_and_env_follow_attach_detach_lifecycle --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server provider_environment_resolution_is_unchanged_by_providers_v2_setting --lib
  • RUSTC_WRAPPER= cargo test -p openshell-server scoped_access --lib
  • RUSTC_WRAPPER= cargo test -p openshell-sandbox provider_env_is_replaced_with_placeholders --lib
  • RUSTC_WRAPPER= cargo test -p openshell-cli sandbox_provider_subcommands_parse --bin openshell
  • RUSTC_WRAPPER= cargo test -p openshell-cli provider_attachment_table_formats_provider_counts --lib
  • RUSTC_WRAPPER= cargo test -p openshell-cli --test provider_commands_integration sandbox_provider
  • RUSTC_WRAPPER= cargo clippy -p openshell-server -p openshell-sandbox --lib --tests -- -D warnings
  • uv run ruff format --check e2e/python/test_sandbox_providers.py
  • uv run ruff check e2e/python/test_sandbox_providers.py
  • RUSTC_WRAPPER= mise run pre-commit

E2E note:

  • Added e2e/python/test_sandbox_providers.py::test_attach_detach_updates_credentials_for_later_exec_launches for live sandbox attach/detach behavior.
  • Local direct run was blocked by missing active gateway config.
  • Local Docker wrapper run built gateway/CLI, then stopped on the existing Darwin/arm64 requirement for a prebuilt Linux arm64 openshell-sandbox binary.

Checklist

  • Tests added or updated
  • Documentation intentionally excluded from scope per issue direction
  • No secrets or credentials committed

Closes #1171

Adds sandbox provider list, attach, and detach API/CLI support while keeping provider policy and credential resolution derived from current sandbox attachments.
Adds provider environment revisions and generation-scoped sandbox credential snapshots so future SSH and exec launches pick up provider attach, detach, and credential updates without mutating already-running processes.

Also blocks provider deletion while attached to prevent stale sandbox provider references.
@johntmyers added the test:e2e (Requires end-to-end coverage) label May 7, 2026
github-actions (bot) commented May 7, 2026

Label test:e2e applied for cbeebc8. Open the existing run and click Re-run all jobs to execute with the label set. The E2E Gate check on this PR will flip green automatically once the run finishes.

@johntmyers (Collaborator, Author) commented

Smoke Test Results

Ran manual smoke tests against a local Docker-backed gateway (0.0.37-dev.138+gb49f84c7) to exercise the new provider attach/detach UX. All tests passed.

Test 1: Create-time provider composition

  • Created smoke-github (github) and smoke-generic (generic) providers
  • Created sandbox with --provider smoke-github --provider smoke-generic
  • sandbox provider list showed both providers with correct type and credential key counts
  • sandbox get showed _provider_smoke_github rule in the effective policy with api.github.com:443 and github.com:443 endpoints + gh/git binaries from the github profile
  • Generic provider correctly contributed no network rules (no built-in profile)

Test 2: Post-creation attach + policy update

  • Created sandbox with no providers; baseline policy had no _provider_ rules
  • sandbox provider attach succeeded; _provider_smoke_github immediately appeared in the effective policy
  • Idempotent re-attach correctly reported "already attached" without error
  • Attaching a non-existent provider returned FailedPrecondition with a clear message

Test 3: Env reloading (credential placeholders in new processes)

  • Created sandbox with no providers; printenv GITHUB_TOKEN via sandbox exec returned exit 1 (not set)
  • Attached smoke-github; after one poll cycle (~5s), new exec saw GITHUB_TOKEN=openshell:resolve:env:v<revision>_GITHUB_TOKEN
  • Attached smoke-generic; CUSTOM_API_KEY=openshell:resolve:env:v<revision>_CUSTOM_API_KEY appeared in new processes
  • Both placeholders shared the same revision, confirming a single provider env snapshot

Test 4: Detach removes policy rules + credential placeholders

  • Detached smoke-github; _provider_smoke_github rule immediately removed from effective policy
  • New sandbox exec immediately showed GITHUB_TOKEN absent from process env
  • CUSTOM_API_KEY still present with an updated revision (env revision changed due to the provider list mutation)
  • Detached smoke-generic; CUSTOM_API_KEY also removed from new process env
  • Idempotent detach correctly reported "was not attached" without error
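
The revision behavior observed in Tests 3 and 4 (one revision per snapshot, and a new revision whenever the attachment list or a provider record changes) is consistent with deriving the revision from the attached set. The sketch below shows one way to do that with an FNV-1a hash over provider names and a per-record version counter. The `ProviderRecord` shape, the `version` field, and the choice of hash are assumptions for illustration; the source does not say how revisions are computed.

```rust
/// Hypothetical view of an attached provider record.
pub struct ProviderRecord {
    pub name: String,
    /// Assumed counter bumped whenever the record's credentials or config change.
    pub version: u64,
}

const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Fold `bytes` into a running 64-bit FNV-1a hash.
fn fnv1a(hash: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(hash, |h, b| (h ^ u64::from(*b)).wrapping_mul(FNV_PRIME))
}

/// Derive a revision from the attached providers. Records are sorted by name
/// first, so the result does not depend on attachment order.
pub fn provider_env_revision(attached: &[ProviderRecord]) -> u64 {
    let mut sorted: Vec<&ProviderRecord> = attached.iter().collect();
    sorted.sort_by(|a, b| a.name.cmp(&b.name));

    let mut h: u64 = FNV_OFFSET_BASIS;
    for record in sorted {
        h = fnv1a(h, record.name.as_bytes());
        h = fnv1a(h, &[0]); // separator so adjacent fields cannot run together
        h = fnv1a(h, &record.version.to_le_bytes());
    }
    h
}
```

A poll loop can then compare the revision it last applied with the one in the latest response and rebuild the credential snapshot only when they differ.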

Bonus: Provider delete guard

  • provider delete smoke-github while attached to two sandboxes returned FailedPrecondition identifying both blocking sandbox names
  • After deleting all sandboxes, provider delete succeeded

Environment

  • Gateway: docker-dev at http://127.0.0.1:18080
  • Version: 0.0.37-dev.138+gb49f84c7
  • Platform: macOS (darwin/arm64)

@johntmyers added and removed the test:e2e (Requires end-to-end coverage) label May 7, 2026
github-actions (bot) commented May 7, 2026

Label test:e2e applied for dfbc35a. Open the existing run and click Re-run all jobs to execute with the label set. The E2E Gate check on this PR will flip green automatically once the run finishes.

Development

Successfully merging this pull request may close these issues.

feat: support provider attach and detach for running sandboxes