Host RPC timeout for environmentExecute ignores per-call timeoutMs (caps at 30s) #8
Summary
executePluginEnvironmentCommand in server/src/services/plugin-environment-driver.ts calls the worker manager without forwarding params.timeoutMs, so the host RPC timeout always falls back to DEFAULT_RPC_TIMEOUT_MS (30 s). Any environment-driver execute call that legitimately runs longer than 30 s (e.g. a claude inference exec via the K8s or e2b sandbox provider) is killed by the host even though the plugin's worker eventually returns the correct result.
The worker manager already supports a per-call timeout (callInternal at plugin-worker-manager.ts:1004-1044, with Math.min(timeoutMs ?? rpcTimeoutMs, MAX_RPC_TIMEOUT_MS)), so this is purely a missing forward at the call site.
Symptom
Plugin-worker logs show:
i.e. the worker DID finish and respond, but the host had already abandoned the request 30 s in.
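To make the timing concrete, here is a minimal sketch of how the effective host timeout resolves, assuming the clamp quoted in the Summary. resolveRpcTimeout is a hypothetical stand-in for the expression inside callInternal, not the actual implementation; only the two constant values and the Math.min expression come from this issue.

```typescript
// Constant values as quoted in this issue.
const DEFAULT_RPC_TIMEOUT_MS = 30_000;
const MAX_RPC_TIMEOUT_MS = 5 * 60 * 1000;

// Hypothetical helper mirroring the clamp described for callInternal:
// use the per-call timeout if given, else the default, and never exceed the max.
function resolveRpcTimeout(
  timeoutMs: number | undefined,
  rpcTimeoutMs: number = DEFAULT_RPC_TIMEOUT_MS,
): number {
  return Math.min(timeoutMs ?? rpcTimeoutMs, MAX_RPC_TIMEOUT_MS);
}

// Today: the call site passes no timeout, so the host gives up at 30 s.
const withoutForward = resolveRpcTimeout(undefined);

// With the forward: a plugin configured for 300 000 ms gets its full budget.
const withForward = resolveRpcTimeout(300_000);

// Oversized values are still clamped to the 5-minute ceiling.
const clamped = resolveRpcTimeout(600_000);

console.log({ withoutForward, withForward, clamped });
```

Under these assumptions, a ~35 s exec exceeds the 30 s budget only in the first case, which matches the symptom above.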
Affected
- @farhoodlabs/paperclip-plugin-k8s (and paperclip-adapter-claude-k8s) running real claude inference inside the lease pod
- @paperclipai/plugin-e2b would hit this for any command that runs >30 s in the sandbox (same pattern: it declares timeoutMs in config, defaults 300 000, threads it through params.timeoutMs ?? config.timeoutMs to the SDK call, but the host RPC timeout never sees that value)
Reproduction
1. Set execTimeoutMs: 300000.
2. Run a claude inference call (~25–35 s wall time).
3. Observe RPC call "environmentExecute" timed out after 30000ms even though the lease pod's claude process exited cleanly with stdout.
Minimal fix
server/src/services/plugin-environment-driver.ts:205 (the line that compiles to the call below):
PluginEnvironmentExecuteParams.timeoutMs already exists in the protocol (packages/plugins/sdk/src/protocol.ts), and MAX_RPC_TIMEOUT_MS = 5 * 60 * 1000 in plugin-worker-manager.ts:61 will continue to enforce a sane upper bound. Plugins that don't pass a timeoutMs keep the existing 30 s default.
Notes
- The same forward could be applied to the other executePluginEnvironment* calls (Acquire, Resume, RealizeWorkspace) for consistency, since they can also legitimately run > 30 s when an init container has to chown a large PVC or the sandbox provisions for the first time. Up to you whether to do them in the same PR.
- Not related to the tarExcludeFlags discussion in sandbox-managed-runtime.ts; that flow is correct as designed.
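For reference, a minimal sketch of the forward described under Minimal fix. The WorkerManagerLike interface, its call signature, the ExecuteParams shape, and the executeCommand helper are all assumptions made for illustration; the real worker manager API and the actual line in plugin-environment-driver.ts may look different.

```typescript
// Hypothetical shapes for illustration; the real types live in the plugin SDK and server.
interface ExecuteParams {
  command: string;
  timeoutMs?: number;
}

interface WorkerManagerLike {
  // Assumed signature: an optional per-call timeout as the last argument.
  call(
    pluginId: string,
    method: string,
    params: unknown,
    timeoutMs?: number,
  ): Promise<unknown>;
}

// Forward params.timeoutMs so the host RPC timeout matches the plugin's own budget.
// When it is undefined, the manager is expected to fall back to its default.
function executeCommand(
  manager: WorkerManagerLike,
  pluginId: string,
  params: ExecuteParams,
): Promise<unknown> {
  return manager.call(pluginId, "environmentExecute", params, params.timeoutMs);
}

// Fake manager that only records the timeout it was handed.
const seenTimeouts: Array<number | undefined> = [];
const fakeManager: WorkerManagerLike = {
  call(_pluginId, _method, _params, timeoutMs) {
    seenTimeouts.push(timeoutMs);
    return Promise.resolve({ exitCode: 0 });
  },
};

// The first call forwards 300000; the second leaves the timeout undefined.
void executeCommand(fakeManager, "k8s", { command: "claude", timeoutMs: 300_000 });
void executeCommand(fakeManager, "k8s", { command: "echo hi" });
console.log(seenTimeouts);
```

The same pattern would carry over to Acquire, Resume, and RealizeWorkspace if their params also carry a timeoutMs, which this sketch does not verify.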