diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..8b13789 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ + diff --git a/README.md b/README.md index b7a5dda..a55e26b 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,138 @@ -# skills -Code Ocean skills +# Code Ocean Skills + +Skills for AI coding agents to interact with the [Code Ocean](https://codeocean.com) computational research platform. + +## Available Skills + +### `codeocean` + +Teaches AI agents how to use the Code Ocean API through three access methods: + +- **MCP Server** (26 tools) — primary interface for AI agents +- **Python SDK** (`codeocean` package) — for writing Python scripts +- **REST API** (curl/wget) — for shell commands and manual API calls + +Covers capsules, pipelines, computations, data assets, custom metadata, authentication, permissions, search, and pagination. + +## Prerequisites + +1. A Code Ocean account with API access +2. An API access token (generated from Account > Access Tokens) +3. One or more of: + - The Code Ocean MCP server (`codeocean-mcp-server`) for agent-based interaction + - The Python SDK (`pip install codeocean`) for programmatic access + - `curl` for direct REST API calls + +## Quick Start + +### 1. Generate an API Token + +1. Sign into your Code Ocean instance +2. Go to **Account > Access Tokens > Generate New Token** +3. Select scopes (Capsule Read/Write, Datasets Read/Write) +4. Copy the token immediately — it is only shown once + +### 2. Install the MCP Server (for AI agents) + +Install `uv` and Python 3.10+, then configure your agent. Example for Claude Desktop: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://your-instance.codeocean.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +See `codeocean/references/mcp-server-install.md` for configs for VS Code, Cline, Roo Code, Cursor, and Windsurf. + +### 3. 
Install the Skill + +- This skill lives in the [`codeocean/skills`](https://github.com/codeocean/skills) repository. +- The skill folder is at [``](https://github.com/codeocean/skills/tree/main/). The entry point is `SKILL.md`. +- **Install the entire skill folder**, not just `SKILL.md` — supporting files (references, templates, examples) are referenced by the entry point. +- The fastest cross-agent option is [`gh skill`](#github-cli-universal-installer) if available. + +| Agent | Install method | Exact path / command | Notes | +| ----------------------------------- | ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Claude Code** | Manual folder copy | Project: `.claude/skills//`
User: `~/.claude/skills//`
Plugin: `/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). Claude watches these directories automatically. | +| **Codex** | A. Manual folder copy
B. Plugin install | A. `$CWD/.agents/skills//`
`$REPO_ROOT/.agents/skills//`
`$HOME/.agents/skills//`
`/etc/codex/skills//`
B. In-app: add from plugin directory
CLI: `/plugins` → Install plugin | Clone/download from `codeocean/skills`, copy the folder at ``. Codex supports skills natively; packaged distribution is often via plugins. | +| **Cursor** | Plugin-first | Install plugin from marketplace / team marketplace | No official direct raw GitHub skill install documented. Use `gh skill` row below for GitHub-based install. | +| **OpenCode** | Manual folder copy | `.opencode/skills//`
`~/.config/opencode/skills//`
Also compatible:
`.claude/skills//`
`~/.claude/skills//`
`.agents/skills//`
`~/.agents/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). | +| **Antigravity** | Manual folder copy | `.agents/skills//`
`~/.gemini/antigravity/skills//` | Defaults to `.agents/skills`. Copy full folder. | +| **Windsurf** | Manual folder copy | `.windsurf/skills//`
`~/.codeium/windsurf/skills//`
Enterprise: macOS `/Library/Application Support/Windsurf/skills/`, Linux/WSL `/etc/windsurf/skills/`, Windows `C:\ProgramData\Windsurf\skills\` | Each skill is a subdirectory containing `SKILL.md`. Copy full folder. | +| **GitHub Copilot CLI** | Manual folder copy | Project: `.github/skills//`, `.claude/skills//`, `.agents/skills//`
Personal: `~/.copilot/skills//`, `~/.claude/skills//`, `~/.agents/skills//` | Clone/download from `codeocean/skills`, copy the full folder at ``. | +| **VS Code / Copilot agent plugins** | Plugin from Git source | Run `Chat: Install Plugin From Source` → enter Git repo URL | This is for **plugins**, not raw skill folders. Applies only if the skill is wrapped as a plugin. Does not apply to raw skill repos like `codeocean/skills`. | +| **Gemini CLI** | Native GitHub install | `gemini skills install https://github.com/codeocean/skills.git --path `
`gemini skills install /path/to/local/ --scope workspace`
`gemini skills link /path/to/local/ --scope workspace` | Supports Git repo, local dir, zipped `.skill`, monorepo subpath, workspace/user scope. Use `--path` for monorepo subpath. | +| **Cline** | Manual folder copy | `.cline/skills//`
`~/.cline/skills//` | Enable Skills in Settings → Features → Enable Skills. Experimental. | +| **Kiro IDE** | Native GitHub import | Agent Steering & Skills → `+` → Import a skill → GitHub → paste `https://github.com/codeocean/skills/tree/main/` | URL must point to the subdirectory, not the repo root. Imported skills are copied into the skills directory. | +| **Kiro CLI** | Manual folder copy | `.kiro/skills//`
`~/.kiro/skills//` | Default agent auto-loads skills. Custom agents need `skill://` resources configured. | +| **`gh skill` (GitHub CLI)** | Universal GitHub install | `gh skill install codeocean/skills `
`gh skill install codeocean/skills --agent claude-code`
`gh skill install codeocean/skills --agent cursor`
`gh skill install codeocean/skills --agent codex`
`gh skill install codeocean/skills --agent gemini`
`gh skill install codeocean/skills --agent antigravity` | Installs to the correct host directory automatically. Can pin versions/commits. Cleanest cross-agent GitHub-hosted option. | + +#### Shared patterns + +- **Manual folder copy**: Claude Code, Codex, OpenCode, Antigravity, Windsurf, GitHub Copilot CLI, Cline, Kiro CLI — copy the skill directory (containing `SKILL.md` and supporting files) into the agent's watched skills path. +- **Native GitHub import/install**: Gemini CLI (`gemini skills install`), Kiro IDE (GitHub import UI) — install directly from `codeocean/skills` repo. +- **Plugin-first**: Cursor, VS Code / Copilot agent plugins — skill distribution is via marketplace or Git-source plugins, not raw skill folders. +- **Universal GitHub installer**: `gh skill install codeocean/skills ` — routes to the correct agent directory automatically; works across Claude Code, Codex, Cursor, Gemini, Antigravity. + +#### References + +- [Claude Code — Skills](https://code.claude.com/docs/en/skills) +- [Codex — Skills](https://developers.openai.com/codex/skills) +- [Codex — Plugins](https://developers.openai.com/codex/plugins) +- [OpenCode — Skills](https://opencode.ai/docs/skills) +- [Antigravity — Skills](https://antigravity.google/docs/skills) +- [Windsurf — Skills](https://docs.windsurf.com/windsurf/cascade/skills) +- [GitHub Copilot CLI — Skills](https://docs.github.com/en/copilot/how-tos/copilot-cli/customize-copilot/add-skills) +- [VS Code — Agent plugins](https://code.visualstudio.com/docs/copilot/customization/agent-plugins) +- [Gemini CLI — Skills](https://geminicli.com/docs/cli/skills/) +- [Cline — Skills](https://docs.cline.bot/customization/skills) +- [Kiro IDE — Skills](https://kiro.dev/docs/skills/) +- [Kiro CLI — Skills](https://kiro.dev/docs/cli/skills/) +- [`gh skill` — GitHub CLI](https://github.blog/changelog/2026-04-16-manage-agent-skills-with-github-cli/) + +## Skill Structure + +``` +codeocean/ +├── SKILL.md # Main skill — workflows, decision 
tree, concepts +└── references/ + ├── mcp-guide.md # MCP workflow patterns and anti-patterns + ├── mcp-tools-catalog.md # All 26 MCP tools with parameter schemas + ├── mcp-server-install.md # Install configs for 6 editors/agents + ├── cli-guide.md # curl/wget endpoint reference + ├── sdk-guide.md # Python SDK setup and examples + ├── setup-and-auth.md # Token generation and environment variables + ├── capsules.md # Capsule data model and operations + ├── pipelines.md # Pipeline-specific operations + ├── computations.md # Running, waiting, result retrieval + ├── data-assets.md # Data asset creation and lifecycle + ├── custom-metadata.md # Admin-defined metadata schema + ├── search-and-pagination.md # Query syntax and pagination + └── permissions.md # User/group access control +``` + +## How It Works + +The skill uses **progressive disclosure**: + +1. **SKILL.md** loads when the skill triggers — contains workflow patterns and a decision tree +2. **Reference files** load on demand — agents only read the files relevant to their current task + +This keeps context window usage efficient while providing deep coverage of the entire Code Ocean API. + +## Links + +- [Code Ocean User Guide](https://docs.codeocean.com/user-guide) — platform documentation +- [Code Ocean API Documentation](https://docs.codeocean.com/user-guide/code-ocean-api) — REST API reference +- [Code Ocean Python SDK](https://github.com/codeocean/codeocean-sdk-python) — GitHub repo +- [Code Ocean MCP Server](https://github.com/codeocean/codeocean-mcp-server) — GitHub repo diff --git a/codeocean/SKILL.md b/codeocean/SKILL.md new file mode 100644 index 0000000..8078710 --- /dev/null +++ b/codeocean/SKILL.md @@ -0,0 +1,180 @@ +--- +name: codeocean +description: "Guide for interacting with the Code Ocean computational research platform via its MCP server (25 tools), Python SDK, and REST API (curl). 
Use when an agent or user needs to: (1) search, run, or manage capsules and pipelines, (2) manage computations and retrieve results, (3) create, search, or manage data assets, (4) write Python scripts using the codeocean SDK, (5) guide users through curl/REST API calls to Code Ocean, (6) set up Code Ocean API authentication and MCP server configuration, or (7) orchestrate end-to-end computational workflows on Code Ocean." +--- + +# Code Ocean Skill + +## 1. Overview + +Code Ocean resources covered by this skill: + +- **Capsules**: runnable computational units. +- **Pipelines**: multi-step workflows with their own `/pipelines/...` API surface. +- **Computations**: capsule or pipeline runs. +- **Data Assets**: immutable datasets, results, combined assets, and models. +- **Custom Metadata**: deployment-defined schema for data assets. + +Access methods and supporting references: + +| Method | When to use | Reference | +| ---------------------------- | -------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ | +| **MCP Server** (25 tools) | Primary for agentic workflows | [mcp-guide.md](references/mcp-guide.md), [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | +| **Python SDK** (`codeocean`) | Python scripts and typed integrations | [sdk-guide.md](references/sdk-guide.md) | +| **REST API** (curl) | Shell automation and raw HTTP | [cli-guide.md](references/cli-guide.md) | +| **Setup/Auth** | Token generation, compatibility, MCP install | [setup-and-auth.md](references/setup-and-auth.md), [mcp-server-install.md](references/mcp-server-install.md) | +| **Errors** | Interpreting failures and choosing the next action | [errors-http-and-sdk.md](references/errors-http-and-sdk.md), [errors-resource-states.md](references/errors-resource-states.md) | +| **User Guide Concepts** | Short product-level meanings from the user guide | [user-guide/](references/user-guide/) | + 
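All three API access methods resolve against the same `/api/v1` base URL. As a quick orientation, a stdlib-only sketch (the helper name is illustrative, not part of the SDK) of how a deployment domain maps to that base:

```python
from urllib.parse import urlparse

def api_base_url(domain: str) -> str:
    # Accept "codeocean.acme.com" or "https://codeocean.acme.com";
    # every REST route referenced in this skill hangs off /api/v1.
    if "//" not in domain:
        domain = f"https://{domain}"
    parsed = urlparse(domain)
    return f"{parsed.scheme}://{parsed.netloc}/api/v1"

print(api_base_url("https://codeocean.acme.com"))  # https://codeocean.acme.com/api/v1
```

The same domain/token pair drives the MCP server (via `CODEOCEAN_DOMAIN`/`CODEOCEAN_TOKEN`), the SDK client, and raw curl calls.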
+## 2. Core Workflows + +### Workflow 1: Find and Run a Capsule via MCP + +1. `search_capsules(search_params={...})` +2. `get_capsule(capsule_id)` if you need full metadata +3. `get_capsule_app_panel(capsule_id)` before running +4. `run_capsule(run_params={...})` +5. `wait_until_completed(computation_id)` +6. `list_computation_results(computation_id)` +7. `get_result_file_urls(...)` or `download_and_read_a_file_from_computation(...)` + +Minimal run payload: + +```json +{ + "capsule_id": "", + "data_assets": [{ "id": "", "mount": "" }], + "parameters": ["value1", "value2"], + "named_parameters": [{ "param_name": "threshold", "value": "0.5" }] +} +``` + +### Workflow 2: Capture Results as a Data Asset + +1. `create_data_asset(data_asset_params={...})` +2. `wait_until_ready(data_asset_object)` +3. `get_data_asset(data_asset_id)` if you need the refreshed object + +`wait_until_ready` takes the full `DataAsset` object in both MCP and SDK, not just the ID. + +### Workflow 3: Run a Pipeline + +Current MCP support is partial: + +1. `search_pipelines(search_params={...})` to find the pipeline +2. Use SDK or REST for guaranteed pipeline metadata/app-panel access via `/pipelines/...` +3. `run_capsule(run_params={pipeline_id: ..., processes: [...]})` +4. `wait_until_completed(...)` +5. `list_computation_results(...)` + +Pipeline run payload: + +```json +{ + "pipeline_id": "", + "data_assets": [{ "id": "", "mount": "" }], + "processes": [ + { "name": "step1", "parameters": ["val1"] }, + { + "name": "step2", + "named_parameters": [{ "param_name": "k", "value": "v" }] + } + ] +} +``` + +### Workflow 4: Explore Data Assets + +1. `search_data_assets(search_params={query: "...", type: "dataset"})` +2. `get_data_asset(data_asset_id)` +3. `list_data_asset_files(data_asset_id, path="")` +4. 
`get_data_asset_file_urls(...)` or `download_and_read_a_file_from_data_asset(...)` + +### Workflow 5: Attach and Detach Data Assets + +| Context | Attach | Detach | +| ----------------------------- | --------------------------------------------------------------------- | ------------------------------------------------------------------- | +| Capsule | `attach_data_assets(capsule_id, attach_params=[...])` | `detach_data_assets(capsule_id, data_assets=[...])` | +| Cloud workstation computation | `attach_computation_data_assets(computation_id, attach_params=[...])` | `detach_computation_data_assets(computation_id, data_assets=[...])` | + +Attach expects objects like `{id, mount?}`. Detach expects plain ID strings. + +## 3. Key Concepts + +### RunParams + +Key fields: `capsule_id` or `pipeline_id`, plus `data_assets`, `parameters`, `named_parameters`, and `processes` (for pipelines). Verify exact fields against the installed SDK version or live MCP tool schema. + +### Search + +`query` supports free text plus `field:value` filters. Structured filters like `type`, `ownership`, `origin` are separate params — do not put them inside the `query` string. See [search-and-pagination.md](references/search-and-pagination.md) for patterns. + +### MCP Compact Search Format + +MCP search responses use abbreviated field names (`n`=name, `d`=description, `t`=tags, `s`=slug). Descriptions are truncated to 200 characters. Tags are limited to 10 entries. Pass `include_field_names=true` to get the mapping. + +### File Reading Limit + +The MCP `download_and_read_*` helpers read and decode the first `50_000` bytes of the remote file response. Use `get_*_file_urls` when you need the complete file. 
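Because of the 50,000-byte cap, an agent should check for truncation before treating the decoded text as the whole file. A minimal sketch of that check (the helper and constant names are illustrative):

```python
MCP_READ_LIMIT = 50_000  # byte cap applied by the download_and_read_* helpers

def read_preview(raw: bytes, limit: int = MCP_READ_LIMIT) -> tuple[str, bool]:
    # Decode at most `limit` bytes and report whether anything was cut off.
    truncated = len(raw) > limit
    return raw[:limit].decode("utf-8", errors="replace"), truncated

_, truncated = read_preview(b"x" * 60_000)
# truncated is True here, so fall back to get_*_file_urls and fetch the full file.
```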
+ +### Error Interpretation + +Code Ocean failures come from two different layers: + +- HTTP/API failures: `400`, `401`, `403`, `404`, `429`, `5xx`, SDK `Error`, curl non-2xx responses +- Resource-state failures: a computation or data asset request succeeds, but the returned object is in a failed terminal state + +When an agent sees an error, it should first classify it: + +- If the request itself failed, load [errors-http-and-sdk.md](references/errors-http-and-sdk.md) +- If the request succeeded but the resource state is bad, load [errors-resource-states.md](references/errors-resource-states.md) + +### Compatibility Note + +Public docs, the installed SDK, and the MCP server may target different Code Ocean versions. When they disagree, prefer the locally installed SDK/MCP source for current callable names and payload shapes, and mention the mismatch explicitly. Check `pip show codeocean` and the MCP server's live tool schemas for the ground truth. + +## 4. Reference Index + +| File | Load when... 
| +| ----------------------------------------------------------------------------- | ----------------------------------------------------------- | +| [mcp-guide.md](references/mcp-guide.md) | MCP workflow patterns and caveats | +| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | MCP tool grouping and anti-patterns (live schemas are truth)| +| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, patterns, and examples | +| [cli-guide.md](references/cli-guide.md) | curl route patterns and payloads | +| [capsules.md](references/capsules.md) | Capsule workflows, search, app panel, permissions | +| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | +| [computations.md](references/computations.md) | Run workflows, polling, results, cloud workstations | +| [data-assets.md](references/data-assets.md) | Data asset workflows and lifecycle | +| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination patterns | +| [permissions.md](references/permissions.md) | Permissions patterns and routes | +| [custom-metadata.md](references/custom-metadata.md) | Custom metadata usage patterns | +| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | +| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | +| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | +| [mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | +| [user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | +| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | +| [user-guide/data-asset.md](references/user-guide/data-asset.md) | Product-level definition of a Data Asset | +| [user-guide/computation.md](references/user-guide/computation.md) | Product-level 
definition of a Computation | +| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | +| [user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | + +## 5. Version Awareness + +Code Ocean is deployed at different versions across customer environments. The data model (field names, enum values, method signatures) can change between releases. + +**Rules for agents:** + +- **MCP tools**: tool schemas come from the running MCP server at connection time. Use those live schemas as the source of truth for tool names, parameter names, and types. Do not rely on hardcoded field lists in this skill. +- **Python SDK**: if you need to verify a model's fields or an import path, inspect the installed SDK source rather than trusting this skill's examples. Run `python -c "import codeocean; print(codeocean.__version__)"` to check the version, and read the SDK model files directly if needed. +- **REST API**: route structure (`/api/v1/...`) and auth patterns are stable. For exact field-level details, consult the user guide for the customer's deployed version. +- **User Guide**: the canonical reference for the customer's version is always [docs.codeocean.com/user-guide](https://docs.codeocean.com/user-guide). When in doubt about a field, enum, or behavior, check there first. + +This skill provides **workflow patterns, error interpretation, gotchas, and integration guidance** — not a substitute for version-specific API documentation. + +## 6. 
External Links + +- [Code Ocean User Guide](https://docs.codeocean.com/user-guide) +- [Code Ocean API Reference](https://docs.codeocean.com/user-guide/code-ocean-api) +- [Code Ocean MCP Server](https://github.com/codeocean/codeocean-mcp-server) +- [Code Ocean Python SDK](https://github.com/codeocean/codeocean-sdk-python) diff --git a/codeocean/references/capsules.md b/codeocean/references/capsules.md new file mode 100644 index 0000000..662cae5 --- /dev/null +++ b/codeocean/references/capsules.md @@ -0,0 +1,129 @@ +# Capsules Reference + +For the full field-level Capsule data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/capsule) or inspect the installed SDK source. This file covers workflow patterns and gotchas. + +## Search + +MCP: + +```text +search_capsules(search_params={query: "RNA-seq", limit: 10}) +``` + +SDK: + +```python +from codeocean.capsule import CapsuleSearchParams + +results = client.capsules.search_capsules( + CapsuleSearchParams(query="RNA-seq", limit=10) +) +``` + +Query supports `field:value` syntax (e.g., `name:`, `tag:`, `doi:`). Structured params like `ownership`, `status`, `favorite`, `archived`, `sort_field`, `sort_order`, and `filters` are separate from `query` — do not embed them in the query string. + +## Get Capsule + +SDK: + +```python +capsule = client.capsules.get_capsule(capsule_id) +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +MCP: + +```text +get_capsule(capsule_id) +``` + +## App Panel + +SDK: + +```python +app_panel = client.capsules.get_capsule_app_panel(capsule_id, version=None) +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID/app_panel" +``` + +The app panel describes the capsule's runnable interface: general info, data asset slots, parameter definitions, and result structure. Always check the app panel before running a capsule to understand required inputs. 
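Since the app panel lists parameter definitions, a pre-flight check can catch missing required inputs before a run is submitted. A sketch under a simplified, assumed panel shape (the real schema is version-dependent; verify against your deployment):

```python
def missing_required(app_panel: dict, named_parameters: dict) -> list[str]:
    # Assumed simplified shape: {"parameters": [{"name": ..., "required": ...}]}.
    required = [p["name"] for p in app_panel.get("parameters", []) if p.get("required")]
    return [name for name in required if name not in named_parameters]

panel = {"parameters": [{"name": "threshold", "required": True},
                        {"name": "label", "required": False}]}
print(missing_required(panel, {"label": "run-1"}))  # ['threshold']
```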
+ +## Computations for a Capsule + +SDK: + +```python +computations = client.capsules.list_computations(capsule_id) +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID/computations" +``` + +MCP: + +```text +list_computations(capsule_id) +``` + +## Attach and Detach Data Assets + +Attach expects objects with `id` (and optional `mount`). Detach expects plain ID strings. + +```python +from codeocean.data_asset import DataAssetAttachParams + +client.capsules.attach_data_assets( + capsule_id, + [DataAssetAttachParams(id="data-asset-uuid", mount="input")], +) +``` + +```python +client.capsules.detach_data_assets(capsule_id, ["data-asset-uuid"]) +``` + +REST routes: + +- `POST /capsules/{id}/data_assets` with a list of attach objects +- `DELETE /capsules/{id}/data_assets` with a list of ID strings + +## Permissions + +Permissions are SDK/REST only, not MCP. + +SDK methods: + +- `client.capsules.get_permissions(capsule_id)` +- `client.capsules.update_permissions(capsule_id, permissions)` + +REST routes: + +- `GET /capsules/{id}/permissions` +- `POST /capsules/{id}/permissions` + +For permission model types and roles, see [permissions.md](permissions.md). + +## Archive and Delete + +SDK: + +- `client.capsules.archive_capsule(capsule_id, archive=True)` +- `client.capsules.delete_capsule(capsule_id)` + +REST: + +- `PATCH /capsules/{id}/archive?archive=true` +- `DELETE /capsules/{id}` diff --git a/codeocean/references/cli-guide.md b/codeocean/references/cli-guide.md new file mode 100644 index 0000000..b10924d --- /dev/null +++ b/codeocean/references/cli-guide.md @@ -0,0 +1,356 @@ +# Code Ocean CLI Guide + +Base URL: `https://{domain}/api/v1/` + +## Authentication + +Code Ocean uses HTTP Basic Auth with the access token as the username and an empty password. 
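The `-u "$TOKEN:"` flag in the curl examples is plain HTTP Basic auth. For clients that build the header by hand, a stdlib sketch of the equivalent `Authorization` value (the helper name is illustrative):

```python
import base64

def basic_auth_header(token: str) -> str:
    # curl -u "$TOKEN:" sends Basic auth: token as username, empty password.
    credentials = base64.b64encode(f"{token}:".encode()).decode()
    return f"Basic {credentials}"

print(basic_auth_header("cop_xxxxx"))
```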
+ +```bash +export CODEOCEAN_DOMAIN="https://codeocean.acme.com" +export CODEOCEAN_TOKEN="cop_xxxxx" + +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/custom_metadata" +``` + +For JSON request bodies, add `-H "Content-Type: application/json"`. + +For status-code meanings and how to react to failures, also load [errors-http-and-sdk.md](errors-http-and-sdk.md). + +## Capsules + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"my capsule\" tag:genomics","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/search" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/app_panel" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/computations" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/data_assets" +``` + +Permissions: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/permissions" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"users":[{"email":"user@example.com","role":"editor"}],"everyone":"discoverable","share_assets":true}' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/permissions" +``` + +Archive: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/archive?archive=true" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +## Pipelines + +Pipeline routes use `/pipelines/...`. 
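The pipeline surface mirrors the capsule surface with `/pipelines` substituted for `/capsules`. A small sketch of that symmetry (the helper and the route subset are illustrative, not an exhaustive route list):

```python
def resource_routes(kind: str, resource_id: str) -> dict:
    # kind is "capsules" or "pipelines"; both families share this shape.
    base = f"/api/v1/{kind}/{resource_id}"
    return {
        "get": base,
        "app_panel": f"{base}/app_panel",
        "computations": f"{base}/computations",
        "data_assets": f"{base}/data_assets",
    }

print(resource_routes("pipelines", "PIPELINE_ID")["app_panel"])
# /api/v1/pipelines/PIPELINE_ID/app_panel
```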
+ +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"my pipeline\"","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/search" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/app_panel" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/computations" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/data_assets" +``` + +## Computations + +Run a capsule: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "capsule_id":"CAPSULE_ID", + "parameters":["value1","value2"], + "named_parameters":[{"param_name":"threshold","value":"0.5"}], + "data_assets":[{"id":"DATA_ASSET_ID","mount":"input_data"}] + }' \ + "$CODEOCEAN_DOMAIN/api/v1/computations" +``` + +Run a pipeline: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "pipeline_id":"PIPELINE_ID", + "processes":[ + {"name":"process_name","parameters":["value1"]}, + {"name":"process_name_2","named_parameters":[{"param_name":"param1","value":"value1"}]} + ] + }' \ + "$CODEOCEAN_DOMAIN/api/v1/computations" +``` + +Get a computation: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID" +``` + +List results uses `POST` with a JSON body containing `path`: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/results" +``` + +Get result file URLs: + 
+```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/results/urls?path=output.csv" +``` + +Rename: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID?name=my-analysis-run" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID" +``` + +Cloud workstation data assets: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID","mount":"work"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/data_assets" +``` + +## Data Assets + +Search: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"reference genome\"","type":"dataset","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/search" +``` + +Get: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +Create from computation results: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "name":"Analysis Results", + "mount":"analysis_results", + "tags":["rna-seq","results"], + "description":"Output from RNA-seq pipeline", + "source":{"computation":{"id":"COMPUTATION_ID"}} + }' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets" +``` + +Create from S3: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "name":"External Reference Data", + "mount":"reference", + "tags":["reference"], + "description":"Reference genome from S3", + "source":{"aws":{"bucket":"my-bucket","prefix":"reference/genome/"}} + }' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets" +``` + +Update metadata: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PUT \ + -H 
"Content-Type: application/json" \ + -d '{"name":"Updated Name","description":"Updated description","tags":["updated-tag"]}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +List files uses `POST` with `{"path": ...}`: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/files" +``` + +Get file URLs: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/files/urls?path=data.csv" +``` + +Permissions: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/permissions" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"users":[{"email":"user@example.com","role":"editor"}],"groups":[{"group":"research-team","role":"viewer"}],"everyone":"discoverable","share_assets":true}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/permissions" +``` + +Archive: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/archive?archive=true" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +Transfer: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"target":{"aws":{"bucket":"new-bucket","prefix":"new-prefix/"}},"force":false}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/transfer" +``` + +## Shell Workflow Example + +```bash +#!/usr/bin/env bash +set -euo pipefail + +DOMAIN="$CODEOCEAN_DOMAIN" +TOKEN="$CODEOCEAN_TOKEN" + +CAPSULE_ID=$(curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"RNA-seq Analysis\"","limit":1}' \ + "$DOMAIN/api/v1/capsules/search" | jq -r '.results[0].id') + +COMPUTATION_ID=$(curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d 
"{\"capsule_id\":\"$CAPSULE_ID\",\"parameters\":[\"hg38\"]}" \ + "$DOMAIN/api/v1/computations" | jq -r '.id') + +while true; do + STATE=$(curl -s -u "$TOKEN:" "$DOMAIN/api/v1/computations/$COMPUTATION_ID" | jq -r '.state') + if [ "$STATE" = "completed" ] || [ "$STATE" = "failed" ]; then + break + fi + sleep 5 +done + +curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$DOMAIN/api/v1/computations/$COMPUTATION_ID/results" | jq '.items[].path' + +URL=$(curl -s -u "$TOKEN:" \ + "$DOMAIN/api/v1/computations/$COMPUTATION_ID/results/urls?path=output.csv" | jq -r '.download_url') +curl -o output.csv "$URL" +``` diff --git a/codeocean/references/computations.md b/codeocean/references/computations.md new file mode 100644 index 0000000..14159ec --- /dev/null +++ b/codeocean/references/computations.md @@ -0,0 +1,88 @@ +# Computations Reference + +For the full field-level Computation data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/computation) or inspect the installed SDK source. This file covers workflow patterns and gotchas. + +## Running + +MCP: + +```text +run_capsule({ + capsule_id: "capsule-uuid", + parameters: ["val1", "val2"], + data_assets: [{id: "data-uuid", mount: "data"}] +}) +``` + +SDK: + +```python +from codeocean.computation import RunParams, DataAssetsRunParam + +computation = client.computations.run_capsule( + RunParams( + capsule_id="capsule-uuid", + parameters=["val1", "val2"], + data_assets=[DataAssetsRunParam(id="data-uuid", mount="data")], + ) +) +``` + +Pipeline runs also use `run_capsule(...)`, with `pipeline_id` instead of `capsule_id`. 
+ +## Waiting for Completion + +MCP: + +```text +wait_until_completed(computation_id) +``` + +SDK: + +```python +completed = client.computations.wait_until_completed( + computation, + polling_interval=5, + timeout=300, +) +``` + +**Gotcha — SDK vs MCP difference**: the SDK method takes a full `Computation` object; the MCP tool takes a `computation_id` (it internally fetches the object first). Minimum polling interval is `5`. + +**Gotcha — terminal does not mean successful**: after polling, always inspect `state`, `end_status`, and `exit_code`. See [errors-resource-states.md](errors-resource-states.md) for interpretation. + +## Result Files + +List results: + +- SDK: `client.computations.list_computation_results(computation_id, path="")` +- REST: `POST /computations/{id}/results` with `{"path": ""}` +- MCP: `list_computation_results(computation_id)` + +Get file URLs: + +- SDK: `client.computations.get_result_file_urls(computation_id, path)` +- REST: `GET /computations/{id}/results/urls?path=...` +- MCP: `get_result_file_urls(computation_id, file_path)` +- Return shape: `{download_url, view_url}` + +Read file content: + +- MCP: `download_and_read_a_file_from_computation(computation_id, file_path)` +- Reads and decodes the first `50_000` bytes only + +## Rename and Delete + +- SDK: `rename_computation(computation_id, name)`, `delete_computation(computation_id)` +- REST: `PATCH /computations/{id}?name=...`, `DELETE /computations/{id}` + +## Cloud Workstation Data Assets + +Use computation-level attach/detach APIs for cloud workstation sessions: + +- SDK: `client.computations.attach_data_assets(computation_id, attach_params)`, `client.computations.detach_data_assets(computation_id, data_assets)` +- REST: `POST /computations/{id}/data_assets`, `DELETE /computations/{id}/data_assets` +- MCP: `attach_computation_data_assets(...)`, `detach_computation_data_assets(...)` + +**Gotcha**: attach expects objects with `{id, mount?}`. Detach expects plain ID strings. 
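
The shape difference can be sketched as plain JSON payloads (placeholder ID), matching the curl examples in the top-level README:

```python
import json

# Attach: a list of objects, each with a required "id" and an optional "mount".
attach_payload = json.dumps([{"id": "DATA_ASSET_ID", "mount": "work"}])

# Detach: a bare list of ID strings -- no wrapping objects.
detach_payload = json.dumps(["DATA_ASSET_ID"])
```

Sending the attach shape to the detach route (or vice versa) is a common source of `400 Bad Request` responses.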
diff --git a/codeocean/references/custom-metadata.md b/codeocean/references/custom-metadata.md new file mode 100644 index 0000000..bfb2d77 --- /dev/null +++ b/codeocean/references/custom-metadata.md @@ -0,0 +1,49 @@ +# Custom Metadata Reference + +Custom metadata is a deployment-defined schema that governs which metadata fields are available on data assets. For exact type definitions and field names, consult the installed SDK source or the [user guide](https://docs.codeocean.com/user-guide). + +## Get Schema + +SDK: + +```python +schema = client.custom_metadata.get_custom_metadata() +``` + +MCP: + +```text +get_custom_metadata() +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/custom_metadata" +``` + +Always fetch the schema first to understand what metadata fields are available in the current deployment. + +## Using Custom Metadata on Data Assets + +Custom metadata values are supplied inside the `custom_metadata` dict when creating or updating a data asset. Value types match the field definitions from the schema. + +Create example: + +```text +create_data_asset({ + name: "Sample Dataset", + mount: "sample_dataset", + tags: ["experiment", "genomics"], + source: {computation: {id: "comp-uuid"}}, + custom_metadata: {"species": "human", "sample_count": 42} +}) +``` + +Update example: + +```text +update_metadata(data_asset_id, { + custom_metadata: {"species": "mouse", "experiment_date": 1700000000} +}) +``` diff --git a/codeocean/references/data-assets.md b/codeocean/references/data-assets.md new file mode 100644 index 0000000..416bb51 --- /dev/null +++ b/codeocean/references/data-assets.md @@ -0,0 +1,124 @@ +# Data Assets Reference + +For the full field-level DataAsset data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/data-asset) or inspect the installed SDK source. This file covers workflow patterns and gotchas. 
+ +## Search + +MCP: + +```text +search_data_assets(search_params={query: "genomics", type: "dataset", limit: 10}) +``` + +SDK: + +```python +from codeocean.data_asset import DataAssetSearchParams, DataAssetType + +results = client.data_assets.search_data_assets( + DataAssetSearchParams(query="genomics", type=DataAssetType.Dataset, limit=10) +) +``` + +Query supports `field:value` syntax (e.g., `name:`, `tag:`). The `type`, `origin`, and `ownership` filters are structured params — do not embed them in the query string. + +## Files and URLs + +Get: + +- SDK: `client.data_assets.get_data_asset(data_asset_id)` +- REST: `GET /data_assets/{id}` +- MCP: `get_data_asset(data_asset_id)` + +List files: + +- SDK: `client.data_assets.list_data_asset_files(data_asset_id, path="")` +- REST: `POST /data_assets/{id}/files` with `{"path": ""}` +- MCP: `list_data_asset_files(data_asset_id, path="")` + +Get file URLs: + +- SDK: `client.data_assets.get_data_asset_file_urls(data_asset_id, path)` +- REST: `GET /data_assets/{id}/files/urls?path=...` +- MCP: `get_data_asset_file_urls(data_asset_id, file_path)` +- Return shape: `{download_url, view_url}` + +Read file content: + +- MCP: `download_and_read_a_file_from_data_asset(data_asset_id, file_path)` +- Reads and decodes the first `50_000` bytes only — use `get_data_asset_file_urls` when you need the complete file + +## Creating Data Assets + +The SDK requires `name`, `tags`, and `mount` at minimum. The `source` field specifies where data comes from (computation results, S3, GCS, cloud workstation). For exact field names, check the installed SDK version. 
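
The same minimum shows up in the REST body; a sketch of the smallest useful create payload (placeholder values, source shape taken from the curl examples in the top-level README):

```python
import json

# Minimal create payload: name, tags, and mount, plus a source.
# "comp-uuid" is a placeholder computation ID.
create_body = {
    "name": "My Results",
    "mount": "my_results",
    "tags": ["results"],
    "source": {"computation": {"id": "comp-uuid"}},
}

# POST this to $DOMAIN/api/v1/data_assets with Content-Type: application/json.
payload = json.dumps(create_body)
```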
+ +Example from computation results: + +```python +from codeocean.data_asset import DataAssetParams, Source, ComputationSource + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="My Results", + mount="my_results", + tags=["results"], + source=Source(computation=ComputationSource(id="comp-uuid")), + ) +) +``` + +Combined data asset example: + +```python +from codeocean.data_asset import DataAssetParams + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="Combined Dataset", + mount="combined_dataset", + tags=["combined"], + data_asset_ids=["id1", "id2"], + ) +) +``` + +## Wait Until Ready + +SDK: + +```python +ready_asset = client.data_assets.wait_until_ready(data_asset, polling_interval=5, timeout=300) +``` + +MCP: + +```text +wait_until_ready(data_asset_object, polling_interval=5, timeout=None) +``` + +**Gotcha**: both SDK and MCP require the full `DataAsset` object, not just the ID. Minimum polling interval is `5`. + +## Update Metadata + +- SDK: `client.data_assets.update_metadata(data_asset_id, update_params)` +- REST: `PUT /data_assets/{id}` +- MCP: `update_metadata(data_asset_id, update_params)` + +## Permissions, Archive, Delete + +Permissions: + +- SDK: `get_permissions`, `update_permissions` +- REST: `GET /data_assets/{id}/permissions`, `POST /data_assets/{id}/permissions` + +Archive/delete: + +- SDK: `archive_data_asset(data_asset_id, archive=True)`, `delete_data_asset(data_asset_id)` +- REST: `PATCH /data_assets/{id}/archive?archive=true`, `DELETE /data_assets/{id}` + +## Transfer + +Admin-only: + +- SDK: `client.data_assets.transfer_data_asset(data_asset_id, transfer_params)` +- REST: `POST /data_assets/{id}/transfer` diff --git a/codeocean/references/errors-http-and-sdk.md b/codeocean/references/errors-http-and-sdk.md new file mode 100644 index 0000000..4cb12e3 --- /dev/null +++ b/codeocean/references/errors-http-and-sdk.md @@ -0,0 +1,200 @@ +# Errors: HTTP and SDK + +Use this reference when the 
request itself failed: curl returned a non-2xx response, the SDK raised `codeocean.Error`, or the MCP tool call failed before returning a resource object. + +## 1. Source of Truth + +Official API errors documentation says Code Ocean uses conventional HTTP response codes: + +- `2xx`: success +- `4xx`: request failed given the information provided +- `5xx`: server-side failure + +The official error summaries are: + +- `200 OK`: request succeeded +- `204 No Content`: request succeeded with no body +- `400 Bad Request`: missing required parameter, misspelled field, or bad format +- `401 Unauthorized`: no valid access token provided +- `403 Forbidden`: token lacks permission for the request +- `404 Not Found`: requested resource does not exist +- `429 Too Many Requests`: Computation API may be overloaded; back off before retrying +- `500`, `502`, `503`, `504`: Code Ocean server issue + +Sources: + +- +- + +## 2. What the Python SDK Raises + +The SDK wraps HTTP failures in `codeocean.Error`. Key attributes include `status_code`, `message`, and `data` (parsed JSON body). Verify available attributes against the installed SDK version. + +Meaning for agents: + +- Prefer `e.message` as the user-facing explanation +- Inspect `e.data` for structured API details +- Use `e.status_code` to choose the next action + +Example: + +```python +from codeocean import CodeOcean, Error + +try: + client = CodeOcean(domain="https://codeocean.acme.com", token="cop_xxxxx") + capsule = client.capsules.get_capsule("bad-id") +except Error as e: + print(e.status_code) + print(e.message) + print(e.data) +``` + +## 3. 
What Each Error Usually Means + +### `400 Bad Request` + +What it means from the docs: + +- the request shape is invalid +- a required field is missing +- a field name is wrong +- a value is badly formatted + +How to interpret it in practice: + +- wrong JSON shape +- wrong enum value +- using query fields where structured params are required +- sending IDs where an object list is required, or vice versa +- omitting SDK-required fields such as `name`, `tags`, or `mount` in local `DataAssetParams` + +Typical fix: + +- re-check the exact method/tool signature +- compare the body to the model fields in the SDK +- compare attach vs detach payload shapes + +### `401 Unauthorized` + +What it means from the docs: + +- no valid access token provided + +How an agent should explain it: + +- the token is missing, invalid, malformed, or revoked +- in curl, Basic Auth may be wrong +- in MCP/server config, `CODEOCEAN_TOKEN` may not be set correctly + +Typical fix: + +- verify the token exists +- verify Basic Auth uses the token as username and no password +- regenerate the token if it may have been lost or revoked + +### `403 Forbidden` + +What it means from the docs: + +- the token does not have permission to perform the request + +How an agent should explain it: + +- the token scope is too narrow +- the user can authenticate, but lacks access to this resource +- the user may have read access where write access is required + +Typical fix: + +- check token scopes +- check resource permissions +- for data assets, ensure Datasets scope is present +- for capsules/computations, ensure Capsule scope is present + +### `404 Not Found` + +What it means from the docs: + +- the requested resource does not exist + +How an agent should explain it: + +- the ID may be wrong +- the path may be wrong +- the resource may exist but not be accessible to this token + +Typical fix: + +- verify the ID came from a fresh search/get response +- verify file paths for `.../results/urls` or `.../files/urls` 
+- confirm the token can access the resource + +### `429 Too Many Requests` + +What it means from the docs: + +- the Computation API may be overloaded + +How an agent should explain it: + +- the issue is load or rate pressure, not bad input + +Typical fix: + +- retry with backoff +- reduce polling frequency +- avoid tight retry loops + +### `500`, `502`, `503`, `504` + +What it means from the docs: + +- Code Ocean servers have an issue + +How an agent should explain it: + +- the request may be correct, but the platform failed to process it + +Typical fix: + +- retry later +- retry with backoff if the operation is idempotent or safe to repeat +- avoid rewriting the payload unless there is separate evidence the request shape is wrong + +## 4. MCP-Specific Error Meaning + +MCP tool failures can come from three places: + +1. the underlying API returned an HTTP error +2. the SDK raised `codeocean.Error` +3. the MCP helper itself failed locally + +Special case from the local MCP server: + +- `download_and_read_a_file_from_computation(...)` +- `download_and_read_a_file_from_data_asset(...)` + +These helpers call a local downloader that catches `requests` exceptions and returns a string starting with `Download error:` instead of raising. + +Meaning: + +- if the tool returns text beginning with `Download error:`, treat it as a transport/download failure, not as successful file content + +## 5. Agent Decision Rules + +When you see an HTTP or SDK error, explain both: + +- what the status code means according to the docs +- what it most likely means in the current request + +Recommended mapping: + +- `400`: request shape problem, field/value mismatch, or missing required input +- `401`: auth/token problem +- `403`: scope/permission problem +- `404`: wrong ID/path or inaccessible resource +- `429`: retry later with backoff +- `5xx`: server/platform problem; retry later + +Do not confuse these with resource-state failures. 
If the API call succeeded and returned a computation or data asset object, use [errors-resource-states.md](errors-resource-states.md) instead. diff --git a/codeocean/references/errors-resource-states.md b/codeocean/references/errors-resource-states.md new file mode 100644 index 0000000..cd856cb --- /dev/null +++ b/codeocean/references/errors-resource-states.md @@ -0,0 +1,180 @@ +# Errors: Resource States + +Use this reference when the API request itself succeeded, but the returned resource is in a failed or unusable state. + +This is different from HTTP/API errors: + +- request failure: use [errors-http-and-sdk.md](errors-http-and-sdk.md) +- returned object in bad state: use this file + +## 1. Computation Failures + +Key fields to inspect: `state`, `end_status`, `exit_code`, `has_results`. For exact enum values, check the installed SDK version or the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/computation). + +### How to interpret them + +#### `state=failed` + +Meaning: + +- the run did not complete successfully +- this is already a terminal failure at the computation lifecycle level + +Agent explanation: + +- the API call worked, but Code Ocean reports that the run itself failed + +#### `state=completed` and `end_status=succeeded` + +Meaning: + +- normal successful completion + +#### `state=completed` and `end_status=failed` + +Meaning: + +- the run reached a terminal state, but the job outcome was failure + +Agent explanation: + +- the scheduler/run lifecycle finished, but the actual computation outcome is failure + +#### `state=completed` and `end_status=stopped` + +Meaning: + +- the run did not fail by computation error; it was stopped or terminated + +Agent explanation: + +- the computation ended early because it was stopped, deleted while running, or otherwise interrupted + +#### Non-zero `exit_code` + +Meaning: + +- the executed process ended unsuccessfully + +Agent explanation: + +- the run reached the end of execution, but the 
underlying process returned a failure code + +### What the agent should do + +If a computation is not successful: + +1. report `state`, `end_status`, and `exit_code` if present +2. distinguish between platform request failure and run failure +3. do not say “the API failed” when the object was returned successfully +4. check `has_results` before assuming outputs exist + +Suggested wording: + +- “The request succeeded, but the computation ended with `end_status=failed`.” +- “The computation object exists, but the run was `stopped`, so results may be incomplete or absent.” + +## 2. Data Asset Failures + +Key fields to inspect: `state`, `failure_reason`, `files`, `size`. For exact enum values, check the installed SDK version or the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/data-asset). + +### How to interpret them + +#### `state=draft` + +Meaning: + +- creation/indexing is still in progress + +Agent explanation: + +- the data asset exists, but it is not ready for normal file operations yet + +#### `state=ready` + +Meaning: + +- the asset is ready to use + +#### `state=failed` + +Meaning: + +- Code Ocean accepted the creation request, but asset creation failed later + +Agent explanation: + +- this is not a bad API request anymore; it is a failed asynchronous creation outcome +- `failure_reason` is the first field to inspect + +### `failure_reason` + +Meaning: + +- the platform’s explanation for why data asset creation failed + +Agent behavior: + +- if present, surface it directly as the primary explanation +- do not replace it with a generic guess unless it is empty + +Suggested wording: + +- “The create request was accepted, but the data asset later entered `failed` state. `failure_reason` says: ...” + +## 3. 
Polling Semantics + +The SDK polling helpers reflect resource-state outcomes: + +- `client.computations.wait_until_completed(computation, ...)` +- `client.data_assets.wait_until_ready(data_asset, ...)` + +Important nuance: + +- these methods can return a terminal object that is unsuccessful +- terminal does not always mean successful + +Meaning for agents: + +- after polling, always inspect the returned object +- never assume “wait finished” means “the run/asset succeeded” + +## 4. File Access Failures Caused by State + +State-related file problems usually mean: + +- computation has no usable results yet +- data asset is still `draft` +- data asset ended in `failed` +- requested file path does not exist within the returned result/data asset tree + +Agent rule: + +- first verify the resource state +- then verify the path using `list_computation_results(...)` or `list_data_asset_files(...)` + +## 5. Practical Interpretation Guide + +When a resource object is returned, explain the failure in this order: + +1. Was the request itself successful? +2. What terminal or current state is the resource in? +3. Is there a structured reason field such as `failure_reason`? +4. Do results/files actually exist? + +Recommended summaries: + +- `Computation.state=failed`: “The run failed.” +- `Computation.end_status=failed`: “The run finished, but the outcome was failure.” +- `Computation.end_status=stopped`: “The run was stopped before successful completion.” +- `DataAsset.state=draft`: “The asset exists but is still being created/indexed.” +- `DataAsset.state=failed`: “The asset creation request was accepted, but the asset later failed.” + +## 6. 
What Not to Say + +Avoid these incorrect explanations: + +- “The API failed” when a computation/data asset object was returned normally +- “The data asset does not exist” when it is actually in `draft` +- “The file API is broken” before checking whether the resource is ready +- “The token is invalid” for computation/data-asset terminal failures without an HTTP `401` diff --git a/codeocean/references/mcp-guide.md b/codeocean/references/mcp-guide.md new file mode 100644 index 0000000..5249073 --- /dev/null +++ b/codeocean/references/mcp-guide.md @@ -0,0 +1,85 @@ +# Code Ocean MCP Guide + +## Overview + +The MCP server exposes its tool schemas at connection time — use the live schemas as the source of truth for exact tool names and parameters. The tool count and available operations may vary by server version. See [mcp-tools-catalog.md](mcp-tools-catalog.md) for tool groupings and behavioral notes. + +Environment variables: + +- `CODEOCEAN_DOMAIN` +- `CODEOCEAN_TOKEN` +- `AGENT_ID` (optional, defaults to `"AI Agent"`) + +## Core Patterns + +### Find and run a capsule + +1. `search_capsules(search_params={...})` +2. `get_capsule(capsule_id)` if needed +3. `get_capsule_app_panel(capsule_id)` +4. `run_capsule(run_params={...})` +5. `wait_until_completed(computation_id)` +6. `list_computation_results(computation_id)` +7. `get_result_file_urls(...)` or `download_and_read_a_file_from_computation(...)` + +### Create a data asset and wait for readiness + +1. `create_data_asset(data_asset_params={name, tags, mount, ...})` +2. `wait_until_ready(data_asset_object, polling_interval=5, timeout=None)` + +`wait_until_ready` is an MCP wrapper around the SDK method that first reconstructs a full `DataAsset` object, then calls `client.data_assets.wait_until_ready(...)`. + +### Wait for a computation + +`wait_until_completed(computation_id)` is also a wrapper. 
It first calls `get_computation(computation_id)`, then passes the returned object into `client.computations.wait_until_completed(...)`. + +### Run a pipeline + +What is supported directly in MCP today: + +1. `search_pipelines(search_params={...})` +2. `run_capsule(run_params={pipeline_id: ..., processes: [...]})` +3. `wait_until_completed(...)` + +Important caveat: + +- The MCP server does **not** expose separate `get_pipeline`, `get_pipeline_app_panel`, `attach_pipeline_data_assets`, or `detach_pipeline_data_assets` tools. +- The MCP tools named `get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, and `detach_data_assets` are implemented via `client.capsules.*`, not `client.pipelines.*`. +- Use SDK or REST when you need guaranteed `/pipelines/...` routing for pipeline inspection or management. + +## Pagination and Compact Search + +MCP search responses have this envelope: + +```json +{ + "items": [...], + "has_more": true, + "next_token": "abc123", + "item_count": 20, + "field_names": {"n": "name", "s": "slug", "d": "description", "t": "tags"} +} +``` + +Notes: + +- `field_names` is present only when `include_field_names=true` +- capsule/pipeline items use `id`, `n`, `s`, `d`, `t` +- data asset items use `id`, `n`, `d`, `t` +- descriptions are truncated to 200 chars +- tags are limited to 10 entries + +Pagination usage: + +1. Call `search_*` with `search_params={query: "...", limit: N}` +2. Read `has_more` and `next_token` +3. 
Call again with the same `search_params` plus `next_token` + +## Anti-patterns + +- Calling `run_capsule` and then immediately trying to read results without `wait_until_completed` +- Passing a data asset ID string into `wait_until_ready`; it requires the full `DataAsset` object +- Passing plain ID strings into `attach_data_assets` or `attach_computation_data_assets`; attach expects objects +- Treating `type`, `origin`, or `ownership` as data-asset query fields; they are structured search params +- Assuming MCP has full pipeline management parity because it has `search_pipelines`; it does not +- Assuming `download_and_read_*` returns whole files; it reads only the first `50_000` bytes diff --git a/codeocean/references/mcp-server-install.md b/codeocean/references/mcp-server-install.md new file mode 100644 index 0000000..0fd8d0d --- /dev/null +++ b/codeocean/references/mcp-server-install.md @@ -0,0 +1,92 @@ +# MCP Server Installation + +Package name from local `pyproject.toml`: `codeocean-mcp-server` + +Python requirement from local `pyproject.toml`: `>=3.10` + +Recommended launcher from the local README: `uvx codeocean-mcp-server` + +## Basic Config Shape + +All client configs pass these env vars: + +- `CODEOCEAN_DOMAIN` +- `CODEOCEAN_TOKEN` +- optional `AGENT_ID` + +## Claude Desktop + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +## Visual Studio Code + +```json +{ + "mcp": { + "inputs": [ + { + "type": "promptString", + "id": "codeocean-token", + "description": "Code Ocean API Key", + "password": true + } + ], + "servers": { + "codeocean": { + "type": "stdio", + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "${input:codeocean-token}", + "AGENT_ID": "VS Code" + } + } + } + } +} 
+``` + +## Cline / Roo Code / Cursor / Windsurf + +The local README uses the same executable and env vars for all of them: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Client Name" + } + } + } +} +``` + +## Local Testing + +The local README’s inspector command is: + +```bash +npx @modelcontextprotocol/inspector uv tool run codeocean-mcp-server +``` + +Set `CODEOCEAN_DOMAIN` and `CODEOCEAN_TOKEN` in the environment before running it. diff --git a/codeocean/references/mcp-tools-catalog.md b/codeocean/references/mcp-tools-catalog.md new file mode 100644 index 0000000..a405977 --- /dev/null +++ b/codeocean/references/mcp-tools-catalog.md @@ -0,0 +1,49 @@ +# MCP Tools Catalog + +The MCP server exposes its tool schemas at connection time. **Use the live schemas as the source of truth for exact parameter names and types.** This file documents tool groupings, behavioral notes, and anti-patterns that are not visible from the schemas alone. + +## Tool Groups + +### Capsule Search and Management + +`search_capsules`, `get_capsule`, `list_computations`, `attach_data_assets`, `detach_data_assets`, `get_capsule_app_panel` + +### Pipeline Search + +`search_pipelines` — uses the same search params shape as `search_capsules`. 
+ +### Computations + +`get_computation`, `run_capsule`, `wait_until_completed`, `list_computation_results`, `get_result_file_urls`, `download_and_read_a_file_from_computation`, `rename_computation`, `delete_computation`, `attach_computation_data_assets`, `detach_computation_data_assets` + +### Data Assets + +`search_data_assets`, `get_data_asset`, `get_data_asset_file_urls`, `download_and_read_a_file_from_data_asset`, `list_data_asset_files`, `update_metadata`, `wait_until_ready`, `create_data_asset` + +### Custom Metadata + +`get_custom_metadata` + +## Behavioral Notes + +- `wait_until_completed` takes a `computation_id`, internally fetches the object, then polls via the SDK. +- `wait_until_ready` takes the full `DataAsset` object (not just an ID). +- `download_and_read_*` helpers read the first `50_000` bytes only. +- Attach tools expect objects like `{id, mount?}`. Detach tools expect plain ID strings. + +## Pipeline Caveat + +The MCP server has only one pipeline-specific tool: `search_pipelines`. + +- There is no separate MCP `get_pipeline` or `get_pipeline_app_panel`. +- The capsule-named tools (`get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, `detach_data_assets`) route through `client.capsules.*`, not `client.pipelines.*`. +- Use SDK or REST when you need guaranteed `/pipelines/...` routing. 
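
For the REST fallback, the request can be prepared with the standard library alone. This sketch builds (but does not send) a GET against the `/pipelines/{id}` route, using the token-as-username Basic Auth scheme from the curl examples; the domain, token, and pipeline ID are placeholders:

```python
import base64
import urllib.request

# Placeholders: substitute your deployment domain, API token, and pipeline ID.
domain = "https://codeocean.acme.com"
token = "cop_xxxxx"
pipeline_id = "pipeline-uuid"

# Basic Auth with the token as username and an empty password,
# mirroring `curl -u "$TOKEN:"`.
credentials = base64.b64encode(f"{token}:".encode()).decode()

request = urllib.request.Request(
    f"{domain}/api/v1/pipelines/{pipeline_id}",
    headers={"Authorization": f"Basic {credentials}"},
)
# urllib.request.urlopen(request) would perform the call.
```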
+ +## Anti-Patterns + +- Calling `run_capsule` then immediately reading results without `wait_until_completed` +- Passing a data asset ID string into `wait_until_ready` — it requires the full object +- Passing plain ID strings into attach tools — attach expects objects +- Treating `type`, `origin`, or `ownership` as data-asset query fields — they are structured search params +- Assuming MCP has full pipeline management parity because it has `search_pipelines` +- Assuming `download_and_read_*` returns whole files — it reads only the first `50_000` bytes diff --git a/codeocean/references/permissions.md b/codeocean/references/permissions.md new file mode 100644 index 0000000..d677b89 --- /dev/null +++ b/codeocean/references/permissions.md @@ -0,0 +1,64 @@ +# Permissions + +Permissions are available in the SDK and REST API, not in the MCP server. For exact field names and role values, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api) or inspect the installed SDK source. 
+ +## SDK Methods + +Capsules: + +- `client.capsules.get_permissions(capsule_id)` +- `client.capsules.update_permissions(capsule_id, permissions)` + +Pipelines: + +- `client.pipelines.get_permissions(pipeline_id)` +- `client.pipelines.update_permissions(pipeline_id, permissions)` + +Data assets: + +- `client.data_assets.get_permissions(data_asset_id)` +- `client.data_assets.update_permissions(data_asset_id, permissions)` + +## REST Routes + +- `GET /capsules/{id}/permissions` +- `POST /capsules/{id}/permissions` +- `GET /pipelines/{id}/permissions` +- `POST /pipelines/{id}/permissions` +- `GET /data_assets/{id}/permissions` +- `POST /data_assets/{id}/permissions` + +## Example + +```python +from codeocean.models.components import ( + Permissions, + UserPermissions, + GroupPermissions, + UserRole, + GroupRole, + EveryoneRole, +) + +permissions = Permissions( + users=[UserPermissions(email="user@example.com", role=UserRole.Editor)], + groups=[GroupPermissions(group="research-team", role=GroupRole.Viewer)], + everyone=EveryoneRole.Discoverable, + share_assets=True, +) + +client.capsules.update_permissions(capsule_id, permissions) +``` + +Equivalent JSON: + +```json +{ + "users": [{ "email": "user@example.com", "role": "editor" }], + "groups": [{ "group": "research-team", "role": "viewer" }], + "everyone": "discoverable", + "share_assets": true +} +``` + +**Note**: import paths and available role enums may vary by SDK version. Verify against the installed version. diff --git a/codeocean/references/pipelines.md b/codeocean/references/pipelines.md new file mode 100644 index 0000000..192fe63 --- /dev/null +++ b/codeocean/references/pipelines.md @@ -0,0 +1,75 @@ +# Pipelines Reference + +For the full pipeline data model and method list, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api) or inspect the installed SDK source. This file covers pipeline-specific workflow patterns and the MCP caveat. 
+ +## Overview + +The SDK has a dedicated `Pipelines` client that mirrors most capsule operations but routes through `/pipelines/...`. + +## SDK and REST Operations + +| Operation | SDK | REST | +| ------------------------- | --------------------------------------------- | --------------------------------------------- | +| Search | `client.pipelines.search_pipelines(...)` | `POST /pipelines/search` | +| Get | `client.pipelines.get_pipeline(id)` | `GET /pipelines/{id}` | +| App panel | `client.pipelines.get_pipeline_app_panel(id)` | `GET /pipelines/{id}/app_panel` | +| List computations | `client.pipelines.list_computations(id)` | `GET /pipelines/{id}/computations` | +| Permissions get/update | `get_permissions` / `update_permissions` | `GET` / `POST /pipelines/{id}/permissions` | +| Attach/detach data assets | `attach_data_assets` / `detach_data_assets` | `POST` / `DELETE /pipelines/{id}/data_assets` | +| Archive | `archive_pipeline` | `PATCH /pipelines/{id}/archive?archive=true` | +| Delete | `delete_pipeline` | `DELETE /pipelines/{id}` | + +## MCP Caveat + +The MCP server does **not** expose separate pipeline getter/app-panel/attach/detach tools. + +What MCP supports directly: + +- `search_pipelines(...)` +- `run_capsule(run_params={pipeline_id: ...})` + +The MCP tools named `get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, and `detach_data_assets` are implemented against `client.capsules.*`, not `client.pipelines.*`. Do not document them as guaranteed pipeline tools. + +## Running Pipelines + +Pipelines run through `client.computations.run_capsule(...)` with `pipeline_id` instead of `capsule_id`. Per-step configuration uses `processes`. 
SDK example:

```python
from codeocean.computation import RunParams, DataAssetsRunParam, PipelineProcessParams, NamedRunParam

computation = client.computations.run_capsule(
    RunParams(
        pipeline_id="pipeline-uuid",
        data_assets=[DataAssetsRunParam(id="data-uuid", mount="Reference")],
        processes=[
            PipelineProcessParams(name="process1", parameters=["val1"]),
            PipelineProcessParams(
                name="process2",
                named_parameters=[NamedRunParam(param_name="threshold", value="0.5")],
            ),
        ],
    )
)
```

REST example (using the `CODEOCEAN_DOMAIN`/`CODEOCEAN_TOKEN` environment variables from [setup-and-auth.md](setup-and-auth.md)):

```bash
curl -u "$CODEOCEAN_TOKEN:" \
  -H "Content-Type: application/json" \
  -d '{
    "pipeline_id":"pipeline-uuid",
    "data_assets":[{"id":"data-uuid","mount":"Reference"}],
    "processes":[
      {"name":"process1","parameters":["val1"]},
      {"name":"process2","named_parameters":[{"param_name":"threshold","value":"0.5"}]}
    ]
  }' \
  "$CODEOCEAN_DOMAIN/api/v1/computations"
```

## App Panel Process Guidance

Pipeline app panels can include `processes`, represented by `AppPanelProcess`, to describe per-process categories and parameters. Use the SDK or REST app-panel routes for reliable pipeline inspection.
diff --git a/codeocean/references/sdk-guide.md b/codeocean/references/sdk-guide.md
new file mode 100644
index 0000000..5f67488
--- /dev/null
+++ b/codeocean/references/sdk-guide.md
@@ -0,0 +1,166 @@
# Code Ocean Python SDK Guide

For exact model fields, enum values, and method signatures, inspect the installed SDK version (`pip show codeocean`) or read the SDK source files directly. This guide covers setup patterns, common imports, and workflow examples.

## Installation

```bash
pip install -U codeocean
```

Verify the installed version and check compatibility with your Code Ocean deployment before relying on specific features.
+ +## Client Setup + +```python +from codeocean import CodeOcean + +client = CodeOcean( + domain="https://codeocean.acme.com", + token="YOUR_API_TOKEN", + retries=0, + agent_id="my-agent", # optional +) +``` + +The client exposes sub-clients: `client.capsules`, `client.pipelines`, `client.computations`, `client.data_assets`, `client.custom_metadata`. + +The SDK does not auto-read env vars; pass them explicitly. + +For SDK exception handling and error interpretation, also load [errors-http-and-sdk.md](errors-http-and-sdk.md). + +## Common Imports + +```python +from codeocean import CodeOcean +from codeocean.capsule import CapsuleSearchParams +from codeocean.computation import RunParams, DataAssetsRunParam, NamedRunParam, PipelineProcessParams +from codeocean.data_asset import ( + DataAssetParams, DataAssetSearchParams, DataAssetUpdateParams, + Source, ComputationSource, AWSS3Source, GCPCloudStorageSource, CloudWorkstationSource, + Target, AWSS3Target, +) +from codeocean.models.components import ( + Permissions, UserPermissions, GroupPermissions, UserRole, GroupRole, EveryoneRole, +) +from codeocean import Error +``` + +**Note**: import paths and available classes may vary by SDK version. Verify against the installed version if an import fails. + +## Pagination + +Search result objects expose `.results`, `.has_more`, and `.next_token`. + +Iterator helpers are available for all search methods (e.g., `search_capsules_iterator`, `search_data_assets_iterator`). 
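The iterator helpers encapsulate the `next_token` loop. As a stand-alone sketch of that pattern (the `search` callable and page shape here are stand-ins, not the SDK's actual implementation):

```python
from typing import Any, Callable, Iterator

def paginate(search: Callable[[Any], Any], make_params: Callable[[Any], Any]) -> Iterator[Any]:
    """Yield items across pages, following next_token until has_more is False."""
    token = None
    while True:
        page = search(make_params(token))  # page exposes .results, .has_more, .next_token
        yield from page.results
        if not page.has_more:
            return
        token = page.next_token
```

The SDK's `search_*_iterator` helpers should behave equivalently; verify their exact signatures against the installed version.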
Manual pagination:

```python
from codeocean.capsule import CapsuleSearchParams

params = CapsuleSearchParams(query="tag:genomics", limit=20)
results = client.capsules.search_capsules(params)

for capsule in results.results:
    print(capsule.id, capsule.name)

while results.has_more:
    params = CapsuleSearchParams(query="tag:genomics", limit=20, next_token=results.next_token)
    results = client.capsules.search_capsules(params)
    for capsule in results.results:
        print(capsule.id, capsule.name)
```

## Polling

The polling methods `wait_until_completed` and `wait_until_ready` take the full model object returned by the triggering call, not an ID. Both enforce a minimum `polling_interval` of `5` seconds.

```python
completed = client.computations.wait_until_completed(computation, polling_interval=5, timeout=300)
ready_asset = client.data_assets.wait_until_ready(data_asset, polling_interval=5, timeout=300)
```

**Gotcha**: a terminal object is not always successful. After polling, inspect `state`, `end_status`, and `exit_code` (computations) or `state` and `failure_reason` (data assets). See [errors-resource-states.md](errors-resource-states.md).
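A small guard makes that post-polling check explicit. The field names (`state`, `end_status`, `exit_code`) follow this guide's computation model but should be verified against the installed SDK:

```python
def ensure_succeeded(computation) -> None:
    """Raise if a terminal computation object did not actually succeed."""
    state = getattr(computation, "state", None)
    end_status = getattr(computation, "end_status", None)
    if state != "completed" or end_status != "succeeded":
        raise RuntimeError(
            f"computation ended with state={state!r}, end_status={end_status!r}, "
            f"exit_code={getattr(computation, 'exit_code', None)!r}"
        )
```

Call it on the object returned by `wait_until_completed` before capturing results as a data asset.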
+ +## Example: Run Capsule, Wait, Create Data Asset + +```python +import os + +from codeocean import CodeOcean +from codeocean.computation import RunParams, DataAssetsRunParam +from codeocean.data_asset import DataAssetParams, Source, ComputationSource + +client = CodeOcean( + domain=os.environ["CODEOCEAN_DOMAIN"], + token=os.environ["CODEOCEAN_TOKEN"], +) + +computation = client.computations.run_capsule( + RunParams( + capsule_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890", + parameters=["hg38", "8"], + data_assets=[DataAssetsRunParam(id="d1e2f3a4-b5c6-7890-abcd-ef1234567890", mount="input_data")], + ) +) + +completed = client.computations.wait_until_completed(computation, timeout=3600) + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="RNA-seq Analysis Results", + mount="rna_seq_results", + tags=["rna-seq", "results", "automated"], + description=f"Output of computation {completed.id}", + source=Source(computation=ComputationSource(id=completed.id)), + ) +) + +ready_asset = client.data_assets.wait_until_ready(data_asset, timeout=600) +print(ready_asset.id, ready_asset.state) +``` + +## Example: Create Data Asset from S3 + +```python +from codeocean.data_asset import DataAssetParams, Source, AWSS3Source + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="Reference Genome hg38", + mount="reference_genome", + tags=["reference", "genome", "hg38"], + description="Human reference genome GRCh38 from S3", + source=Source( + aws=AWSS3Source(bucket="my-genomics-bucket", prefix="reference/hg38/"), + ), + ) +) +``` + +## Example: Run Pipeline + +```python +from codeocean.computation import RunParams, DataAssetsRunParam, PipelineProcessParams, NamedRunParam + +computation = client.computations.run_capsule( + RunParams( + pipeline_id="p1q2r3s4-t5u6-7890-abcd-ef1234567890", + data_assets=[DataAssetsRunParam(id="d1e2f3a4-b5c6-7890-abcd-ef1234567890", mount="raw_data")], + processes=[ + PipelineProcessParams(name="alignment", 
parameters=["STAR", "16"]), + PipelineProcessParams( + name="quantification", + named_parameters=[NamedRunParam(param_name="method", value="salmon")], + ), + ], + ) +) + +completed = client.computations.wait_until_completed(computation, timeout=7200) +print(completed.id, completed.state, completed.end_status) +``` diff --git a/codeocean/references/search-and-pagination.md b/codeocean/references/search-and-pagination.md new file mode 100644 index 0000000..fa6501f --- /dev/null +++ b/codeocean/references/search-and-pagination.md @@ -0,0 +1,65 @@ +# Search and Pagination + +## Query Syntax + +Free text matches weighted fields. `field:value` filters are also supported. + +Rules: + +- same field repeated = OR +- different fields = AND +- quotes for exact phrases +- no explicit `OR` +- no wildcards +- case insensitive + +**Gotcha**: `type`, `origin`, and `ownership` are structured params, not query fields. Do not embed them in the `query` string. + +For the list of supported query fields per resource type, check the live MCP tool schemas or the SDK's search param classes. + +## Structured Search Params + +Both capsule and data asset search accept structured filters like `sort_field`, `sort_order`, `ownership`, `favorite`, `archived`, and `filters`. Data assets additionally support `type` and `origin`. + +Default limit is `100`, max is `1000`. + +## SearchFilter + +The `filters` param accepts structured filter objects with `key`, `value`/`values`, `range` (min/max), and `exclude`. Use these for precise filtering beyond free-text queries. + +## SDK Pagination + +Search result objects expose `.results`, `.has_more`, and `.next_token`. + +Iterator helpers: `search_capsules_iterator`, `search_pipelines_iterator`, `search_data_assets_iterator`. 
+ +Manual example: + +```python +from codeocean.capsule import CapsuleSearchParams + +params = CapsuleSearchParams(query="RNA-seq", limit=100) +results = client.capsules.search_capsules(params) + +for capsule in results.results: + process(capsule) + +while results.has_more: + params = CapsuleSearchParams(query="RNA-seq", limit=100, next_token=results.next_token) + results = client.capsules.search_capsules(params) + for capsule in results.results: + process(capsule) +``` + +## MCP Compact Search + +MCP search responses use abbreviated field names: + +- `n` = name, `s` = slug, `d` = description, `t` = tags + +Pass `include_field_names=true` to get the full mapping. + +Truncation behavior: + +- descriptions truncated to `200` chars +- tags limited to `10` entries diff --git a/codeocean/references/setup-and-auth.md b/codeocean/references/setup-and-auth.md new file mode 100644 index 0000000..806c41e --- /dev/null +++ b/codeocean/references/setup-and-auth.md @@ -0,0 +1,80 @@ +# Setup and Authentication + +## Generate an API Token + +From the public authentication guide: + +1. Sign in to Code Ocean +2. Open `Account` +3. Open `Access Tokens` +4. Click `Generate New Token` +5. Provide a token name +6. Select scopes +7. Click `Add Token` +8. Copy the token immediately, or use `Copy Token & Create Secret` +9. Click `Save Changes` + +The public docs explicitly note that the token is shown only once at creation time. 
+ +## Environment Variables + +SDK and MCP commonly use: + +```bash +export CODEOCEAN_DOMAIN="https://codeocean.acme.com" +export CODEOCEAN_TOKEN="cop_xxxxx" +export AGENT_ID="my-agent" +``` + +Source of truth: + +- `CODEOCEAN_DOMAIN` and `CODEOCEAN_TOKEN` are required by MCP `server.py` +- `AGENT_ID` is optional in MCP `server.py`, defaulting to `"AI Agent"` +- `CodeOcean(...)` accepts `agent_id` and sends it as an `Agent-Id` header when provided + +## Authentication by Access Method + +REST: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +The public auth guide states that Basic Auth uses the token as the username and no password. + +SDK: + +```python +from codeocean import CodeOcean + +client = CodeOcean( + domain="https://codeocean.acme.com", + token="cop_xxxxx", + agent_id="my-agent", # optional +) +``` + +The SDK does not auto-read env vars; pass them explicitly if you store them in the environment. + +MCP: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "cop_xxxxx", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +## Compatibility + +Public docs, the installed SDK, and the MCP server may target different Code Ocean versions. When providing compatibility guidance, check `pip show codeocean` for the installed SDK version and the MCP server's live tool schemas for the actual available tools. Mention any mismatch explicitly rather than assuming one source is authoritative. diff --git a/codeocean/references/user-guide/capsule.md b/codeocean/references/user-guide/capsule.md new file mode 100644 index 0000000..8652779 --- /dev/null +++ b/codeocean/references/user-guide/capsule.md @@ -0,0 +1,12 @@ +# Capsule + +A Capsule is Code Ocean's fundamental project unit. 
It bundles the code, data, environment (OS, packages, libraries, dependencies), and results needed to run and share a research workflow reproducibly. + +Every Capsule has seven permanent folders: Metadata, Environment, Code, Data, .codeocean, Scratch, and Results. The Capsule IDE is divided into three panels: File Navigation/App Builder, Editor, and Reproducibility. + +Use this mental model: a Capsule is the place where code is developed and where a reproducible run or cloud workstation session happens. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/cloud-workstation.md b/codeocean/references/user-guide/cloud-workstation.md new file mode 100644 index 0000000..f962b92 --- /dev/null +++ b/codeocean/references/user-guide/cloud-workstation.md @@ -0,0 +1,12 @@ +# Cloud Workstation + +A Cloud Workstation is an interactive IDE session launched inside a Capsule's compute environment. The system starts an EC2 machine and launches a Docker container using the Capsule's environment. Supported IDEs include JupyterLab, RStudio, and VS Code; the appropriate IDE package is installed automatically at launch if not already present. + +Key folders available in a Cloud Workstation: `/code`, `/data`, `/results`, `/metadata`, `/environment`, `/scratch`, and full root filesystem access. Reproducible Runs also get `/code`, `/data`, `/results`, and `/scratch`, but not `/metadata`, `/environment`, or root filesystem access. + +Use this mental model: a Cloud Workstation is interactive development mode for a Capsule, distinct from a headless reproducible run. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/computation.md b/codeocean/references/user-guide/computation.md new file mode 100644 index 0000000..c8fb7f3 --- /dev/null +++ b/codeocean/references/user-guide/computation.md @@ -0,0 +1,12 @@ +# Computation + +A Computation is a single run record for a Capsule or Pipeline. 
It is the object returned when you trigger a run and later used to track status, inspect results, and fetch output files. The `state` field progresses through `initializing`, `running`, `finalizing`, and `completed`. A separate `end_status` field indicates the outcome: `succeeded`, `failed`, or `stopped`. + +Each Reproducible Run and Cloud Workstation session creates a Computation. Results from completed computations can be captured as Data Assets. + +Use this mental model: a Computation is the execution instance, not the project itself. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/data-asset.md b/codeocean/references/user-guide/data-asset.md new file mode 100644 index 0000000..912d561 --- /dev/null +++ b/codeocean/references/user-guide/data-asset.md @@ -0,0 +1,12 @@ +# Data Asset + +A Data Asset is shared, versioned storage that can be attached to Capsules or Pipelines. Data Assets are mounted read-only into compute containers rather than copied, which improves sharing and performance. Types include datasets, results (captured from a computation), combined assets, and models. + +Data Assets are backed by independent cloud storage (AWS S3 or EFS with intelligent tiering). They can be created from uploaded files, external cloud storage (S3, GCS), or by capturing computation results. The `type` field is one of: `dataset`, `result`, `combined`, or `model`. They support custom metadata for organization and discovery. + +Use this mental model: a Data Asset is durable input/output storage, not executable code. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/pipeline.md b/codeocean/references/user-guide/pipeline.md new file mode 100644 index 0000000..e2776f1 --- /dev/null +++ b/codeocean/references/user-guide/pipeline.md @@ -0,0 +1,12 @@ +# Pipeline + +A Pipeline is a multi-step workflow that connects Capsules and Data Assets into reusable stages. 
It enables separating workflow stages, automating downstream steps, setting compute resources per step, and parallelizing work. Pipelines are backed by Nextflow scripts that Code Ocean generates and manages. + +Pipeline components include capsule steps (each referencing a Capsule), data asset connections between steps, and per-step parameter and resource configuration. Connection types between steps are: Default (items distributed to parallel instances), Collect (entire dataset available to all instances), and Flatten (each item goes to a separate parallel instance). + +Use this mental model: a Pipeline orchestrates multiple Capsules; it is not where code is primarily developed. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/reproducible-run.md b/codeocean/references/user-guide/reproducible-run.md new file mode 100644 index 0000000..824142a --- /dev/null +++ b/codeocean/references/user-guide/reproducible-run.md @@ -0,0 +1,12 @@ +# Reproducible Run + +A Reproducible Run is Code Ocean's headless execution mode for a Capsule. It executes the Capsule's `run` file end to end without manual input, producing results consistently. The run executes inside a Docker container with access to `/code`, `/data`, `/results`, and `/scratch` folders. Unlike Cloud Workstations, Reproducible Runs do not expose `/metadata`, `/environment`, or root filesystem access. + +Each Reproducible Run creates a Computation and a results timeline entry. Results written to `/results` are preserved and can be captured as a Data Asset. + +Use this mental model: a Reproducible Run is the standard automated execution path, distinct from interactive Cloud Workstation sessions. + +Primary sources: + +- +-