From 05dc2ffd03f34978eda199e9a00e07637ee371de Mon Sep 17 00:00:00 2001 From: Dror Hilman Date: Wed, 22 Apr 2026 11:53:54 +0200 Subject: [PATCH 1/3] skill basic --- .gitignore | 1 + README.md | 140 ++++++- codeocean/SKILL.md | 199 ++++++++++ codeocean/references/capsules.md | 205 ++++++++++ codeocean/references/cli-guide.md | 356 ++++++++++++++++++ codeocean/references/computations.md | 139 +++++++ codeocean/references/custom-metadata.md | 91 +++++ codeocean/references/data-assets.md | 227 +++++++++++ codeocean/references/errors-http-and-sdk.md | 207 ++++++++++ .../references/errors-resource-states.md | 199 ++++++++++ codeocean/references/mcp-guide.md | 91 +++++ codeocean/references/mcp-server-install.md | 92 +++++ codeocean/references/mcp-tools-catalog.md | 210 +++++++++++ codeocean/references/permissions.md | 93 +++++ codeocean/references/pipelines.md | 104 +++++ codeocean/references/sdk-guide.md | 228 +++++++++++ codeocean/references/search-and-pagination.md | 155 ++++++++ codeocean/references/setup-and-auth.md | 87 +++++ codeocean/references/user-guide/capsule.md | 12 + .../user-guide/cloud-workstation.md | 12 + .../references/user-guide/computation.md | 12 + codeocean/references/user-guide/data-asset.md | 12 + codeocean/references/user-guide/pipeline.md | 12 + .../references/user-guide/reproducible-run.md | 12 + 24 files changed, 2894 insertions(+), 2 deletions(-) create mode 100644 .gitignore create mode 100644 codeocean/SKILL.md create mode 100644 codeocean/references/capsules.md create mode 100644 codeocean/references/cli-guide.md create mode 100644 codeocean/references/computations.md create mode 100644 codeocean/references/custom-metadata.md create mode 100644 codeocean/references/data-assets.md create mode 100644 codeocean/references/errors-http-and-sdk.md create mode 100644 codeocean/references/errors-resource-states.md create mode 100644 codeocean/references/mcp-guide.md create mode 100644 codeocean/references/mcp-server-install.md create mode 100644 
codeocean/references/mcp-tools-catalog.md create mode 100644 codeocean/references/permissions.md create mode 100644 codeocean/references/pipelines.md create mode 100644 codeocean/references/sdk-guide.md create mode 100644 codeocean/references/search-and-pagination.md create mode 100644 codeocean/references/setup-and-auth.md create mode 100644 codeocean/references/user-guide/capsule.md create mode 100644 codeocean/references/user-guide/cloud-workstation.md create mode 100644 codeocean/references/user-guide/computation.md create mode 100644 codeocean/references/user-guide/data-asset.md create mode 100644 codeocean/references/user-guide/pipeline.md create mode 100644 codeocean/references/user-guide/reproducible-run.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..8b13789 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ + diff --git a/README.md b/README.md index b7a5dda..14f0ce9 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,138 @@ -# skills -Code Ocean skills +# Code Ocean Skills + +Skills for AI coding agents to interact with the [Code Ocean](https://codeocean.com) computational research platform. + +## Available Skills + +### `codeocean` + +Teaches AI agents how to use the Code Ocean API through three access methods: + +- **MCP Server** (26 tools) — primary interface for AI agents +- **Python SDK** (`codeocean` package) — for writing Python scripts +- **REST API** (curl/wget) — for shell commands and manual API calls + +Covers capsules, pipelines, computations, data assets, custom metadata, authentication, permissions, search, and pagination. + +## Prerequisites + +1. A Code Ocean account with API access +2. An API access token (generated from Account > Access Tokens) +3. One or more of: + - The Code Ocean MCP server (`codeocean-mcp-server`) for agent-based interaction + - The Python SDK (`pip install codeocean`) for programmatic access + - `curl` for direct REST API calls + +## Quick Start + +### 1. Generate an API Token + +1. 
Sign into your Code Ocean instance +2. Go to **Account > Access Tokens > Generate New Token** +3. Select scopes (Capsule Read/Write, Datasets Read/Write) +4. Copy the token immediately — it is only shown once + +### 2. Install the MCP Server (for AI agents) + +Install `uv` and Python 3.10+, then configure your agent. Example for Claude Desktop: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://your-instance.codeocean.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +See `codeocean/references/mcp-server-install.md` for configs for VS Code, Cline, Roo Code, Cursor, and Windsurf. + +### 3. Install the Skill + +- This skill lives in the [`codeocean/skills`](https://github.com/codeocean/skills) repository. +- The skill folder is at [``](https://github.com/codeocean/skills/tree/main/). The entry point is `SKILL.md`. +- **Install the entire skill folder**, not just `SKILL.md` — supporting files (references, templates, examples) are referenced by the entry point. +- The fastest cross-agent option is [`gh skill`](#github-cli-universal-installer) if available. + +| Agent | Install method | Exact path / command | Notes | +|-------|---------------|---------------------|-------| +| **Claude Code** | Manual folder copy | Project: `.claude/skills//`
User: `~/.claude/skills//`
Plugin: `/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). Claude watches these directories automatically. | +| **Codex** | A. Manual folder copy
B. Plugin install | A. `$CWD/.agents/skills//`
`$REPO_ROOT/.agents/skills//`
`$HOME/.agents/skills//`
`/etc/codex/skills//`
B. In-app: add from plugin directory
CLI: `/plugins` → Install plugin | Clone/download from `codeocean/skills`, copy the folder at ``. Codex supports skills natively; packaged distribution is often via plugins. | +| **Cursor** | Plugin-first | Install plugin from marketplace / team marketplace | No official direct raw GitHub skill install documented. Use `gh skill` row below for GitHub-based install. | +| **OpenCode** | Manual folder copy | `.opencode/skills//`
`~/.config/opencode/skills//`
Also compatible:
`.claude/skills//`
`~/.claude/skills//`
`.agents/skills//`
`~/.agents/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). | +| **Antigravity** | Manual folder copy | `.agents/skills//`
`~/.gemini/antigravity/skills//` | Defaults to `.agents/skills`. Copy full folder. | +| **Windsurf** | Manual folder copy | `.windsurf/skills//`
`~/.codeium/windsurf/skills//`
Enterprise: macOS `/Library/Application Support/Windsurf/skills/`, Linux/WSL `/etc/windsurf/skills/`, Windows `C:\ProgramData\Windsurf\skills\` | Each skill is a subdirectory containing `SKILL.md`. Copy full folder. | +| **GitHub Copilot CLI** | Manual folder copy | Project: `.github/skills//`, `.claude/skills//`, `.agents/skills//`
Personal: `~/.copilot/skills//`, `~/.claude/skills//`, `~/.agents/skills//` | Clone/download from `codeocean/skills`, copy the full folder at ``. | +| **VS Code / Copilot agent plugins** | Plugin from Git source | Run `Chat: Install Plugin From Source` → enter Git repo URL | This is for **plugins**, not raw skill folders. Applies only if the skill is wrapped as a plugin. Does not apply to raw skill repos like `codeocean/skills`. | +| **Gemini CLI** | Native GitHub install | `gemini skills install https://github.com/codeocean/skills.git --path `
`gemini skills install /path/to/local/ --scope workspace`
`gemini skills link /path/to/local/ --scope workspace` | Supports Git repo, local dir, zipped `.skill`, monorepo subpath, workspace/user scope. Use `--path` for monorepo subpath. | +| **Cline** | Manual folder copy | `.cline/skills//`
`~/.cline/skills//` | Enable Skills in Settings → Features → Enable Skills. Experimental. | +| **Kiro IDE** | Native GitHub import | Agent Steering & Skills → `+` → Import a skill → GitHub → paste `https://github.com/codeocean/skills/tree/main/` | URL must point to the subdirectory, not the repo root. Imported skills are copied into the skills directory. | +| **Kiro CLI** | Manual folder copy | `.kiro/skills//`
`~/.kiro/skills//` | Default agent auto-loads skills. Custom agents need `skill://` resources configured. | +| **`gh skill` (GitHub CLI)** | Universal GitHub install | `gh skill install codeocean/skills `
`gh skill install codeocean/skills --agent claude-code`
`gh skill install codeocean/skills --agent cursor`
`gh skill install codeocean/skills --agent codex`
`gh skill install codeocean/skills --agent gemini`
`gh skill install codeocean/skills --agent antigravity` | Installs to the correct host directory automatically. Can pin versions/commits. Cleanest cross-agent GitHub-hosted option. | + +#### Shared patterns + +- **Manual folder copy**: Claude Code, Codex, OpenCode, Antigravity, Windsurf, GitHub Copilot CLI, Cline, Kiro CLI — copy the skill directory (containing `SKILL.md` and supporting files) into the agent's watched skills path. +- **Native GitHub import/install**: Gemini CLI (`gemini skills install`), Kiro IDE (GitHub import UI) — install directly from `codeocean/skills` repo. +- **Plugin-first**: Cursor, VS Code / Copilot agent plugins — skill distribution is via marketplace or Git-source plugins, not raw skill folders. +- **Universal GitHub installer**: `gh skill install codeocean/skills ` — routes to the correct agent directory automatically; works across Claude Code, Codex, Cursor, Gemini, Antigravity. + +#### References + +- [Claude Code — Skills](https://code.claude.com/docs/en/skills) +- [Codex — Skills](https://developers.openai.com/codex/skills) +- [Codex — Plugins](https://developers.openai.com/codex/plugins) +- [OpenCode — Skills](https://opencode.ai/docs/skills) +- [Antigravity — Skills](https://antigravity.google/docs/skills) +- [Windsurf — Skills](https://docs.windsurf.com/windsurf/cascade/skills) +- [GitHub Copilot CLI — Skills](https://docs.github.com/en/copilot/how-tos/copilot-cli/customize-copilot/add-skills) +- [VS Code — Agent plugins](https://code.visualstudio.com/docs/copilot/customization/agent-plugins) +- [Gemini CLI — Skills](https://geminicli.com/docs/cli/skills/) +- [Cline — Skills](https://docs.cline.bot/customization/skills) +- [Kiro IDE — Skills](https://kiro.dev/docs/skills/) +- [Kiro CLI — Skills](https://kiro.dev/docs/cli/skills/) +- [`gh skill` — GitHub CLI](https://github.blog/changelog/2026-04-16-manage-agent-skills-with-github-cli/) + +## Skill Structure + +``` +codeocean/ +├── SKILL.md # Main skill — workflows, decision 
tree, concepts +└── references/ + ├── mcp-guide.md # MCP workflow patterns and anti-patterns + ├── mcp-tools-catalog.md # All 26 MCP tools with parameter schemas + ├── mcp-server-install.md # Install configs for 6 editors/agents + ├── cli-guide.md # curl/wget endpoint reference + ├── sdk-guide.md # Python SDK setup and examples + ├── setup-and-auth.md # Token generation and environment variables + ├── capsules.md # Capsule data model and operations + ├── pipelines.md # Pipeline-specific operations + ├── computations.md # Running, waiting, result retrieval + ├── data-assets.md # Data asset creation and lifecycle + ├── custom-metadata.md # Admin-defined metadata schema + ├── search-and-pagination.md # Query syntax and pagination + └── permissions.md # User/group access control +``` + +## How It Works + +The skill uses **progressive disclosure**: + +1. **SKILL.md** loads when the skill triggers — contains workflow patterns and a decision tree +2. **Reference files** load on demand — agents only read the files relevant to their current task + +This keeps context window usage efficient while providing deep coverage of the entire Code Ocean API. + +## Links + +- [Code Ocean User Guide](https://docs.codeocean.com/user-guide) — platform documentation +- [Code Ocean API Documentation](https://docs.codeocean.com/user-guide/code-ocean-api) — REST API reference +- [Code Ocean Python SDK](https://github.com/codeocean/codeocean-sdk-python) — GitHub repo +- [Code Ocean MCP Server](https://github.com/codeocean/codeocean-mcp-server) — GitHub repo diff --git a/codeocean/SKILL.md b/codeocean/SKILL.md new file mode 100644 index 0000000..08c9376 --- /dev/null +++ b/codeocean/SKILL.md @@ -0,0 +1,199 @@ +--- +name: codeocean +description: "Guide for interacting with the Code Ocean computational research platform via its MCP server (25 tools), Python SDK, and REST API (curl). 
Use when an agent or user needs to: (1) search, run, or manage capsules and pipelines, (2) manage computations and retrieve results, (3) create, search, or manage data assets, (4) write Python scripts using the codeocean SDK, (5) guide users through curl/REST API calls to Code Ocean, (6) set up Code Ocean API authentication and MCP server configuration, or (7) orchestrate end-to-end computational workflows on Code Ocean." +--- + +# Code Ocean Skill + +## 1. Overview + +Code Ocean resources covered by this skill: + +- **Capsules**: runnable computational units. +- **Pipelines**: multi-step workflows with their own `/pipelines/...` API surface. +- **Computations**: capsule or pipeline runs. +- **Data Assets**: immutable datasets, results, combined assets, and models. +- **Custom Metadata**: deployment-defined schema for data assets. + +Three access methods: + +| Method | When to use | Reference | +|--------|------------|-----------| +| **MCP Server** (25 tools) | Primary for agentic workflows | [mcp-guide.md](references/mcp-guide.md), [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | +| **Python SDK** (`codeocean`) | Python scripts and typed integrations | [sdk-guide.md](references/sdk-guide.md) | +| **REST API** (curl) | Shell automation and raw HTTP | [cli-guide.md](references/cli-guide.md) | +| **Setup/Auth** | Token generation, compatibility, MCP install | [setup-and-auth.md](references/setup-and-auth.md), [mcp-server-install.md](references/mcp-server-install.md) | +| **Errors** | Interpreting failures and choosing the next action | [errors-http-and-sdk.md](references/errors-http-and-sdk.md), [errors-resource-states.md](references/errors-resource-states.md) | +| **User Guide Concepts** | Short product-level meanings from the user guide | [user-guide/](references/user-guide/) | + +## 2. Core Workflows + +### Workflow 1: Find and Run a Capsule via MCP + +1. `search_capsules(search_params={...})` +2. `get_capsule(capsule_id)` if you need full metadata +3. 
`get_capsule_app_panel(capsule_id)` before running +4. `run_capsule(run_params={...})` +5. `wait_until_completed(computation_id)` +6. `list_computation_results(computation_id)` +7. `get_result_file_urls(...)` or `download_and_read_a_file_from_computation(...)` + +Minimal run payload: + +```json +{ + "capsule_id": "", + "data_assets": [{"id": "", "mount": ""}], + "parameters": ["value1", "value2"], + "named_parameters": [{"param_name": "threshold", "value": "0.5"}] +} +``` + +### Workflow 2: Capture Results as a Data Asset + +1. `create_data_asset(data_asset_params={...})` +2. `wait_until_ready(data_asset_object)` +3. `get_data_asset(data_asset_id)` if you need the refreshed object + +`wait_until_ready` takes the full `DataAsset` object in both MCP and SDK, not just the ID. + +### Workflow 3: Run a Pipeline + +Current MCP support is partial: + +1. `search_pipelines(search_params={...})` to find the pipeline +2. Use SDK or REST for guaranteed pipeline metadata/app-panel access via `/pipelines/...` +3. `run_capsule(run_params={pipeline_id: ..., processes: [...]})` +4. `wait_until_completed(...)` +5. `list_computation_results(...)` + +Pipeline run payload: + +```json +{ + "pipeline_id": "", + "data_assets": [{"id": "", "mount": ""}], + "processes": [ + {"name": "step1", "parameters": ["val1"]}, + {"name": "step2", "named_parameters": [{"param_name": "k", "value": "v"}]} + ] +} +``` + +### Workflow 4: Explore Data Assets + +1. `search_data_assets(search_params={query: "...", type: "dataset"})` +2. `get_data_asset(data_asset_id)` +3. `list_data_asset_files(data_asset_id, path="")` +4. 
`get_data_asset_file_urls(...)` or `download_and_read_a_file_from_data_asset(...)` + +### Workflow 5: Attach and Detach Data Assets + +| Context | Attach | Detach | +|---------|--------|--------| +| Capsule | `attach_data_assets(capsule_id, attach_params=[...])` | `detach_data_assets(capsule_id, data_assets=[...])` | +| Cloud workstation computation | `attach_computation_data_assets(computation_id, attach_params=[...])` | `detach_computation_data_assets(computation_id, data_assets=[...])` | + +Attach expects objects like `{id, mount?}`. Detach expects plain ID strings. + +## 3. Key Concepts + +### RunParams + +`RunParams` fields in the local SDK are: + +- `capsule_id` +- `pipeline_id` +- `version` +- `resume_run_id` +- `nextflow_profile` +- `data_assets` +- `parameters` +- `named_parameters` +- `processes` + +### Search Fields + +`query` supports free text plus `field:value` filters. + +- Capsule query fields: `id`, `name`, `doi`, `tag`, `field`, `affiliation`, `journal`, `article`, `author` +- Data asset query fields: `name`, `tag`, `run_script`, `commit_id`, `contained_data_id` + +Structured filters are separate from `query`: + +- Capsule: `ownership`, `status`, `favorite`, `archived`, `sort_field`, `sort_order`, `filters` +- Data asset: `type`, `origin`, `ownership`, `favorite`, `archived`, `sort_field`, `sort_order`, `filters` + +### MCP Compact Search Format + +- Capsules/pipelines: `id`, `n`, `s`, `d`, `t` +- Data assets: `id`, `n`, `d`, `t` + +Response envelope fields: + +- `items` +- `has_more` +- `next_token` +- `item_count` +- optional `field_names` + +Descriptions are truncated to 200 characters. Tags are limited to 10 entries. + +### File Reading Limit + +The MCP `download_and_read_*` helpers read and decode the first `50_000` bytes of the remote file response. Use `get_*_file_urls` when you need the complete file. 
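
The file reading limit above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the MCP server or the SDK: `may_be_truncated` and `download_complete_file` are hypothetical helper names, and the download URL is assumed to be the `download_url` value returned by one of the `get_*_file_urls` tools.

```python
import shutil
import urllib.request
from pathlib import Path

# Limit quoted in this skill for the MCP download_and_read_* helpers.
MCP_READ_LIMIT_BYTES = 50_000


def may_be_truncated(text: str, limit: int = MCP_READ_LIMIT_BYTES) -> bool:
    """Rough heuristic (an assumption, not server behavior): text whose UTF-8
    size is at or near the limit was probably cut off by the helper.
    The 3-byte margin allows for a multi-byte character split at the cut."""
    return len(text.encode("utf-8")) >= limit - 3


def download_complete_file(download_url: str, dest: Path) -> int:
    """Stream the whole file behind a download URL to dest; return its size in bytes."""
    with urllib.request.urlopen(download_url) as response, dest.open("wb") as out:
        shutil.copyfileobj(response, out)
    return dest.stat().st_size
```

An agent could call `may_be_truncated(...)` on the text returned by a `download_and_read_*` helper and, when it returns `True`, fetch the URL tools' `download_url` and pass it to `download_complete_file(...)`.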
+ +### Error Interpretation + +Code Ocean failures come from two different layers: + +- HTTP/API failures: `400`, `401`, `403`, `404`, `429`, `5xx`, SDK `Error`, curl non-2xx responses +- Resource-state failures: a computation or data asset request succeeds, but the returned object is in a failed terminal state + +When an agent sees an error, it should first classify it: + +- If the request itself failed, load [errors-http-and-sdk.md](references/errors-http-and-sdk.md) +- If the request succeeded but the resource state is bad, load [errors-resource-states.md](references/errors-resource-states.md) + +### Compatibility Note + +The public docs and local repos are not perfectly aligned: + +- Public Python SDK docs currently say Python `>=3.11` and Code Ocean `>=2.19`. +- Local SDK `pyproject.toml` declares Python `>=3.9`. +- Local SDK client sends `Min-Server-Version: 4.3.0`. +- Local MCP package requires Python `>=3.10` and depends on `codeocean>=0.14.0,<0.15.0`. + +When they disagree, prefer the local SDK/MCP source for current callable names and payload shapes, and mention the public-doc mismatch explicitly. + +## 4. Reference Index + +| File | Load when... 
| +|------|-------------| +| [mcp-guide.md](references/mcp-guide.md) | MCP workflows and caveats | +| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | Exact MCP tool names and params | +| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, imports, examples | +| [cli-guide.md](references/cli-guide.md) | curl routes, methods, and payloads | +| [capsules.md](references/capsules.md) | Capsule model, search, app panel, permissions | +| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | +| [computations.md](references/computations.md) | Runs, polling, results, cloud workstations | +| [data-assets.md](references/data-assets.md) | Data asset model and lifecycle | +| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination behavior | +| [permissions.md](references/permissions.md) | Permissions model and routes | +| [custom-metadata.md](references/custom-metadata.md) | Custom metadata schema | +| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | +| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | +| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | +| [mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | +| [user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | +| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | +| [user-guide/data-asset.md](references/user-guide/data-asset.md) | Product-level definition of a Data Asset | +| [user-guide/computation.md](references/user-guide/computation.md) | Product-level definition of a Computation | +| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | +| 
[user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | + +## 5. External Links + +- [Code Ocean User Guide](https://docs.codeocean.com/user-guide) +- [Code Ocean MCP Server](https://github.com/codeocean/codeocean-mcp-server) +- [Code Ocean Python SDK](https://github.com/codeocean/codeocean-sdk-python) diff --git a/codeocean/references/capsules.md b/codeocean/references/capsules.md new file mode 100644 index 0000000..0a67892 --- /dev/null +++ b/codeocean/references/capsules.md @@ -0,0 +1,205 @@ +# Capsules Reference + +## Data Model + +`Capsule` fields in the local SDK model include: + +- `id` +- `created` +- `name` +- `status` +- `owner` +- `slug` +- `owner_email` +- `last_accessed` +- `article` +- `cloned_from_url` +- `description` +- `field` +- `tags` +- `original_capsule` +- `release_capsule` +- `submission` +- `versions` + +## Search + +MCP: + +```text +search_capsules(search_params={query: "RNA-seq", limit: 10}) +``` + +SDK: + +```python +from codeocean.capsule import CapsuleSearchParams + +results = client.capsules.search_capsules( + CapsuleSearchParams(query="RNA-seq", limit=10) +) +``` + +Query fields from `CapsuleSearchParams`: + +- `id` +- `name` +- `doi` +- `tag` +- `field` +- `affiliation` +- `journal` +- `article` +- `author` + +Structured search params: + +- `sort_field`: `created`, `last_accessed`, `name` +- `sort_order`: `asc`, `desc` +- `ownership`: `private`, `created`, `shared` +- `status`: `release`, `non_release` +- `favorite` +- `archived` +- `filters` + +## Get Capsule + +SDK: + +```python +capsule = client.capsules.get_capsule(capsule_id) +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +MCP: + +```text +get_capsule(capsule_id) +``` + +## App Panel + +SDK: + +```python +app_panel = client.capsules.get_capsule_app_panel(capsule_id, version=None) +``` + +REST: + +```bash +curl -u "$TOKEN:" 
"$DOMAIN/api/v1/capsules/$CAPSULE_ID/app_panel" +``` + +Relevant app-panel model sections: + +- `general` +- `data_assets` +- `categories` +- `parameters` +- `results` +- `processes` + +Relevant nested classes from the SDK: + +- `AppPanelGeneral` +- `AppPanelDataAsset` +- `AppPanelCategories` +- `AppPanelParameters` +- `AppPanelResult` +- `AppPanelProcess` + +App-panel enums: + +- `AppPanelDataAssetKind`: `internal`, `external`, `combined` +- `AppPanelParameterType`: `text`, `list`, `file` + +## Computations for a Capsule + +SDK: + +```python +computations = client.capsules.list_computations(capsule_id) +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID/computations" +``` + +MCP: + +```text +list_computations(capsule_id) +``` + +## Attach and Detach Data Assets + +Attach shape: + +```python +from codeocean.data_asset import DataAssetAttachParams + +client.capsules.attach_data_assets( + capsule_id, + [DataAssetAttachParams(id="data-asset-uuid", mount="input")], +) +``` + +Detach shape: + +```python +client.capsules.detach_data_assets(capsule_id, ["data-asset-uuid"]) +``` + +REST routes: + +- `POST /capsules/{id}/data_assets` with a list of attach objects +- `DELETE /capsules/{id}/data_assets` with a list of ID strings + +## Permissions + +Permissions are SDK/REST only, not MCP. 
+ +Imports: + +```python +from codeocean.models.components import ( + Permissions, + UserPermissions, + GroupPermissions, + UserRole, + GroupRole, + EveryoneRole, +) +``` + +SDK methods: + +- `client.capsules.get_permissions(capsule_id)` +- `client.capsules.update_permissions(capsule_id, permissions)` + +REST routes: + +- `GET /capsules/{id}/permissions` +- `POST /capsules/{id}/permissions` + +## Archive and Delete + +SDK: + +- `client.capsules.archive_capsule(capsule_id, archive=True)` +- `client.capsules.delete_capsule(capsule_id)` + +REST: + +- `PATCH /capsules/{id}/archive?archive=true` +- `DELETE /capsules/{id}` + +The local SDK/doc sources confirm the methods and routes above. They do not establish an additional delete precondition beyond normal API permissions. diff --git a/codeocean/references/cli-guide.md b/codeocean/references/cli-guide.md new file mode 100644 index 0000000..b10924d --- /dev/null +++ b/codeocean/references/cli-guide.md @@ -0,0 +1,356 @@ +# Code Ocean CLI Guide + +Base URL: `https://{domain}/api/v1/` + +## Authentication + +Code Ocean uses HTTP Basic Auth with the access token as the username and an empty password. + +```bash +export CODEOCEAN_DOMAIN="https://codeocean.acme.com" +export CODEOCEAN_TOKEN="cop_xxxxx" + +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/custom_metadata" +``` + +For JSON request bodies, add `-H "Content-Type: application/json"`. + +For status-code meanings and how to react to failures, also load [errors-http-and-sdk.md](errors-http-and-sdk.md). 
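
For callers that are not using curl, the same authentication scheme can be sketched in Python with only the standard library. This is an illustrative sketch, not SDK code: `basic_auth_header` and `build_request` are hypothetical helper names, and the example only builds the request object without sending it.

```python
import base64
import urllib.request


def basic_auth_header(token: str) -> str:
    """Return the Authorization header value for HTTP Basic Auth with the
    access token as the username and an empty password (same as `curl -u "$TOKEN:"`)."""
    credentials = f"{token}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_request(domain: str, token: str, path: str) -> urllib.request.Request:
    """Build, but do not send, a GET request under the /api/v1/ base URL."""
    url = f"{domain.rstrip('/')}/api/v1/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"Authorization": basic_auth_header(token)})
```

Passing the result of `build_request("https://codeocean.acme.com", token, "custom_metadata")` to `urllib.request.urlopen` would issue the same call as the curl example above. The trailing colon in the credentials matters: it is what marks the password as empty.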
+ +## Capsules + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"my capsule\" tag:genomics","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/search" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/app_panel" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/computations" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/data_assets" +``` + +Permissions: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/permissions" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"users":[{"email":"user@example.com","role":"editor"}],"everyone":"discoverable","share_assets":true}' \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/permissions" +``` + +Archive: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID/archive?archive=true" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +## Pipelines + +Pipeline routes use `/pipelines/...`. 
+ +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"my pipeline\"","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/search" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/app_panel" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/computations" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/pipelines/$PIPELINE_ID/data_assets" +``` + +## Computations + +Run a capsule: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "capsule_id":"CAPSULE_ID", + "parameters":["value1","value2"], + "named_parameters":[{"param_name":"threshold","value":"0.5"}], + "data_assets":[{"id":"DATA_ASSET_ID","mount":"input_data"}] + }' \ + "$CODEOCEAN_DOMAIN/api/v1/computations" +``` + +Run a pipeline: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "pipeline_id":"PIPELINE_ID", + "processes":[ + {"name":"process_name","parameters":["value1"]}, + {"name":"process_name_2","named_parameters":[{"param_name":"param1","value":"value1"}]} + ] + }' \ + "$CODEOCEAN_DOMAIN/api/v1/computations" +``` + +Get a computation: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID" +``` + +List results uses `POST` with a JSON body containing `path`: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/results" +``` + +Get result file URLs: + 
+```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/results/urls?path=output.csv" +``` + +Rename: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID?name=my-analysis-run" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID" +``` + +Cloud workstation data assets: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '[{"id":"DATA_ASSET_ID","mount":"work"}]' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/data_assets" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + -H "Content-Type: application/json" \ + -d '["DATA_ASSET_ID"]' \ + "$CODEOCEAN_DOMAIN/api/v1/computations/$COMPUTATION_ID/data_assets" +``` + +## Data Assets + +Search: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"reference genome\"","type":"dataset","limit":20}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/search" +``` + +Get: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +Create from computation results: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "name":"Analysis Results", + "mount":"analysis_results", + "tags":["rna-seq","results"], + "description":"Output from RNA-seq pipeline", + "source":{"computation":{"id":"COMPUTATION_ID"}} + }' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets" +``` + +Create from S3: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "name":"External Reference Data", + "mount":"reference", + "tags":["reference"], + "description":"Reference genome from S3", + "source":{"aws":{"bucket":"my-bucket","prefix":"reference/genome/"}} + }' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets" +``` + +Update metadata: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PUT \ + -H 
"Content-Type: application/json" \ + -d '{"name":"Updated Name","description":"Updated description","tags":["updated-tag"]}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +List files uses `POST` with `{"path": ...}`: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/files" +``` + +Get file URLs: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/files/urls?path=data.csv" +``` + +Permissions: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/permissions" +``` + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"users":[{"email":"user@example.com","role":"editor"}],"groups":[{"group":"research-team","role":"viewer"}],"everyone":"discoverable","share_assets":true}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/permissions" +``` + +Archive: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X PATCH \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/archive?archive=true" +``` + +Delete: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X DELETE \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID" +``` + +Transfer: + +```bash +curl -u "$CODEOCEAN_TOKEN:" -X POST \ + -H "Content-Type: application/json" \ + -d '{"target":{"aws":{"bucket":"new-bucket","prefix":"new-prefix/"}},"force":false}' \ + "$CODEOCEAN_DOMAIN/api/v1/data_assets/$DATA_ASSET_ID/transfer" +``` + +## Shell Workflow Example + +```bash +#!/usr/bin/env bash +set -euo pipefail + +DOMAIN="$CODEOCEAN_DOMAIN" +TOKEN="$CODEOCEAN_TOKEN" + +CAPSULE_ID=$(curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"query":"name:\"RNA-seq Analysis\"","limit":1}' \ + "$DOMAIN/api/v1/capsules/search" | jq -r '.results[0].id') + +COMPUTATION_ID=$(curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d 
"{\"capsule_id\":\"$CAPSULE_ID\",\"parameters\":[\"hg38\"]}" \ + "$DOMAIN/api/v1/computations" | jq -r '.id') + +while true; do + STATE=$(curl -s -u "$TOKEN:" "$DOMAIN/api/v1/computations/$COMPUTATION_ID" | jq -r '.state') + if [ "$STATE" = "completed" ] || [ "$STATE" = "failed" ]; then + break + fi + sleep 5 +done + +curl -s -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{"path":""}' \ + "$DOMAIN/api/v1/computations/$COMPUTATION_ID/results" | jq '.items[].path' + +URL=$(curl -s -u "$TOKEN:" \ + "$DOMAIN/api/v1/computations/$COMPUTATION_ID/results/urls?path=output.csv" | jq -r '.download_url') +curl -o output.csv "$URL" +``` diff --git a/codeocean/references/computations.md b/codeocean/references/computations.md new file mode 100644 index 0000000..e177106 --- /dev/null +++ b/codeocean/references/computations.md @@ -0,0 +1,139 @@ +# Computations Reference + +## Data Model + +`Computation` fields: + +- `id` +- `created` +- `name` +- `owner` +- `run_time` +- `state` +- `owner_email` +- `cloud_workstation` +- `data_assets` +- `parameters` +- `nextflow_profile` +- `processes` +- `end_status` +- `exit_code` +- `has_results` + +Enums: + +- `ComputationState`: `initializing`, `running`, `finalizing`, `completed`, `failed` +- `ComputationEndStatus`: `succeeded`, `failed`, `stopped` + +## RunParams + +`RunParams` fields in the local SDK: + +- `capsule_id` +- `pipeline_id` +- `version` +- `resume_run_id` +- `nextflow_profile` +- `data_assets` +- `parameters` +- `named_parameters` +- `processes` + +Nested run models: + +- `DataAssetsRunParam`: `id`, `mount` +- `NamedRunParam`: `param_name`, `value` +- `PipelineProcessParams`: `name`, `parameters`, `named_parameters` + +## Running + +MCP: + +```text +run_capsule({ + capsule_id: "capsule-uuid", + parameters: ["val1", "val2"], + data_assets: [{id: "data-uuid", mount: "data"}] +}) +``` + +SDK: + +```python +from codeocean.computation import RunParams, DataAssetsRunParam + +computation = 
client.computations.run_capsule( + RunParams( + capsule_id="capsule-uuid", + parameters=["val1", "val2"], + data_assets=[DataAssetsRunParam(id="data-uuid", mount="data")], + ) +) +``` + +Pipeline runs also use `run_capsule(...)`, with `pipeline_id`. + +## Waiting for Completion + +MCP: + +```text +wait_until_completed(computation_id) +``` + +SDK: + +```python +completed = client.computations.wait_until_completed( + computation, + polling_interval=5, + timeout=300, +) +``` + +SDK note: + +- the method takes a full `Computation` object +- minimum polling interval is `5` + +MCP note: + +- the tool takes `computation_id` +- it internally calls `get_computation(computation_id)` first, then the SDK polling method + +## Result Files + +List results: + +- SDK: `client.computations.list_computation_results(computation_id, path="")` +- REST: `POST /computations/{id}/results` with `{"path": ""}` +- MCP: `list_computation_results(computation_id)` + +Get file URLs: + +- SDK: `client.computations.get_result_file_urls(computation_id, path)` +- REST: `GET /computations/{id}/results/urls?path=...` +- MCP: `get_result_file_urls(computation_id, file_path)` +- Return shape: `{download_url, view_url}` + +Read file content: + +- MCP: `download_and_read_a_file_from_computation(computation_id, file_path)` +- Current helper behavior: reads and decodes the first `50_000` bytes + +## Rename and Delete + +- SDK: `rename_computation(computation_id, name)` +- SDK: `delete_computation(computation_id)` +- REST: `PATCH /computations/{id}?name=...` +- REST: `DELETE /computations/{id}` + +## Cloud Workstation Data Assets + +Use computation-level attach/detach APIs for cloud workstation sessions: + +- SDK: `client.computations.attach_data_assets(computation_id, attach_params)` +- SDK: `client.computations.detach_data_assets(computation_id, data_assets)` +- REST: `POST /computations/{id}/data_assets` +- REST: `DELETE /computations/{id}/data_assets` +- MCP: `attach_computation_data_assets(...)`, 
`detach_computation_data_assets(...)` diff --git a/codeocean/references/custom-metadata.md b/codeocean/references/custom-metadata.md new file mode 100644 index 0000000..5191881 --- /dev/null +++ b/codeocean/references/custom-metadata.md @@ -0,0 +1,91 @@ +# Custom Metadata Reference + +## Schema Types + +Types from [`custom_metadata.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/custom_metadata.py:1): + +- `CustomMetadata` +- `CustomMetadataField` +- `CustomMetadataFieldType` +- `CustomMetadataFieldRange` + +`CustomMetadataFieldType` values: + +- `string` +- `number` +- `date` + +`CustomMetadataField` fields: + +- `name` +- `type` +- `range` +- `allowed_values` +- `multiple` +- `units` +- `category` +- `required` + +`allowed_values` can be either: + +- `list[str]` +- `list[float]` + +`CustomMetadata` fields: + +- `fields` +- `categories` + +## Get Schema + +SDK: + +```python +schema = client.custom_metadata.get_custom_metadata() +``` + +MCP: + +```text +get_custom_metadata() +``` + +REST: + +```bash +curl -u "$TOKEN:" "$DOMAIN/api/v1/custom_metadata" +``` + +Route: `GET /custom_metadata` + +## Using Custom Metadata on Data Assets + +Custom metadata values are supplied inside the `custom_metadata` dict when creating or updating a data asset. 
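Before sending values, it can help to check them against the schema locally. The following is a minimal sketch, assuming the schema has been reduced to plain dicts carrying the `name`, `type`, and `multiple` fields listed above, and assuming date values are integer timestamps (as the `experiment_date` examples in this reference suggest). `check_custom_metadata` is a hypothetical helper, not an SDK function.

```python
# Hypothetical helper: not part of the codeocean SDK.
# Assumes each schema field is a plain dict with "name", "type", and "multiple".
# Assumes date fields carry integer timestamps.

def _matches(field_type, value):
    """Return True if a single value fits the given field type."""
    if isinstance(value, bool):  # bool is a subclass of int; reject it explicitly
        return False
    if field_type == "string":
        return isinstance(value, str)
    if field_type == "number":
        return isinstance(value, (int, float))
    if field_type == "date":
        return isinstance(value, int)
    return False


def check_custom_metadata(schema_fields, values):
    """Return a list of human-readable problems; an empty list means none found."""
    by_name = {field["name"]: field for field in schema_fields}
    problems = []
    for name, value in values.items():
        field = by_name.get(name)
        if field is None:
            problems.append(f"{name}: not in schema")
            continue
        items = value if isinstance(value, list) else [value]
        if isinstance(value, list) and not field.get("multiple", False):
            problems.append(f"{name}: list given for a single-value field")
        if not all(_matches(field["type"], item) for item in items):
            problems.append(f"{name}: expected {field['type']}")
    return problems


schema_fields = [
    {"name": "species", "type": "string", "multiple": True},
    {"name": "sample_count", "type": "number", "multiple": False},
    {"name": "experiment_date", "type": "date", "multiple": False},
]

print(check_custom_metadata(schema_fields, {"species": ["mouse", "rat"], "sample_count": 42}))
print(check_custom_metadata(schema_fields, {"sample_count": "42", "batch": 1}))
```

The check is deliberately shallow: it ignores `range`, `allowed_values`, and `required`, which would need the real schema returned by `get_custom_metadata()`.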
+ +Examples: + +- string field: `"species": "mouse"` +- multi-string field: `"species": ["mouse", "rat"]` +- number field: `"sample_count": 42` +- multi-number field: `"thresholds": [0.1, 0.2]` +- date field: `"experiment_date": 1700000000` + +Create example: + +```text +create_data_asset({ + name: "Sample Dataset", + mount: "sample_dataset", + tags: ["experiment", "genomics"], + source: {computation: {id: "comp-uuid"}}, + custom_metadata: {"species": "human", "sample_count": 42} +}) +``` + +Update example: + +```text +update_metadata(data_asset_id, { + custom_metadata: {"species": "mouse", "experiment_date": 1700000000} +}) +``` diff --git a/codeocean/references/data-assets.md b/codeocean/references/data-assets.md new file mode 100644 index 0000000..51541e5 --- /dev/null +++ b/codeocean/references/data-assets.md @@ -0,0 +1,227 @@ +# Data Assets Reference + +## Data Model + +`DataAsset` fields in the local SDK: + +- `id` +- `created` +- `name` +- `mount` +- `last_used` +- `owner` +- `state` +- `type` +- `files` +- `size` +- `description` +- `owner_email` +- `tags` +- `provenance` +- `source_bucket` +- `custom_metadata` +- `app_parameters` +- `nextflow_profile` +- `contained_data_assets` +- `last_transferred` +- `transfer_error` +- `failure_reason` + +Enums: + +- `DataAssetState`: `draft`, `ready`, `failed` +- `DataAssetType`: `dataset`, `result`, `combined`, `model` +- `DataAssetSearchOrigin`: `internal`, `external` + +Important nested models: + +- `Provenance` +- `SourceBucket` +- `ResultsInfo` +- `ContainedDataAsset` + +## Search + +MCP: + +```text +search_data_assets(search_params={query: "genomics", type: "dataset", limit: 10}) +``` + +SDK: + +```python +from codeocean.data_asset import DataAssetSearchParams, DataAssetType + +results = client.data_assets.search_data_assets( + DataAssetSearchParams(query="genomics", type=DataAssetType.Dataset, limit=10) +) +``` + +Query fields: + +- `name` +- `tag` +- `run_script` +- `commit_id` +- `contained_data_id` + 
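A small sketch of building a single-field query term in the `field:"value"` form that the REST search example uses (`name:"reference genome"`). `build_query` is a hypothetical helper; it only accepts the query fields listed above, and because this reference does not say how embedded quotes or combined terms are handled, the sketch rejects quotes instead of guessing.

```python
# Hypothetical helper: not part of the codeocean SDK.
QUERY_FIELDS = {"name", "tag", "run_script", "commit_id", "contained_data_id"}


def build_query(field, value):
    """Build a `field:"value"` query term for a data asset search."""
    if field not in QUERY_FIELDS:
        raise ValueError(f"{field!r} is not a query field; pass it as a structured param")
    if '"' in value:
        raise ValueError("double quotes in values are not handled by this sketch")
    return f'{field}:"{value}"'


print(build_query("name", "reference genome"))
```

Rejecting unknown fields locally guards against the common mistake of putting `type`, `origin`, or `ownership` into the query string.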
+Structured params: + +- `type` +- `origin` +- `ownership` +- `favorite` +- `archived` +- `sort_field` +- `sort_order` +- `filters` + +Sort fields: + +- `created` +- `type` +- `name` +- `size` + +## Files and URLs + +Get: + +- SDK: `client.data_assets.get_data_asset(data_asset_id)` +- REST: `GET /data_assets/{id}` +- MCP: `get_data_asset(data_asset_id)` + +List files: + +- SDK: `client.data_assets.list_data_asset_files(data_asset_id, path="")` +- REST: `POST /data_assets/{id}/files` with `{"path": ""}` +- MCP: `list_data_asset_files(data_asset_id, path="")` + +Get file URLs: + +- SDK: `client.data_assets.get_data_asset_file_urls(data_asset_id, path)` +- REST: `GET /data_assets/{id}/files/urls?path=...` +- MCP: `get_data_asset_file_urls(data_asset_id, file_path)` +- Return shape: `{download_url, view_url}` + +Read file content: + +- MCP: `download_and_read_a_file_from_data_asset(data_asset_id, file_path)` +- Current helper behavior: reads and decodes the first `50_000` bytes + +## Creating Data Assets + +`DataAssetParams` fields: + +- `name` +- `tags` +- `mount` +- `description` +- `source` +- `target` +- `custom_metadata` +- `data_asset_ids` +- `results_info` + +The local SDK model requires `name`, `tags`, and `mount`. 
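A cheap local check can catch a missing required field before the request is built. This is a minimal sketch that works on a plain dict payload; `missing_required` is a hypothetical helper and not part of the SDK.

```python
# Hypothetical pre-flight check: not part of the codeocean SDK.
REQUIRED_FIELDS = ("name", "tags", "mount")


def missing_required(params):
    """Return the required data asset fields that are absent from a params dict."""
    return [key for key in REQUIRED_FIELDS if key not in params]


params = {
    "name": "Analysis Results",
    "mount": "analysis_results",
    "source": {"computation": {"id": "COMPUTATION_ID"}},
}
print(missing_required(params))
```

Here the payload lacks `tags`, so the helper reports it; an empty list means all three required fields are present.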
+ +Source models: + +- `Source` +- `AWSS3Source` +- `GCPCloudStorageSource` +- `ComputationSource` +- `CloudWorkstationSource` + +Target models: + +- `Target` +- `AWSS3Target` + +Example from computation results: + +```python +from codeocean.data_asset import DataAssetParams, Source, ComputationSource + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="My Results", + mount="my_results", + tags=["results"], + source=Source(computation=ComputationSource(id="comp-uuid")), + ) +) +``` + +Combined data asset example: + +```python +from codeocean.data_asset import DataAssetParams + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="Combined Dataset", + mount="combined_dataset", + tags=["combined"], + data_asset_ids=["id1", "id2"], + ) +) +``` + +## Wait Until Ready + +SDK: + +```python +ready_asset = client.data_assets.wait_until_ready(data_asset, polling_interval=5, timeout=300) +``` + +MCP: + +```text +wait_until_ready(data_asset_object, polling_interval=5, timeout=None) +``` + +Notes: + +- both SDK and MCP require the full `DataAsset` object +- minimum polling interval is `5` + +## Update, Permissions, Archive, Delete + +Update metadata: + +- SDK: `client.data_assets.update_metadata(data_asset_id, update_params)` +- REST: `PUT /data_assets/{id}` +- MCP: `update_metadata(data_asset_id, update_params)` + +`DataAssetUpdateParams` fields: + +- `name` +- `description` +- `tags` +- `mount` +- `custom_metadata` + +Permissions: + +- SDK: `get_permissions`, `update_permissions` +- REST: `GET /data_assets/{id}/permissions`, `POST /data_assets/{id}/permissions` + +Archive/delete: + +- SDK: `archive_data_asset(data_asset_id, archive=True)`, `delete_data_asset(data_asset_id)` +- REST: `PATCH /data_assets/{id}/archive?archive=true`, `DELETE /data_assets/{id}` + +## Transfer + +Admin-only transfer method: + +- SDK: `client.data_assets.transfer_data_asset(data_asset_id, transfer_params)` +- REST: `POST 
/data_assets/{id}/transfer` + +`TransferDataParams` fields: + +- `target` +- `force` diff --git a/codeocean/references/errors-http-and-sdk.md b/codeocean/references/errors-http-and-sdk.md new file mode 100644 index 0000000..c6dd2b7 --- /dev/null +++ b/codeocean/references/errors-http-and-sdk.md @@ -0,0 +1,207 @@ +# Errors: HTTP and SDK + +Use this reference when the request itself failed: curl returned a non-2xx response, the SDK raised `codeocean.Error`, or the MCP tool call failed before returning a resource object. + +## 1. Source of Truth + +Official API errors documentation says Code Ocean uses conventional HTTP response codes: + +- `2xx`: success +- `4xx`: request failed given the information provided +- `5xx`: server-side failure + +The official error summaries are: + +- `200 OK`: request succeeded +- `204 No Content`: request succeeded with no body +- `400 Bad Request`: missing required parameter, misspelled field, or bad format +- `401 Unauthorized`: no valid access token provided +- `403 Forbidden`: token lacks permission for the request +- `404 Not Found`: requested resource does not exist +- `429 Too Many Requests`: Computation API may be overloaded; back off before retrying +- `500`, `502`, `503`, `504`: Code Ocean server issue + +Sources: + +- +- + +## 2. What the Python SDK Raises + +The local SDK wraps HTTP failures in `codeocean.Error`. 
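The status-code summaries in section 1 map naturally onto a next step. A minimal sketch of that mapping follows; `next_action` and `backoff_delays` are hypothetical helpers, and the action strings and the backoff base and cap are illustrative choices, not values from the Code Ocean documentation.

```python
# Hypothetical helpers: not part of the codeocean SDK.

def next_action(status_code):
    """Map an HTTP status code to a suggested next step."""
    if status_code == 400:
        return "fix request shape"
    if status_code == 401:
        return "check token"
    if status_code == 403:
        return "check scopes and permissions"
    if status_code == 404:
        return "verify id or path"
    if status_code == 429:
        return "retry with backoff"
    if 500 <= status_code <= 599:
        return "retry later"
    return "no action"


def backoff_delays(attempts, base=5.0, cap=60.0):
    """Return capped exponential backoff delays in seconds (assumed base and cap)."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


print(next_action(429))
print(backoff_delays(4))
```

Keeping the delay schedule separate from the decision makes it easy to reuse the same schedule for polling loops.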
+ +From [`error.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/error.py:1): + +- `Error.status_code`: HTTP status code +- `Error.message`: derived from response JSON `message` if present, otherwise raw response text +- `Error.data`: parsed JSON body if the response body is JSON, otherwise `None` +- `Error.http_err`: underlying `requests.HTTPError` + +Meaning for agents: + +- Prefer `e.message` as the user-facing explanation +- Inspect `e.data` for structured API details +- Use `e.status_code` to choose the next action + +Example: + +```python +from codeocean import CodeOcean, Error + +try: + client = CodeOcean(domain="https://codeocean.acme.com", token="cop_xxxxx") + capsule = client.capsules.get_capsule("bad-id") +except Error as e: + print(e.status_code) + print(e.message) + print(e.data) +``` + +## 3. What Each Error Usually Means + +### `400 Bad Request` + +What it means from the docs: + +- the request shape is invalid +- a required field is missing +- a field name is wrong +- a value is badly formatted + +How to interpret it in practice: + +- wrong JSON shape +- wrong enum value +- using query fields where structured params are required +- sending IDs where an object list is required, or vice versa +- omitting SDK-required fields such as `name`, `tags`, or `mount` in local `DataAssetParams` + +Typical fix: + +- re-check the exact method/tool signature +- compare the body to the model fields in the SDK +- compare attach vs detach payload shapes + +### `401 Unauthorized` + +What it means from the docs: + +- no valid access token provided + +How an agent should explain it: + +- the token is missing, invalid, malformed, or revoked +- in curl, Basic Auth may be wrong +- in MCP/server config, `CODEOCEAN_TOKEN` may not be set correctly + +Typical fix: + +- verify the token exists +- verify Basic Auth uses the token as username and no password +- regenerate the token if it may have been lost or revoked + +### `403 Forbidden` + +What it means from 
the docs: + +- the token does not have permission to perform the request + +How an agent should explain it: + +- the token scope is too narrow +- the user can authenticate, but lacks access to this resource +- the user may have read access where write access is required + +Typical fix: + +- check token scopes +- check resource permissions +- for data assets, ensure Datasets scope is present +- for capsules/computations, ensure Capsule scope is present + +### `404 Not Found` + +What it means from the docs: + +- the requested resource does not exist + +How an agent should explain it: + +- the ID may be wrong +- the path may be wrong +- the resource may exist but not be accessible to this token + +Typical fix: + +- verify the ID came from a fresh search/get response +- verify file paths for `.../results/urls` or `.../files/urls` +- confirm the token can access the resource + +### `429 Too Many Requests` + +What it means from the docs: + +- the Computation API may be overloaded + +How an agent should explain it: + +- the issue is load or rate pressure, not bad input + +Typical fix: + +- retry with backoff +- reduce polling frequency +- avoid tight retry loops + +### `500`, `502`, `503`, `504` + +What it means from the docs: + +- Code Ocean servers have an issue + +How an agent should explain it: + +- the request may be correct, but the platform failed to process it + +Typical fix: + +- retry later +- retry with backoff if the operation is idempotent or safe to repeat +- avoid rewriting the payload unless there is separate evidence the request shape is wrong + +## 4. MCP-Specific Error Meaning + +MCP tool failures can come from three places: + +1. the underlying API returned an HTTP error +2. the SDK raised `codeocean.Error` +3. 
the MCP helper itself failed locally + +Special case from the local MCP server: + +- `download_and_read_a_file_from_computation(...)` +- `download_and_read_a_file_from_data_asset(...)` + +These helpers call a local downloader that catches `requests` exceptions and returns a string starting with `Download error:` instead of raising. + +Meaning: + +- if the tool returns text beginning with `Download error:`, treat it as a transport/download failure, not as successful file content + +## 5. Agent Decision Rules + +When you see an HTTP or SDK error, explain both: + +- what the status code means according to the docs +- what it most likely means in the current request + +Recommended mapping: + +- `400`: request shape problem, field/value mismatch, or missing required input +- `401`: auth/token problem +- `403`: scope/permission problem +- `404`: wrong ID/path or inaccessible resource +- `429`: retry later with backoff +- `5xx`: server/platform problem; retry later + +Do not confuse these with resource-state failures. If the API call succeeded and returned a computation or data asset object, use [errors-resource-states.md](errors-resource-states.md) instead. diff --git a/codeocean/references/errors-resource-states.md b/codeocean/references/errors-resource-states.md new file mode 100644 index 0000000..6fcb5ed --- /dev/null +++ b/codeocean/references/errors-resource-states.md @@ -0,0 +1,199 @@ +# Errors: Resource States + +Use this reference when the API request itself succeeded, but the returned resource is in a failed or unusable state. + +This is different from HTTP/API errors: + +- request failure: use [errors-http-and-sdk.md](errors-http-and-sdk.md) +- returned object in bad state: use this file + +## 1. 
Computation Failures + +Relevant fields from the local SDK `Computation` model: + +- `state` +- `end_status` +- `exit_code` +- `has_results` + +Enums: + +- `state`: `initializing`, `running`, `finalizing`, `completed`, `failed` +- `end_status`: `succeeded`, `failed`, `stopped` + +### How to interpret them + +#### `state=failed` + +Meaning: + +- the run did not complete successfully +- this is already a terminal failure at the computation lifecycle level + +Agent explanation: + +- the API call worked, but Code Ocean reports that the run itself failed + +#### `state=completed` and `end_status=succeeded` + +Meaning: + +- normal successful completion + +#### `state=completed` and `end_status=failed` + +Meaning: + +- the run reached a terminal state, but the job outcome was failure + +Agent explanation: + +- the scheduler/run lifecycle finished, but the actual computation outcome is failure + +#### `state=completed` and `end_status=stopped` + +Meaning: + +- the run did not fail by computation error; it was stopped or terminated + +Agent explanation: + +- the computation ended early because it was stopped, deleted while running, or otherwise interrupted + +#### Non-zero `exit_code` + +Meaning: + +- the executed process ended unsuccessfully + +Agent explanation: + +- the run reached the end of execution, but the underlying process returned a failure code + +### What the agent should do + +If a computation is not successful: + +1. report `state`, `end_status`, and `exit_code` if present +2. distinguish between platform request failure and run failure +3. do not say “the API failed” when the object was returned successfully +4. check `has_results` before assuming outputs exist + +Suggested wording: + +- “The request succeeded, but the computation ended with `end_status=failed`.” +- “The computation object exists, but the run was `stopped`, so results may be incomplete or absent.” + +## 2. 
Data Asset Failures + +Relevant fields from the local SDK `DataAsset` model: + +- `state` +- `failure_reason` +- `files` +- `size` + +Enum: + +- `state`: `draft`, `ready`, `failed` + +### How to interpret them + +#### `state=draft` + +Meaning: + +- creation/indexing is still in progress + +Agent explanation: + +- the data asset exists, but it is not ready for normal file operations yet + +#### `state=ready` + +Meaning: + +- the asset is ready to use + +#### `state=failed` + +Meaning: + +- Code Ocean accepted the creation request, but asset creation failed later + +Agent explanation: + +- this is not a bad API request anymore; it is a failed asynchronous creation outcome +- `failure_reason` is the first field to inspect + +### `failure_reason` + +Meaning: + +- the platform’s explanation for why data asset creation failed + +Agent behavior: + +- if present, surface it directly as the primary explanation +- do not replace it with a generic guess unless it is empty + +Suggested wording: + +- “The create request was accepted, but the data asset later entered `failed` state. `failure_reason` says: ...” + +## 3. Polling Semantics + +The SDK polling helpers reflect resource-state outcomes: + +- `client.computations.wait_until_completed(computation, ...)` +- `client.data_assets.wait_until_ready(data_asset, ...)` + +Important nuance: + +- these methods can return a terminal object that is unsuccessful +- terminal does not always mean successful + +Meaning for agents: + +- after polling, always inspect the returned object +- never assume “wait finished” means “the run/asset succeeded” + +## 4. 
File Access Failures Caused by State + +State-related file problems usually mean: + +- computation has no usable results yet +- data asset is still `draft` +- data asset ended in `failed` +- requested file path does not exist within the returned result/data asset tree + +Agent rule: + +- first verify the resource state +- then verify the path using `list_computation_results(...)` or `list_data_asset_files(...)` + +## 5. Practical Interpretation Guide + +When a resource object is returned, explain the failure in this order: + +1. Was the request itself successful? +2. What terminal or current state is the resource in? +3. Is there a structured reason field such as `failure_reason`? +4. Do results/files actually exist? + +Recommended summaries: + +- `Computation.state=failed`: “The run failed.” +- `Computation.end_status=failed`: “The run finished, but the outcome was failure.” +- `Computation.end_status=stopped`: “The run was stopped before successful completion.” +- `DataAsset.state=draft`: “The asset exists but is still being created/indexed.” +- `DataAsset.state=failed`: “The asset creation request was accepted, but the asset later failed.” + +## 6. 
What Not to Say + +Avoid these incorrect explanations: + +- “The API failed” when a computation/data asset object was returned normally +- “The data asset does not exist” when it is actually in `draft` +- “The file API is broken” before checking whether the resource is ready +- “The token is invalid” for computation/data-asset terminal failures without an HTTP `401` diff --git a/codeocean/references/mcp-guide.md b/codeocean/references/mcp-guide.md new file mode 100644 index 0000000..dd06144 --- /dev/null +++ b/codeocean/references/mcp-guide.md @@ -0,0 +1,91 @@ +# Code Ocean MCP Guide + +## Overview + +The current local MCP server exposes **26** tools: + +- Capsule tools: `search_capsules`, `get_capsule`, `list_computations`, `attach_data_assets`, `detach_data_assets`, `get_capsule_app_panel` +- Pipeline search tool: `search_pipelines` +- Computation tools: `get_computation`, `run_capsule`, `wait_until_completed`, `list_computation_results`, `get_result_file_urls`, `download_and_read_a_file_from_computation`, `rename_computation`, `delete_computation`, `attach_computation_data_assets`, `detach_computation_data_assets` +- Data asset tools: `search_data_assets`, `get_data_asset`, `get_data_asset_file_urls`, `download_and_read_a_file_from_data_asset`, `list_data_asset_files`, `update_metadata`, `wait_until_ready`, `create_data_asset` +- Custom metadata tool: `get_custom_metadata` + +Environment variables from [`server.py`](/Users/drorhilman/codeocean/codeocean-mcp-server/src/codeocean_mcp_server/server.py:1): + +- `CODEOCEAN_DOMAIN` +- `CODEOCEAN_TOKEN` +- `AGENT_ID` with default `"AI Agent"` + +## Core Patterns + +### Find and run a capsule + +1. `search_capsules(search_params={...})` +2. `get_capsule(capsule_id)` if needed +3. `get_capsule_app_panel(capsule_id)` +4. `run_capsule(run_params={...})` +5. `wait_until_completed(computation_id)` +6. `list_computation_results(computation_id)` +7.
`get_result_file_urls(...)` or `download_and_read_a_file_from_computation(...)` + +### Create a data asset and wait for readiness + +1. `create_data_asset(data_asset_params={name, tags, mount, ...})` +2. `wait_until_ready(data_asset_object, polling_interval=5, timeout=None)` + +`wait_until_ready` is an MCP wrapper around the SDK method that first reconstructs a full `DataAsset` object, then calls `client.data_assets.wait_until_ready(...)`. + +### Wait for a computation + +`wait_until_completed(computation_id)` is also a wrapper. It first calls `get_computation(computation_id)`, then passes the returned object into `client.computations.wait_until_completed(...)`. + +### Run a pipeline + +What is supported directly in MCP today: + +1. `search_pipelines(search_params={...})` +2. `run_capsule(run_params={pipeline_id: ..., processes: [...]})` +3. `wait_until_completed(...)` + +Important caveat: + +- The MCP server does **not** expose separate `get_pipeline`, `get_pipeline_app_panel`, `attach_pipeline_data_assets`, or `detach_pipeline_data_assets` tools. +- The MCP tools named `get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, and `detach_data_assets` are implemented via `client.capsules.*`, not `client.pipelines.*`. +- Use SDK or REST when you need guaranteed `/pipelines/...` routing for pipeline inspection or management. + +## Pagination and Compact Search + +MCP search responses have this envelope: + +```json +{ + "items": [...], + "has_more": true, + "next_token": "abc123", + "item_count": 20, + "field_names": {"n": "name", "s": "slug", "d": "description", "t": "tags"} +} +``` + +Notes: + +- `field_names` is present only when `include_field_names=true` +- capsule/pipeline items use `id`, `n`, `s`, `d`, `t` +- data asset items use `id`, `n`, `d`, `t` +- descriptions are truncated to 200 chars +- tags are limited to 10 entries + +Pagination usage: + +1. Call `search_*` with `search_params={query: "...", limit: N}` +2. 
Read `has_more` and `next_token` +3. Call again with the same `search_params` plus `next_token` + +## Anti-patterns + +- Calling `run_capsule` and then immediately trying to read results without `wait_until_completed` +- Passing a data asset ID string into `wait_until_ready`; it requires the full `DataAsset` object +- Passing plain ID strings into `attach_data_assets` or `attach_computation_data_assets`; attach expects objects +- Treating `type`, `origin`, or `ownership` as data-asset query fields; they are structured search params +- Assuming MCP has full pipeline management parity because it has `search_pipelines`; it does not +- Assuming `download_and_read_*` returns whole files; it reads only the first `50_000` bytes diff --git a/codeocean/references/mcp-server-install.md b/codeocean/references/mcp-server-install.md new file mode 100644 index 0000000..0fd8d0d --- /dev/null +++ b/codeocean/references/mcp-server-install.md @@ -0,0 +1,92 @@ +# MCP Server Installation + +Package name from local `pyproject.toml`: `codeocean-mcp-server` + +Python requirement from local `pyproject.toml`: `>=3.10` + +Recommended launcher from the local README: `uvx codeocean-mcp-server` + +## Basic Config Shape + +All client configs pass these env vars: + +- `CODEOCEAN_DOMAIN` +- `CODEOCEAN_TOKEN` +- optional `AGENT_ID` + +## Claude Desktop + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +## Visual Studio Code + +```json +{ + "mcp": { + "inputs": [ + { + "type": "promptString", + "id": "codeocean-token", + "description": "Code Ocean API Key", + "password": true + } + ], + "servers": { + "codeocean": { + "type": "stdio", + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "${input:codeocean-token}", +
"AGENT_ID": "VS Code" + } + } + } + } +} +``` + +## Cline / Roo Code / Cursor / Windsurf + +The local README uses the same executable and env vars for all of them: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "", + "AGENT_ID": "Client Name" + } + } + } +} +``` + +## Local Testing + +The local README’s inspector command is: + +```bash +npx @modelcontextprotocol/inspector uv tool run codeocean-mcp-server +``` + +Set `CODEOCEAN_DOMAIN` and `CODEOCEAN_TOKEN` in the environment before running it. diff --git a/codeocean/references/mcp-tools-catalog.md b/codeocean/references/mcp-tools-catalog.md new file mode 100644 index 0000000..504803f --- /dev/null +++ b/codeocean/references/mcp-tools-catalog.md @@ -0,0 +1,210 @@ +# MCP Tools Catalog + +Complete catalog of the **26** MCP tools exposed by the current local Code Ocean MCP server. + +## 1. Capsule Search and Management + +### `search_capsules` + +- Params: + - `search_params` + - `include_field_names` +- `search_params` fields: `query`, `next_token`, `offset`, `limit`, `sort_field`, `sort_order`, `ownership`, `status`, `favorite`, `archived`, `filters` +- Returns: `{items, has_more, next_token, item_count, field_names?}` + +### `search_pipelines` + +- Params: + - `search_params` + - `include_field_names` +- Uses the same `CapsuleSearchParams` shape as `search_capsules` +- Returns the same compact envelope as `search_capsules` + +### `get_capsule` + +- Params: + - `capsule_id` +- Returns: full `Capsule` +- Current implementation calls `client.capsules.get_capsule(capsule_id)` + +### `list_computations` + +- Params: + - `capsule_id` +- Returns: `list[Computation]` +- Current implementation calls `client.capsules.list_computations(capsule_id)` + +### `attach_data_assets` + +- Params: + - `capsule_id` + - `attach_params` +- `attach_params`: `list[{id, mount?}]` +- Returns:
`list[DataAssetAttachResults]` with fields such as `id`, `mount_state`, `job_id`, `external`, `ready`, `mount` + +### `detach_data_assets` + +- Params: + - `capsule_id` + - `data_assets` +- `data_assets`: `list[str]` +- Returns: `None` + +### `get_capsule_app_panel` + +- Params: + - `capsule_id` + - `version` +- Returns: `AppPanel` +- Current implementation calls `client.capsules.get_capsule_app_panel(...)` + +## 2. Computations + +### `get_computation` + +- Params: + - `computation_id` +- Returns: `Computation` + +### `run_capsule` + +- Params: + - `run_params` +- `run_params` fields: `capsule_id`, `pipeline_id`, `version`, `resume_run_id`, `nextflow_profile`, `data_assets`, `parameters`, `named_parameters`, `processes` +- Returns: `Computation` + +### `wait_until_completed` + +- Params: + - `computation_id` +- Returns: terminal-state `Computation` +- Wrapper behavior: fetches the computation object first, then passes it to the SDK polling method + +### `list_computation_results` + +- Params: + - `computation_id` +- Returns: `Folder` + +### `get_result_file_urls` + +- Params: + - `computation_id` + - `file_path` +- Returns: `FileURLs {download_url, view_url}` + +### `download_and_read_a_file_from_computation` + +- Params: + - `computation_id` + - `file_path` +- Returns: decoded file content string +- Reads the first `50_000` bytes from the remote response + +### `rename_computation` + +- Params: + - `computation_id` + - `name` +- Returns: `None` + +### `delete_computation` + +- Params: + - `computation_id` +- Returns: `None` + +### `attach_computation_data_assets` + +- Params: + - `computation_id` + - `attach_params` +- `attach_params`: `list[{id, mount?}]` +- Returns: `list[DataAssetAttachResults]` + +### `detach_computation_data_assets` + +- Params: + - `computation_id` + - `data_assets` +- `data_assets`: `list[str]` +- Returns: `None` + +## 3. 
Data Assets + +### `search_data_assets` + +- Params: + - `search_params` + - `include_field_names` +- `search_params` fields: `query`, `next_token`, `offset`, `limit`, `sort_field`, `sort_order`, `type`, `ownership`, `origin`, `favorite`, `archived`, `filters` +- Returns: `{items, has_more, next_token, item_count, field_names?}` + +### `get_data_asset` + +- Params: + - `data_asset_id` +- Returns: full `DataAsset` + +### `get_data_asset_file_urls` + +- Params: + - `data_asset_id` + - `file_path` +- Returns: `FileURLs {download_url, view_url}` + +### `download_and_read_a_file_from_data_asset` + +- Params: + - `data_asset_id` + - `file_path` +- Returns: decoded file content string +- Reads the first `50_000` bytes from the remote response + +### `list_data_asset_files` + +- Params: + - `data_asset_id` + - `path` +- Returns: `Folder` + +### `update_metadata` + +- Params: + - `data_asset_id` + - `update_params` +- `update_params` fields: `name`, `description`, `tags`, `mount`, `custom_metadata` +- Returns: `DataAsset` + +### `wait_until_ready` + +- Params: + - `data_asset` + - `polling_interval` + - `timeout` +- Returns: terminal-state `DataAsset` +- `data_asset` must be the full object, not an ID string + +### `create_data_asset` + +- Params: + - `data_asset_params` +- `data_asset_params` fields: `name`, `tags`, `mount`, `description`, `source`, `target`, `custom_metadata`, `data_asset_ids`, `results_info` +- Returns: `DataAsset` + +## 4. Custom Metadata + +### `get_custom_metadata` + +- Params: none +- Returns: `CustomMetadata` + +## 5. Pipeline Caveat + +The MCP server has only one pipeline-specific tool name: `search_pipelines`. + +- There is no separate MCP `get_pipeline` +- There is no separate MCP `get_pipeline_app_panel` +- There are no separate MCP pipeline attach/detach tools + +For exact `/pipelines/...` operations, use the SDK or REST guides. 
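As a rough illustration of the SDK fallback, the sketch below fetches a pipeline and its app panel through `client.pipelines`. It assumes `client` is a configured `CodeOcean` instance and that `get_pipeline` and `get_pipeline_app_panel` accept the pipeline ID as their first argument; `inspect_pipeline` and the stand-in client are hypothetical and exist only so the example runs offline.

```python
from types import SimpleNamespace


def inspect_pipeline(client, pipeline_id):
    """Fetch a pipeline and its app panel through the SDK's pipelines client.

    Assumes both SDK methods take the pipeline ID as their first argument.
    """
    pipeline = client.pipelines.get_pipeline(pipeline_id)
    app_panel = client.pipelines.get_pipeline_app_panel(pipeline_id)
    return pipeline, app_panel


# Stand-in client so the sketch runs without network access or credentials.
stub = SimpleNamespace(
    pipelines=SimpleNamespace(
        get_pipeline=lambda pid: {"id": pid},
        get_pipeline_app_panel=lambda pid: {"pipeline": pid},
    )
)
print(inspect_pipeline(stub, "pipeline-uuid"))
```

With a real client, the same function would go through the `/pipelines/...` routes instead of the capsule routes that the MCP tools use.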
diff --git a/codeocean/references/permissions.md b/codeocean/references/permissions.md new file mode 100644 index 0000000..9346862 --- /dev/null +++ b/codeocean/references/permissions.md @@ -0,0 +1,93 @@ +# Permissions + +Permissions are available in the SDK and REST API, not in the MCP server. + +## Model + +Types from [`models/components.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/models/components.py:1): + +- `Permissions` +- `UserPermissions` +- `GroupPermissions` +- `UserRole` +- `GroupRole` +- `EveryoneRole` + +`Permissions` fields: + +- `users` +- `groups` +- `everyone` +- `share_assets` + +Field meanings: + +- `users`: list of `{email, role}` +- `groups`: list of `{group, role}` +- `everyone`: org-wide access level +- `share_assets`: whether related assets are shared too + +Role enums: + +- `UserRole`: `owner`, `editor`, `viewer` +- `GroupRole`: `owner`, `editor`, `viewer`, `discoverable` +- `EveryoneRole`: `viewer`, `discoverable`, `none` + +## SDK Methods + +Capsules: + +- `client.capsules.get_permissions(capsule_id)` +- `client.capsules.update_permissions(capsule_id, permissions)` + +Pipelines: + +- `client.pipelines.get_permissions(pipeline_id)` +- `client.pipelines.update_permissions(pipeline_id, permissions)` + +Data assets: + +- `client.data_assets.get_permissions(data_asset_id)` +- `client.data_assets.update_permissions(data_asset_id, permissions)` + +## REST Routes + +- `GET /capsules/{id}/permissions` +- `POST /capsules/{id}/permissions` +- `GET /pipelines/{id}/permissions` +- `POST /pipelines/{id}/permissions` +- `GET /data_assets/{id}/permissions` +- `POST /data_assets/{id}/permissions` + +## Example + +```python +from codeocean.models.components import ( + Permissions, + UserPermissions, + GroupPermissions, + UserRole, + GroupRole, + EveryoneRole, +) + +permissions = Permissions( + users=[UserPermissions(email="user@example.com", role=UserRole.Editor)], + groups=[GroupPermissions(group="research-team", 
role=GroupRole.Viewer)], + everyone=EveryoneRole.Discoverable, + share_assets=True, +) + +client.capsules.update_permissions(capsule_id, permissions) +``` + +Equivalent JSON: + +```json +{ + "users": [{"email": "user@example.com", "role": "editor"}], + "groups": [{"group": "research-team", "role": "viewer"}], + "everyone": "discoverable", + "share_assets": true +} +``` diff --git a/codeocean/references/pipelines.md b/codeocean/references/pipelines.md new file mode 100644 index 0000000..b965057 --- /dev/null +++ b/codeocean/references/pipelines.md @@ -0,0 +1,104 @@ +# Pipelines Reference + +## Overview + +The SDK has a dedicated `Pipelines` client in [`pipeline.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/pipeline.py:1). + +Supported SDK methods: + +- `get_pipeline` +- `delete_pipeline` +- `get_pipeline_app_panel` +- `list_computations` +- `get_permissions` +- `update_permissions` +- `attach_data_assets` +- `detach_data_assets` +- `archive_pipeline` +- `search_pipelines` +- `search_pipelines_iterator` + +These methods route through `/pipelines/...` via a `Capsules(..., _route="pipelines")` helper. 
+ +## SDK and REST Operations + +| Operation | SDK | REST | +|---|---|---| +| Search | `client.pipelines.search_pipelines(...)` | `POST /pipelines/search` | +| Get | `client.pipelines.get_pipeline(id)` | `GET /pipelines/{id}` | +| App panel | `client.pipelines.get_pipeline_app_panel(id)` | `GET /pipelines/{id}/app_panel` | +| List computations | `client.pipelines.list_computations(id)` | `GET /pipelines/{id}/computations` | +| Permissions get/update | `get_permissions` / `update_permissions` | `GET` / `POST /pipelines/{id}/permissions` | +| Attach/detach data assets | `attach_data_assets` / `detach_data_assets` | `POST` / `DELETE /pipelines/{id}/data_assets` | +| Archive | `archive_pipeline` | `PATCH /pipelines/{id}/archive?archive=true` | +| Delete | `delete_pipeline` | `DELETE /pipelines/{id}` | + +## MCP Caveat + +The MCP server does **not** expose separate pipeline getter/app-panel/attach/detach tools. + +What MCP supports directly: + +- `search_pipelines(...)` +- `run_capsule(run_params={pipeline_id: ...})` + +The MCP tools named `get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, and `detach_data_assets` are implemented against `client.capsules.*`, not `client.pipelines.*`. Do not document them as guaranteed pipeline tools. + +## Running Pipelines + +Pipelines run through `client.computations.run_capsule(...)` with `pipeline_id`. 
+ +Relevant `RunParams` fields: + +- `pipeline_id` +- `version` +- `resume_run_id` +- `nextflow_profile` +- `data_assets` +- `processes` + +`PipelineProcessParams` fields: + +- `name` +- `parameters` +- `named_parameters` + +SDK example: + +```python +from codeocean.computation import RunParams, DataAssetsRunParam, PipelineProcessParams, NamedRunParam + +computation = client.computations.run_capsule( + RunParams( + pipeline_id="pipeline-uuid", + data_assets=[DataAssetsRunParam(id="data-uuid", mount="Reference")], + processes=[ + PipelineProcessParams(name="process1", parameters=["val1"]), + PipelineProcessParams( + name="process2", + named_parameters=[NamedRunParam(param_name="threshold", value="0.5")], + ), + ], + ) +) +``` + +REST example: + +```bash +curl -u "$TOKEN:" \ + -H "Content-Type: application/json" \ + -d '{ + "pipeline_id":"pipeline-uuid", + "data_assets":[{"id":"data-uuid","mount":"Reference"}], + "processes":[ + {"name":"process1","parameters":["val1"]}, + {"name":"process2","named_parameters":[{"param_name":"threshold","value":"0.5"}]} + ] + }' \ + "$DOMAIN/api/v1/computations" +``` + +## App Panel Process Guidance + +Pipeline app panels can include `processes`, represented by `AppPanelProcess`, to describe per-process categories and parameters. Use the SDK or REST app-panel routes for reliable pipeline inspection. diff --git a/codeocean/references/sdk-guide.md b/codeocean/references/sdk-guide.md new file mode 100644 index 0000000..189170a --- /dev/null +++ b/codeocean/references/sdk-guide.md @@ -0,0 +1,228 @@ +# Code Ocean Python SDK Guide + +## Installation + +Public docs: `pip install -U codeocean` + +Compatibility note: + +- Public docs currently say Python `>=3.11` and Code Ocean `>=2.19` +- Local `pyproject.toml` declares Python `>=3.9` +- Local client code sends `Min-Server-Version: 4.3.0` + +Document the split instead of collapsing it. 
+ +## Client Setup + +```python +from codeocean import CodeOcean + +client = CodeOcean( + domain="https://codeocean.acme.com", + token="YOUR_API_TOKEN", + retries=0, + agent_id="my-agent", # optional +) +``` + +Client properties from [`client.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/client.py:1): + +- `client.capsules` +- `client.pipelines` +- `client.computations` +- `client.data_assets` +- `client.custom_metadata` + +For SDK exception handling and error interpretation, also load [errors-http-and-sdk.md](errors-http-and-sdk.md). + +## Imports Reference + +```python +from codeocean import CodeOcean +``` + +```python +from codeocean.capsule import CapsuleSearchParams +``` + +```python +from codeocean.computation import ( + RunParams, + DataAssetsRunParam, + NamedRunParam, + PipelineProcessParams, +) +``` + +```python +from codeocean.data_asset import ( + DataAssetParams, + DataAssetSearchParams, + DataAssetUpdateParams, + TransferDataParams, + Source, + AWSS3Source, + GCPCloudStorageSource, + ComputationSource, + CloudWorkstationSource, + Target, + AWSS3Target, +) +``` + +Permissions imports come from `codeocean.models.components`: + +```python +from codeocean.models.components import ( + Permissions, + UserPermissions, + GroupPermissions, + UserRole, + GroupRole, + EveryoneRole, +) +``` + +SDK error type: + +```python +from codeocean import Error +``` + +## Pagination + +Search result objects expose: + +- `.results` +- `.has_more` +- `.next_token` + +Iterator helpers: + +- `client.capsules.search_capsules_iterator(...)` +- `client.pipelines.search_pipelines_iterator(...)` +- `client.data_assets.search_data_assets_iterator(...)` + +Manual pagination: + +```python +from codeocean.capsule import CapsuleSearchParams + +params = CapsuleSearchParams(query="tag:genomics", limit=20) +results = client.capsules.search_capsules(params) + +for capsule in results.results: + print(capsule.id, capsule.name) + +while results.has_more: + params = 
CapsuleSearchParams(query="tag:genomics", limit=20, next_token=results.next_token) + results = client.capsules.search_capsules(params) + for capsule in results.results: + print(capsule.id, capsule.name) +``` + +## Polling + +The SDK polling methods take full model objects, not IDs. + +```python +completed = client.computations.wait_until_completed( + computation, + polling_interval=5, + timeout=300, +) +``` + +```python +ready_asset = client.data_assets.wait_until_ready( + data_asset, + polling_interval=5, + timeout=300, +) +``` + +Both methods enforce a minimum `polling_interval` of `5`. + +If polling returns a terminal object in a bad state, that is not the same as an HTTP exception. Use [errors-resource-states.md](errors-resource-states.md) to interpret: + +- `Computation.state`, `Computation.end_status`, `Computation.exit_code` +- `DataAsset.state`, `DataAsset.failure_reason` + +## Example: Run Capsule, Wait, Create Data Asset + +```python +import os + +from codeocean import CodeOcean +from codeocean.computation import RunParams, DataAssetsRunParam +from codeocean.data_asset import DataAssetParams, Source, ComputationSource + +client = CodeOcean( + domain=os.environ["CODEOCEAN_DOMAIN"], + token=os.environ["CODEOCEAN_TOKEN"], +) + +computation = client.computations.run_capsule( + RunParams( + capsule_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890", + parameters=["hg38", "8"], + data_assets=[DataAssetsRunParam(id="d1e2f3a4-b5c6-7890-abcd-ef1234567890", mount="input_data")], + ) +) + +completed = client.computations.wait_until_completed(computation, timeout=3600) + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="RNA-seq Analysis Results", + mount="rna_seq_results", + tags=["rna-seq", "results", "automated"], + description=f"Output of computation {completed.id}", + source=Source(computation=ComputationSource(id=completed.id)), + ) +) + +ready_asset = client.data_assets.wait_until_ready(data_asset, timeout=600) +print(ready_asset.id, 
ready_asset.state) +``` + +## Example: Create Data Asset from S3 + +```python +from codeocean.data_asset import DataAssetParams, Source, AWSS3Source + +data_asset = client.data_assets.create_data_asset( + DataAssetParams( + name="Reference Genome hg38", + mount="reference_genome", + tags=["reference", "genome", "hg38"], + description="Human reference genome GRCh38 from S3", + source=Source( + aws=AWSS3Source(bucket="my-genomics-bucket", prefix="reference/hg38/"), + ), + ) +) +``` + +## Example: Run Pipeline + +```python +from codeocean.computation import RunParams, DataAssetsRunParam, PipelineProcessParams, NamedRunParam + +computation = client.computations.run_capsule( + RunParams( + pipeline_id="p1q2r3s4-t5u6-7890-abcd-ef1234567890", + data_assets=[DataAssetsRunParam(id="d1e2f3a4-b5c6-7890-abcd-ef1234567890", mount="raw_data")], + processes=[ + PipelineProcessParams(name="alignment", parameters=["STAR", "16"]), + PipelineProcessParams( + name="quantification", + named_parameters=[NamedRunParam(param_name="method", value="salmon")], + ), + ], + ) +) + +completed = client.computations.wait_until_completed(computation, timeout=7200) +print(completed.id, completed.state, completed.end_status) +``` diff --git a/codeocean/references/search-and-pagination.md b/codeocean/references/search-and-pagination.md new file mode 100644 index 0000000..b55dbca --- /dev/null +++ b/codeocean/references/search-and-pagination.md @@ -0,0 +1,155 @@ +# Search and Pagination + +## Query Syntax + +Free text matches weighted fields. + +`field:value` filters are also supported. 
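A minimal sketch of composing a query string from `field:value` filters, under the rules this section lists (same field repeated = OR, different fields = AND, quotes for exact phrases). The helper name `build_query` is hypothetical, and joining terms with a single space and quoting only values that contain whitespace are assumptions, not confirmed SDK behavior.

```python
from typing import Iterable, Tuple


def build_query(filters: Iterable[Tuple[str, str]], free_text: str = "") -> str:
    """Join (field, value) pairs into one query string (illustrative helper only)."""
    terms = []
    for field, value in filters:
        # Assumption: a value containing whitespace is quoted so it reads as one exact phrase.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        terms.append(f"{field}:{value}")
    if free_text:
        # Free text is appended as-is; the server matches it against weighted fields.
        terms.append(free_text)
    # Assumption: terms are separated by a single space.
    return " ".join(terms)
```

The resulting string would be passed as the `query` field of the search params; structured fields such as `type`, `origin`, and `ownership` stay outside the query string.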
+ +Rules from the local SDK model metadata: + +- same field repeated = OR +- different fields = AND +- quotes for exact phrases +- no explicit `OR` +- no wildcards +- case insensitive + +Capsule query fields: + +- `id` +- `name` +- `doi` +- `tag` +- `field` +- `affiliation` +- `journal` +- `article` +- `author` + +Data asset query fields: + +- `name` +- `tag` +- `run_script` +- `commit_id` +- `contained_data_id` + +Important: `type`, `origin`, and `ownership` are structured params, not query fields. + +## CapsuleSearchParams + +Fields: + +- `query` +- `next_token` +- `offset` +- `limit` +- `sort_field` +- `sort_order` +- `ownership` +- `status` +- `favorite` +- `archived` +- `filters` + +Defaults and limits from model metadata: + +- default `limit`: `100` +- max `limit`: `1000` + +## DataAssetSearchParams + +Fields: + +- all pagination fields above +- `type` +- `ownership` +- `origin` +- `favorite` +- `archived` +- `filters` + +Data asset sort fields: + +- `created` +- `type` +- `name` +- `size` + +## SearchFilter + +`SearchFilter` fields: + +- `key` +- `value` +- `values` +- `range` +- `exclude` + +`range` uses `SearchFilterRange` with: + +- `min` +- `max` + +## SDK Pagination + +Search result objects expose: + +- `.results` +- `.has_more` +- `.next_token` + +Iterator helpers: + +- `search_capsules_iterator` +- `search_pipelines_iterator` +- `search_data_assets_iterator` + +Manual example: + +```python +from codeocean.capsule import CapsuleSearchParams + +params = CapsuleSearchParams(query="RNA-seq", limit=100) +results = client.capsules.search_capsules(params) + +for capsule in results.results: + process(capsule) + +while results.has_more: + params = CapsuleSearchParams(query="RNA-seq", limit=100, next_token=results.next_token) + results = client.capsules.search_capsules(params) + for capsule in results.results: + process(capsule) +``` + +## MCP Compact Search + +Search envelope: + +- `items` +- `has_more` +- `next_token` +- `item_count` +- optional `field_names` 
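The envelope fields above are enough to drive a paging loop. The following is a rough sketch only: `search_tool` is a hypothetical callable standing in for whatever mechanism invokes an MCP search tool, and it is assumed to accept a dict of search params and return one envelope as a dict.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: takes search params, returns one compact envelope.
SearchTool = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_search_items(search_tool: SearchTool, query: str, limit: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages by following `next_token`."""
    next_token: Optional[str] = None
    while True:
        params: Dict[str, Any] = {"query": query, "limit": limit}
        if next_token:
            params["next_token"] = next_token
        envelope = search_tool(params)
        yield from envelope.get("items", [])
        next_token = envelope.get("next_token")
        # Stop when no further page is reported, or when there is no token to continue with.
        if not envelope.get("has_more") or not next_token:
            break
```

Checking both `has_more` and `next_token` is a defensive choice so the loop cannot spin on a response that reports more results without a token.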
+ +Capsule/pipeline item abbreviations: + +- `id` +- `n` = `name` +- `s` = `slug` +- `d` = `description` +- `t` = `tags` + +Data asset item abbreviations: + +- `id` +- `n` = `name` +- `d` = `description` +- `t` = `tags` + +Truncation behavior from `search.py`: + +- descriptions normalized and truncated to `200` chars with `"...(more)"` +- tags limited to `10` entries, with truncation marker `"..more.."` if needed diff --git a/codeocean/references/setup-and-auth.md b/codeocean/references/setup-and-auth.md new file mode 100644 index 0000000..4af40f7 --- /dev/null +++ b/codeocean/references/setup-and-auth.md @@ -0,0 +1,87 @@ +# Setup and Authentication + +## Generate an API Token + +From the public authentication guide: + +1. Sign in to Code Ocean +2. Open `Account` +3. Open `Access Tokens` +4. Click `Generate New Token` +5. Provide a token name +6. Select scopes +7. Click `Add Token` +8. Copy the token immediately, or use `Copy Token & Create Secret` +9. Click `Save Changes` + +The public docs explicitly note that the token is shown only once at creation time. + +## Environment Variables + +SDK and MCP commonly use: + +```bash +export CODEOCEAN_DOMAIN="https://codeocean.acme.com" +export CODEOCEAN_TOKEN="cop_xxxxx" +export AGENT_ID="my-agent" +``` + +Source of truth: + +- `CODEOCEAN_DOMAIN` and `CODEOCEAN_TOKEN` are required by MCP `server.py` +- `AGENT_ID` is optional in MCP `server.py`, defaulting to `"AI Agent"` +- `CodeOcean(...)` accepts `agent_id` and sends it as an `Agent-Id` header when provided + +## Authentication by Access Method + +REST: + +```bash +curl -u "$CODEOCEAN_TOKEN:" \ + "$CODEOCEAN_DOMAIN/api/v1/capsules/$CAPSULE_ID" +``` + +The public auth guide states that Basic Auth uses the token as the username and no password. 
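For clients that cannot use `curl -u`, the same Basic Auth scheme can be sketched with the Python standard library. The helper name `basic_auth_header` is illustrative and not part of the SDK; the encoding itself is standard HTTP Basic Auth with the token as username and an empty password.

```python
import base64


def basic_auth_header(token: str) -> dict:
    """Build the header equivalent to `curl -u "$TOKEN:"` (token as username, empty password)."""
    # Basic Auth encodes "username:password"; the password part is left empty here.
    credentials = f"{token}:".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

The returned dict can be merged into the headers of any HTTP client request against `$CODEOCEAN_DOMAIN/api/v1/...`.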
+ +SDK: + +```python +from codeocean import CodeOcean + +client = CodeOcean( + domain="https://codeocean.acme.com", + token="cop_xxxxx", + agent_id="my-agent", # optional +) +``` + +The SDK does not auto-read env vars; pass them explicitly if you store them in the environment. + +MCP: + +```json +{ + "mcpServers": { + "codeocean": { + "command": "uvx", + "args": ["codeocean-mcp-server"], + "env": { + "CODEOCEAN_DOMAIN": "https://codeocean.acme.com", + "CODEOCEAN_TOKEN": "cop_xxxxx", + "AGENT_ID": "Claude Desktop" + } + } + } +} +``` + +## Compatibility + +The sources currently diverge: + +- Public Python SDK docs say Python `>=3.11` and Code Ocean `>=2.19` +- Local SDK `pyproject.toml` says Python `>=3.9` +- Local SDK client sets `Min-Server-Version: 4.3.0` +- Local MCP `pyproject.toml` says Python `>=3.10` + +Keep this discrepancy explicit whenever you mention compatibility. diff --git a/codeocean/references/user-guide/capsule.md b/codeocean/references/user-guide/capsule.md new file mode 100644 index 0000000..8652779 --- /dev/null +++ b/codeocean/references/user-guide/capsule.md @@ -0,0 +1,12 @@ +# Capsule + +A Capsule is Code Ocean's fundamental project unit. It bundles the code, data, environment (OS, packages, libraries, dependencies), and results needed to run and share a research workflow reproducibly. + +Every Capsule has seven permanent folders: Metadata, Environment, Code, Data, .codeocean, Scratch, and Results. The Capsule IDE is divided into three panels: File Navigation/App Builder, Editor, and Reproducibility. + +Use this mental model: a Capsule is the place where code is developed and where a reproducible run or cloud workstation session happens. 
+ +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/cloud-workstation.md b/codeocean/references/user-guide/cloud-workstation.md new file mode 100644 index 0000000..f962b92 --- /dev/null +++ b/codeocean/references/user-guide/cloud-workstation.md @@ -0,0 +1,12 @@ +# Cloud Workstation + +A Cloud Workstation is an interactive IDE session launched inside a Capsule's compute environment. The system starts an EC2 machine and launches a Docker container using the Capsule's environment. Supported IDEs include JupyterLab, RStudio, and VS Code; the appropriate IDE package is installed automatically at launch if not already present. + +Key folders available in a Cloud Workstation: `/code`, `/data`, `/results`, `/metadata`, `/environment`, `/scratch`, and full root filesystem access. Reproducible Runs also get `/code`, `/data`, `/results`, and `/scratch`, but not `/metadata`, `/environment`, or root filesystem access. + +Use this mental model: a Cloud Workstation is interactive development mode for a Capsule, distinct from a headless reproducible run. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/computation.md b/codeocean/references/user-guide/computation.md new file mode 100644 index 0000000..c8fb7f3 --- /dev/null +++ b/codeocean/references/user-guide/computation.md @@ -0,0 +1,12 @@ +# Computation + +A Computation is a single run record for a Capsule or Pipeline. It is the object returned when you trigger a run and later used to track status, inspect results, and fetch output files. The `state` field progresses through `initializing`, `running`, `finalizing`, and `completed`. A separate `end_status` field indicates the outcome: `succeeded`, `failed`, or `stopped`. + +Each Reproducible Run and Cloud Workstation session creates a Computation. Results from completed computations can be captured as Data Assets. + +Use this mental model: a Computation is the execution instance, not the project itself. 
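The split between `state` and `end_status` can be sketched as a small interpreter. This is an illustration built only on the values named above; the function name `summarize` is hypothetical, and a real `Computation` may carry states this sketch does not list, which is why unknown values are reported rather than rejected.

```python
from typing import Optional

# States in the order the guide lists them; only the last one is terminal.
STATES = ("initializing", "running", "finalizing", "completed")


def summarize(state: str, end_status: Optional[str]) -> str:
    """Describe a computation from its `state` and `end_status` values."""
    if state not in STATES:
        return f"unknown state: {state}"
    if state != "completed":
        return f"in progress ({state})"
    # `completed` only says the run ended; `end_status` says how it ended.
    if end_status == "succeeded":
        return "finished successfully"
    if end_status in ("failed", "stopped"):
        return f"finished without success ({end_status})"
    return "completed, but end_status is missing or unrecognized"
```

The point of the sketch is that `state == "completed"` alone does not mean success; `end_status` has to be checked as well.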
+ +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/data-asset.md b/codeocean/references/user-guide/data-asset.md new file mode 100644 index 0000000..912d561 --- /dev/null +++ b/codeocean/references/user-guide/data-asset.md @@ -0,0 +1,12 @@ +# Data Asset + +A Data Asset is shared, versioned storage that can be attached to Capsules or Pipelines. Data Assets are mounted read-only into compute containers rather than copied, which improves sharing and performance. Types include datasets, results (captured from a computation), combined assets, and models. + +Data Assets are backed by independent cloud storage (AWS S3 or EFS with intelligent tiering). They can be created from uploaded files, external cloud storage (S3, GCS), or by capturing computation results. The `type` field is one of: `dataset`, `result`, `combined`, or `model`. They support custom metadata for organization and discovery. + +Use this mental model: a Data Asset is durable input/output storage, not executable code. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/pipeline.md b/codeocean/references/user-guide/pipeline.md new file mode 100644 index 0000000..e2776f1 --- /dev/null +++ b/codeocean/references/user-guide/pipeline.md @@ -0,0 +1,12 @@ +# Pipeline + +A Pipeline is a multi-step workflow that connects Capsules and Data Assets into reusable stages. It enables separating workflow stages, automating downstream steps, setting compute resources per step, and parallelizing work. Pipelines are backed by Nextflow scripts that Code Ocean generates and manages. + +Pipeline components include capsule steps (each referencing a Capsule), data asset connections between steps, and per-step parameter and resource configuration. Connection types between steps are: Default (items distributed to parallel instances), Collect (entire dataset available to all instances), and Flatten (each item goes to a separate parallel instance). 
+ +Use this mental model: a Pipeline orchestrates multiple Capsules; it is not where code is primarily developed. + +Primary sources: + +- +- diff --git a/codeocean/references/user-guide/reproducible-run.md b/codeocean/references/user-guide/reproducible-run.md new file mode 100644 index 0000000..824142a --- /dev/null +++ b/codeocean/references/user-guide/reproducible-run.md @@ -0,0 +1,12 @@ +# Reproducible Run + +A Reproducible Run is Code Ocean's headless execution mode for a Capsule. It executes the Capsule's `run` file end to end without manual input, producing results consistently. The run executes inside a Docker container with access to `/code`, `/data`, `/results`, and `/scratch` folders. Unlike Cloud Workstations, Reproducible Runs do not expose `/metadata`, `/environment`, or root filesystem access. + +Each Reproducible Run creates a Computation and a results timeline entry. Results written to `/results` are preserved and can be captured as a Data Asset. + +Use this mental model: a Reproducible Run is the standard automated execution path, distinct from interactive Cloud Workstation sessions. + +Primary sources: + +- +- From b9bc4f2285f802dee91c3278de3523f51d673ade Mon Sep 17 00:00:00 2001 From: Dror Hilman Date: Wed, 22 Apr 2026 12:12:10 +0200 Subject: [PATCH 2/3] lint --- README.md | 30 +++++------ codeocean/SKILL.md | 81 +++++++++++++++-------------- codeocean/references/permissions.md | 4 +- codeocean/references/pipelines.md | 20 +++---- 4 files changed, 69 insertions(+), 66 deletions(-) diff --git a/README.md b/README.md index 14f0ce9..a55e26b 100644 --- a/README.md +++ b/README.md @@ -61,21 +61,21 @@ See `codeocean/references/mcp-server-install.md` for configs for VS Code, Cline, - **Install the entire skill folder**, not just `SKILL.md` — supporting files (references, templates, examples) are referenced by the entry point. - The fastest cross-agent option is [`gh skill`](#github-cli-universal-installer) if available. 
-| Agent | Install method | Exact path / command | Notes | -|-------|---------------|---------------------|-------| -| **Claude Code** | Manual folder copy | Project: `.claude/skills//`
User: `~/.claude/skills//`
Plugin: `/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). Claude watches these directories automatically. | -| **Codex** | A. Manual folder copy
B. Plugin install | A. `$CWD/.agents/skills//`
`$REPO_ROOT/.agents/skills//`
`$HOME/.agents/skills//`
`/etc/codex/skills//`
B. In-app: add from plugin directory
CLI: `/plugins` → Install plugin | Clone/download from `codeocean/skills`, copy the folder at ``. Codex supports skills natively; packaged distribution is often via plugins. | -| **Cursor** | Plugin-first | Install plugin from marketplace / team marketplace | No official direct raw GitHub skill install documented. Use `gh skill` row below for GitHub-based install. | -| **OpenCode** | Manual folder copy | `.opencode/skills//`
`~/.config/opencode/skills//`
Also compatible:
`.claude/skills//`
`~/.claude/skills//`
`.agents/skills//`
`~/.agents/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). | -| **Antigravity** | Manual folder copy | `.agents/skills//`
`~/.gemini/antigravity/skills//` | Defaults to `.agents/skills`. Copy full folder. | -| **Windsurf** | Manual folder copy | `.windsurf/skills//`
`~/.codeium/windsurf/skills//`
Enterprise: macOS `/Library/Application Support/Windsurf/skills/`, Linux/WSL `/etc/windsurf/skills/`, Windows `C:\ProgramData\Windsurf\skills\` | Each skill is a subdirectory containing `SKILL.md`. Copy full folder. | -| **GitHub Copilot CLI** | Manual folder copy | Project: `.github/skills//`, `.claude/skills//`, `.agents/skills//`
Personal: `~/.copilot/skills//`, `~/.claude/skills//`, `~/.agents/skills//` | Clone/download from `codeocean/skills`, copy the full folder at ``. | -| **VS Code / Copilot agent plugins** | Plugin from Git source | Run `Chat: Install Plugin From Source` → enter Git repo URL | This is for **plugins**, not raw skill folders. Applies only if the skill is wrapped as a plugin. Does not apply to raw skill repos like `codeocean/skills`. | -| **Gemini CLI** | Native GitHub install | `gemini skills install https://github.com/codeocean/skills.git --path `
`gemini skills install /path/to/local/ --scope workspace`
`gemini skills link /path/to/local/ --scope workspace` | Supports Git repo, local dir, zipped `.skill`, monorepo subpath, workspace/user scope. Use `--path` for monorepo subpath. | -| **Cline** | Manual folder copy | `.cline/skills//`
`~/.cline/skills//` | Enable Skills in Settings → Features → Enable Skills. Experimental. | -| **Kiro IDE** | Native GitHub import | Agent Steering & Skills → `+` → Import a skill → GitHub → paste `https://github.com/codeocean/skills/tree/main/` | URL must point to the subdirectory, not the repo root. Imported skills are copied into the skills directory. | -| **Kiro CLI** | Manual folder copy | `.kiro/skills//`
`~/.kiro/skills//` | Default agent auto-loads skills. Custom agents need `skill://` resources configured. | -| **`gh skill` (GitHub CLI)** | Universal GitHub install | `gh skill install codeocean/skills `
`gh skill install codeocean/skills --agent claude-code`
`gh skill install codeocean/skills --agent cursor`
`gh skill install codeocean/skills --agent codex`
`gh skill install codeocean/skills --agent gemini`
`gh skill install codeocean/skills --agent antigravity` | Installs to the correct host directory automatically. Can pin versions/commits. Cleanest cross-agent GitHub-hosted option. | +| Agent | Install method | Exact path / command | Notes | +| ----------------------------------- | ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Claude Code** | Manual folder copy | Project: `.claude/skills//`
User: `~/.claude/skills//`
Plugin: `/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). Claude watches these directories automatically. | +| **Codex** | A. Manual folder copy
B. Plugin install | A. `$CWD/.agents/skills//`
`$REPO_ROOT/.agents/skills//`
`$HOME/.agents/skills//`
`/etc/codex/skills//`
B. In-app: add from plugin directory
CLI: `/plugins` → Install plugin | Clone/download from `codeocean/skills`, copy the folder at ``. Codex supports skills natively; packaged distribution is often via plugins. | +| **Cursor** | Plugin-first | Install plugin from marketplace / team marketplace | No official direct raw GitHub skill install documented. Use `gh skill` row below for GitHub-based install. | +| **OpenCode** | Manual folder copy | `.opencode/skills//`
`~/.config/opencode/skills//`
Also compatible:
`.claude/skills//`
`~/.claude/skills//`
`.agents/skills//`
`~/.agents/skills//` | Copy full folder from [``](https://github.com/codeocean/skills/tree/main/). | +| **Antigravity** | Manual folder copy | `.agents/skills//`
`~/.gemini/antigravity/skills//` | Defaults to `.agents/skills`. Copy full folder. | +| **Windsurf** | Manual folder copy | `.windsurf/skills//`
`~/.codeium/windsurf/skills//`
Enterprise: macOS `/Library/Application Support/Windsurf/skills/`, Linux/WSL `/etc/windsurf/skills/`, Windows `C:\ProgramData\Windsurf\skills\` | Each skill is a subdirectory containing `SKILL.md`. Copy full folder. | +| **GitHub Copilot CLI** | Manual folder copy | Project: `.github/skills//`, `.claude/skills//`, `.agents/skills//`
Personal: `~/.copilot/skills//`, `~/.claude/skills//`, `~/.agents/skills//` | Clone/download from `codeocean/skills`, copy the full folder at ``. | +| **VS Code / Copilot agent plugins** | Plugin from Git source | Run `Chat: Install Plugin From Source` → enter Git repo URL | This is for **plugins**, not raw skill folders. Applies only if the skill is wrapped as a plugin. Does not apply to raw skill repos like `codeocean/skills`. | +| **Gemini CLI** | Native GitHub install | `gemini skills install https://github.com/codeocean/skills.git --path `
`gemini skills install /path/to/local/ --scope workspace`
`gemini skills link /path/to/local/ --scope workspace` | Supports Git repo, local dir, zipped `.skill`, monorepo subpath, workspace/user scope. Use `--path` for monorepo subpath. | +| **Cline** | Manual folder copy | `.cline/skills//`
`~/.cline/skills//` | Enable Skills in Settings → Features → Enable Skills. Experimental. | +| **Kiro IDE** | Native GitHub import | Agent Steering & Skills → `+` → Import a skill → GitHub → paste `https://github.com/codeocean/skills/tree/main/` | URL must point to the subdirectory, not the repo root. Imported skills are copied into the skills directory. | +| **Kiro CLI** | Manual folder copy | `.kiro/skills//`
`~/.kiro/skills//` | Default agent auto-loads skills. Custom agents need `skill://` resources configured. | +| **`gh skill` (GitHub CLI)** | Universal GitHub install | `gh skill install codeocean/skills `
`gh skill install codeocean/skills --agent claude-code`
`gh skill install codeocean/skills --agent cursor`
`gh skill install codeocean/skills --agent codex`
`gh skill install codeocean/skills --agent gemini`
`gh skill install codeocean/skills --agent antigravity` | Installs to the correct host directory automatically. Can pin versions/commits. Cleanest cross-agent GitHub-hosted option. | #### Shared patterns diff --git a/codeocean/SKILL.md b/codeocean/SKILL.md index 08c9376..f15be8a 100644 --- a/codeocean/SKILL.md +++ b/codeocean/SKILL.md @@ -17,14 +17,14 @@ Code Ocean resources covered by this skill: Three access methods: -| Method | When to use | Reference | -|--------|------------|-----------| -| **MCP Server** (25 tools) | Primary for agentic workflows | [mcp-guide.md](references/mcp-guide.md), [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | -| **Python SDK** (`codeocean`) | Python scripts and typed integrations | [sdk-guide.md](references/sdk-guide.md) | -| **REST API** (curl) | Shell automation and raw HTTP | [cli-guide.md](references/cli-guide.md) | -| **Setup/Auth** | Token generation, compatibility, MCP install | [setup-and-auth.md](references/setup-and-auth.md), [mcp-server-install.md](references/mcp-server-install.md) | -| **Errors** | Interpreting failures and choosing the next action | [errors-http-and-sdk.md](references/errors-http-and-sdk.md), [errors-resource-states.md](references/errors-resource-states.md) | -| **User Guide Concepts** | Short product-level meanings from the user guide | [user-guide/](references/user-guide/) | +| Method | When to use | Reference | +| ---------------------------- | -------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ | +| **MCP Server** (25 tools) | Primary for agentic workflows | [mcp-guide.md](references/mcp-guide.md), [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | +| **Python SDK** (`codeocean`) | Python scripts and typed integrations | [sdk-guide.md](references/sdk-guide.md) | +| **REST API** (curl) | Shell automation and raw HTTP | [cli-guide.md](references/cli-guide.md) 
| +| **Setup/Auth** | Token generation, compatibility, MCP install | [setup-and-auth.md](references/setup-and-auth.md), [mcp-server-install.md](references/mcp-server-install.md) | +| **Errors** | Interpreting failures and choosing the next action | [errors-http-and-sdk.md](references/errors-http-and-sdk.md), [errors-resource-states.md](references/errors-resource-states.md) | +| **User Guide Concepts** | Short product-level meanings from the user guide | [user-guide/](references/user-guide/) | ## 2. Core Workflows @@ -43,9 +43,9 @@ Minimal run payload: ```json { "capsule_id": "", - "data_assets": [{"id": "", "mount": ""}], + "data_assets": [{ "id": "", "mount": "" }], "parameters": ["value1", "value2"], - "named_parameters": [{"param_name": "threshold", "value": "0.5"}] + "named_parameters": [{ "param_name": "threshold", "value": "0.5" }] } ``` @@ -72,10 +72,13 @@ Pipeline run payload: ```json { "pipeline_id": "", - "data_assets": [{"id": "", "mount": ""}], + "data_assets": [{ "id": "", "mount": "" }], "processes": [ - {"name": "step1", "parameters": ["val1"]}, - {"name": "step2", "named_parameters": [{"param_name": "k", "value": "v"}]} + { "name": "step1", "parameters": ["val1"] }, + { + "name": "step2", + "named_parameters": [{ "param_name": "k", "value": "v" }] + } ] } ``` @@ -89,9 +92,9 @@ Pipeline run payload: ### Workflow 5: Attach and Detach Data Assets -| Context | Attach | Detach | -|---------|--------|--------| -| Capsule | `attach_data_assets(capsule_id, attach_params=[...])` | `detach_data_assets(capsule_id, data_assets=[...])` | +| Context | Attach | Detach | +| ----------------------------- | --------------------------------------------------------------------- | ------------------------------------------------------------------- | +| Capsule | `attach_data_assets(capsule_id, attach_params=[...])` | `detach_data_assets(capsule_id, data_assets=[...])` | | Cloud workstation computation | `attach_computation_data_assets(computation_id, 
attach_params=[...])` | `detach_computation_data_assets(computation_id, data_assets=[...])` | Attach expects objects like `{id, mount?}`. Detach expects plain ID strings. @@ -168,29 +171,29 @@ When they disagree, prefer the local SDK/MCP source for current callable names a ## 4. Reference Index -| File | Load when... | -|------|-------------| -| [mcp-guide.md](references/mcp-guide.md) | MCP workflows and caveats | -| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | Exact MCP tool names and params | -| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, imports, examples | -| [cli-guide.md](references/cli-guide.md) | curl routes, methods, and payloads | -| [capsules.md](references/capsules.md) | Capsule model, search, app panel, permissions | -| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | -| [computations.md](references/computations.md) | Runs, polling, results, cloud workstations | -| [data-assets.md](references/data-assets.md) | Data asset model and lifecycle | -| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination behavior | -| [permissions.md](references/permissions.md) | Permissions model and routes | -| [custom-metadata.md](references/custom-metadata.md) | Custom metadata schema | -| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | -| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | -| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | -| [mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | -| [user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | -| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | -| [user-guide/data-asset.md](references/user-guide/data-asset.md) | 
Product-level definition of a Data Asset | -| [user-guide/computation.md](references/user-guide/computation.md) | Product-level definition of a Computation | -| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | -| [user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | +| File | Load when... | +| ----------------------------------------------------------------------------- | ---------------------------------------------------- | +| [mcp-guide.md](references/mcp-guide.md) | MCP workflows and caveats | +| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | Exact MCP tool names and params | +| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, imports, examples | +| [cli-guide.md](references/cli-guide.md) | curl routes, methods, and payloads | +| [capsules.md](references/capsules.md) | Capsule model, search, app panel, permissions | +| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | +| [computations.md](references/computations.md) | Runs, polling, results, cloud workstations | +| [data-assets.md](references/data-assets.md) | Data asset model and lifecycle | +| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination behavior | +| [permissions.md](references/permissions.md) | Permissions model and routes | +| [custom-metadata.md](references/custom-metadata.md) | Custom metadata schema | +| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | +| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | +| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | +| [mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | +| 
[user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | +| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | +| [user-guide/data-asset.md](references/user-guide/data-asset.md) | Product-level definition of a Data Asset | +| [user-guide/computation.md](references/user-guide/computation.md) | Product-level definition of a Computation | +| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | +| [user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | ## 5. External Links diff --git a/codeocean/references/permissions.md b/codeocean/references/permissions.md index 9346862..6588160 100644 --- a/codeocean/references/permissions.md +++ b/codeocean/references/permissions.md @@ -85,8 +85,8 @@ Equivalent JSON: ```json { - "users": [{"email": "user@example.com", "role": "editor"}], - "groups": [{"group": "research-team", "role": "viewer"}], + "users": [{ "email": "user@example.com", "role": "editor" }], + "groups": [{ "group": "research-team", "role": "viewer" }], "everyone": "discoverable", "share_assets": true } diff --git a/codeocean/references/pipelines.md b/codeocean/references/pipelines.md index b965057..3157133 100644 --- a/codeocean/references/pipelines.md +++ b/codeocean/references/pipelines.md @@ -22,16 +22,16 @@ These methods route through `/pipelines/...` via a `Capsules(..., _route="pipeli ## SDK and REST Operations -| Operation | SDK | REST | -|---|---|---| -| Search | `client.pipelines.search_pipelines(...)` | `POST /pipelines/search` | -| Get | `client.pipelines.get_pipeline(id)` | `GET /pipelines/{id}` | -| App panel | `client.pipelines.get_pipeline_app_panel(id)` | `GET /pipelines/{id}/app_panel` | -| List computations | `client.pipelines.list_computations(id)` | `GET /pipelines/{id}/computations` | -| Permissions 
get/update | `get_permissions` / `update_permissions` | `GET` / `POST /pipelines/{id}/permissions` | -| Attach/detach data assets | `attach_data_assets` / `detach_data_assets` | `POST` / `DELETE /pipelines/{id}/data_assets` | -| Archive | `archive_pipeline` | `PATCH /pipelines/{id}/archive?archive=true` | -| Delete | `delete_pipeline` | `DELETE /pipelines/{id}` | +| Operation | SDK | REST | +| ------------------------- | --------------------------------------------- | --------------------------------------------- | +| Search | `client.pipelines.search_pipelines(...)` | `POST /pipelines/search` | +| Get | `client.pipelines.get_pipeline(id)` | `GET /pipelines/{id}` | +| App panel | `client.pipelines.get_pipeline_app_panel(id)` | `GET /pipelines/{id}/app_panel` | +| List computations | `client.pipelines.list_computations(id)` | `GET /pipelines/{id}/computations` | +| Permissions get/update | `get_permissions` / `update_permissions` | `GET` / `POST /pipelines/{id}/permissions` | +| Attach/detach data assets | `attach_data_assets` / `detach_data_assets` | `POST` / `DELETE /pipelines/{id}/data_assets` | +| Archive | `archive_pipeline` | `PATCH /pipelines/{id}/archive?archive=true` | +| Delete | `delete_pipeline` | `DELETE /pipelines/{id}` | ## MCP Caveat From 5c0a96ceabad5861eed80cb6a64168596ab81994 Mon Sep 17 00:00:00 2001 From: Dror Hilman Date: Wed, 22 Apr 2026 14:43:33 +0200 Subject: [PATCH 3/3] Refactor documentation across multiple references for clarity and accuracy - Updated capsules.md to streamline the data model section and clarify search parameters. - Revised computations.md to enhance explanations of running computations and handling results. - Simplified custom-metadata.md to focus on deployment-defined schemas and usage examples. - Clarified data-assets.md by removing redundant data model details and emphasizing query syntax. - Enhanced errors-http-and-sdk.md to provide clearer descriptions of SDK error handling. 
- Improved errors-resource-states.md by summarizing key fields for computation and data asset failures. - Updated mcp-guide.md to emphasize the importance of live schemas for tool parameters. - Revised mcp-tools-catalog.md to clarify tool groupings and behavioral notes. - Streamlined permissions.md to focus on available permissions in SDK and REST API. - Updated pipelines.md to clarify the routing of pipeline operations through the SDK. - Refined sdk-guide.md to emphasize the importance of inspecting the installed SDK version. - Enhanced search-and-pagination.md to clarify query syntax and structured search parameters. - Updated setup-and-auth.md to address compatibility discrepancies between SDK and MCP versions. --- codeocean/SKILL.md | 110 ++++----- codeocean/references/capsules.md | 88 +------ codeocean/references/computations.md | 73 +----- codeocean/references/custom-metadata.md | 48 +--- codeocean/references/data-assets.md | 119 +--------- codeocean/references/errors-http-and-sdk.md | 9 +- .../references/errors-resource-states.md | 23 +- codeocean/references/mcp-guide.md | 12 +- codeocean/references/mcp-tools-catalog.md | 219 +++--------------- codeocean/references/permissions.md | 35 +-- codeocean/references/pipelines.md | 37 +-- codeocean/references/sdk-guide.md | 106 ++------- codeocean/references/search-and-pagination.md | 122 ++-------- codeocean/references/setup-and-auth.md | 9 +- 14 files changed, 153 insertions(+), 857 deletions(-) diff --git a/codeocean/SKILL.md b/codeocean/SKILL.md index f15be8a..8078710 100644 --- a/codeocean/SKILL.md +++ b/codeocean/SKILL.md @@ -103,44 +103,15 @@ Attach expects objects like `{id, mount?}`. Detach expects plain ID strings. ### RunParams -`RunParams` fields in the local SDK are: +Key fields: `capsule_id` or `pipeline_id`, plus `data_assets`, `parameters`, `named_parameters`, and `processes` (for pipelines). Verify exact fields against the installed SDK version or live MCP tool schema. 
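
The payload shape described above can be sketched as a small builder. This is only an illustration: `build_run_payload` is a hypothetical helper, not an SDK function, and it assumes a run targets exactly one of `capsule_id` or `pipeline_id`. The key names are taken from the JSON examples in this skill and should still be verified against the installed SDK or the live MCP tool schema.

```python
from typing import Optional


def build_run_payload(
    capsule_id: Optional[str] = None,
    pipeline_id: Optional[str] = None,
    data_assets: Optional[list] = None,
    parameters: Optional[list] = None,
    named_parameters: Optional[list] = None,
    processes: Optional[list] = None,
) -> dict:
    """Build a run payload dict (hypothetical helper, not part of the SDK)."""
    # Assumption: a run targets exactly one of capsule_id / pipeline_id.
    if (capsule_id is None) == (pipeline_id is None):
        raise ValueError("provide exactly one of capsule_id or pipeline_id")

    if capsule_id is not None:
        payload = {"capsule_id": capsule_id}
    else:
        payload = {"pipeline_id": pipeline_id}

    optional = {
        "data_assets": data_assets,
        "parameters": parameters,
        "named_parameters": named_parameters,
        "processes": processes,  # only meaningful for pipeline runs
    }
    # Include only the keys the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


print(build_run_payload(
    capsule_id="<capsule-id>",
    named_parameters=[{"param_name": "threshold", "value": "0.5"}],
))
```

Omitting unset keys keeps the payload close to the minimal examples shown in the core workflows.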
-- `capsule_id` -- `pipeline_id` -- `version` -- `resume_run_id` -- `nextflow_profile` -- `data_assets` -- `parameters` -- `named_parameters` -- `processes` +### Search -### Search Fields - -`query` supports free text plus `field:value` filters. - -- Capsule query fields: `id`, `name`, `doi`, `tag`, `field`, `affiliation`, `journal`, `article`, `author` -- Data asset query fields: `name`, `tag`, `run_script`, `commit_id`, `contained_data_id` - -Structured filters are separate from `query`: - -- Capsule: `ownership`, `status`, `favorite`, `archived`, `sort_field`, `sort_order`, `filters` -- Data asset: `type`, `origin`, `ownership`, `favorite`, `archived`, `sort_field`, `sort_order`, `filters` +`query` supports free text plus `field:value` filters. Structured filters like `type`, `ownership`, `origin` are separate params — do not put them inside the `query` string. See [search-and-pagination.md](references/search-and-pagination.md) for patterns. ### MCP Compact Search Format -- Capsules/pipelines: `id`, `n`, `s`, `d`, `t` -- Data assets: `id`, `n`, `d`, `t` - -Response envelope fields: - -- `items` -- `has_more` -- `next_token` -- `item_count` -- optional `field_names` - -Descriptions are truncated to 200 characters. Tags are limited to 10 entries. +MCP search responses use abbreviated field names (`n`=name, `d`=description, `t`=tags, `s`=slug). Descriptions are truncated to 200 characters. Tags are limited to 10 entries. Pass `include_field_names=true` to get the mapping. ### File Reading Limit @@ -160,43 +131,50 @@ When an agent sees an error, it should first classify it: ### Compatibility Note -The public docs and local repos are not perfectly aligned: - -- Public Python SDK docs currently say Python `>=3.11` and Code Ocean `>=2.19`. -- Local SDK `pyproject.toml` declares Python `>=3.9`. -- Local SDK client sends `Min-Server-Version: 4.3.0`. -- Local MCP package requires Python `>=3.10` and depends on `codeocean>=0.14.0,<0.15.0`. 
- -When they disagree, prefer the local SDK/MCP source for current callable names and payload shapes, and mention the public-doc mismatch explicitly. +Public docs, the installed SDK, and the MCP server may target different Code Ocean versions. When they disagree, prefer the locally installed SDK/MCP source for current callable names and payload shapes, and mention the mismatch explicitly. Check `pip show codeocean` and the MCP server's live tool schemas for the ground truth. ## 4. Reference Index -| File | Load when... | -| ----------------------------------------------------------------------------- | ---------------------------------------------------- | -| [mcp-guide.md](references/mcp-guide.md) | MCP workflows and caveats | -| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | Exact MCP tool names and params | -| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, imports, examples | -| [cli-guide.md](references/cli-guide.md) | curl routes, methods, and payloads | -| [capsules.md](references/capsules.md) | Capsule model, search, app panel, permissions | -| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | -| [computations.md](references/computations.md) | Runs, polling, results, cloud workstations | -| [data-assets.md](references/data-assets.md) | Data asset model and lifecycle | -| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination behavior | -| [permissions.md](references/permissions.md) | Permissions model and routes | -| [custom-metadata.md](references/custom-metadata.md) | Custom metadata schema | -| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | -| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | -| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | -| 
[mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | -| [user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | -| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | -| [user-guide/data-asset.md](references/user-guide/data-asset.md) | Product-level definition of a Data Asset | -| [user-guide/computation.md](references/user-guide/computation.md) | Product-level definition of a Computation | -| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | -| [user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | - -## 5. External Links +| File | Load when... | +| ----------------------------------------------------------------------------- | ----------------------------------------------------------- | +| [mcp-guide.md](references/mcp-guide.md) | MCP workflow patterns and caveats | +| [mcp-tools-catalog.md](references/mcp-tools-catalog.md) | MCP tool grouping and anti-patterns (live schemas are truth)| +| [sdk-guide.md](references/sdk-guide.md) | Python SDK setup, patterns, and examples | +| [cli-guide.md](references/cli-guide.md) | curl route patterns and payloads | +| [capsules.md](references/capsules.md) | Capsule workflows, search, app panel, permissions | +| [pipelines.md](references/pipelines.md) | Pipeline-specific SDK/REST guidance | +| [computations.md](references/computations.md) | Run workflows, polling, results, cloud workstations | +| [data-assets.md](references/data-assets.md) | Data asset workflows and lifecycle | +| [search-and-pagination.md](references/search-and-pagination.md) | Query syntax and pagination patterns | +| [permissions.md](references/permissions.md) | Permissions patterns and routes | +| [custom-metadata.md](references/custom-metadata.md) | Custom metadata usage patterns 
| +| [errors-http-and-sdk.md](references/errors-http-and-sdk.md) | HTTP status codes, SDK `Error`, retry meaning | +| [errors-resource-states.md](references/errors-resource-states.md) | Failed computations/data assets and how to read them | +| [setup-and-auth.md](references/setup-and-auth.md) | Tokens, env vars, compatibility | +| [mcp-server-install.md](references/mcp-server-install.md) | MCP server installation | +| [user-guide/capsule.md](references/user-guide/capsule.md) | Product-level definition of a Capsule | +| [user-guide/pipeline.md](references/user-guide/pipeline.md) | Product-level definition of a Pipeline | +| [user-guide/data-asset.md](references/user-guide/data-asset.md) | Product-level definition of a Data Asset | +| [user-guide/computation.md](references/user-guide/computation.md) | Product-level definition of a Computation | +| [user-guide/cloud-workstation.md](references/user-guide/cloud-workstation.md) | Product-level definition of a Cloud Workstation | +| [user-guide/reproducible-run.md](references/user-guide/reproducible-run.md) | Product-level definition of a Reproducible Run | + +## 5. Version Awareness + +Code Ocean is deployed at different versions across customer environments. The data model (field names, enum values, method signatures) can change between releases. + +**Rules for agents:** + +- **MCP tools**: tool schemas come from the running MCP server at connection time. Use those live schemas as the source of truth for tool names, parameter names, and types. Do not rely on hardcoded field lists in this skill. +- **Python SDK**: if you need to verify a model's fields or an import path, inspect the installed SDK source rather than trusting this skill's examples. Run `python -c "import codeocean; print(codeocean.__version__)"` to check the version, and read the SDK model files directly if needed. +- **REST API**: route structure (`/api/v1/...`) and auth patterns are stable. 
For exact field-level details, consult the user guide for the customer's deployed version. +- **User Guide**: the canonical reference for the customer's version is always [docs.codeocean.com/user-guide](https://docs.codeocean.com/user-guide). When in doubt about a field, enum, or behavior, check there first. + +This skill provides **workflow patterns, error interpretation, gotchas, and integration guidance** — not a substitute for version-specific API documentation. + +## 6. External Links - [Code Ocean User Guide](https://docs.codeocean.com/user-guide) +- [Code Ocean API Reference](https://docs.codeocean.com/user-guide/code-ocean-api) - [Code Ocean MCP Server](https://github.com/codeocean/codeocean-mcp-server) - [Code Ocean Python SDK](https://github.com/codeocean/codeocean-sdk-python) diff --git a/codeocean/references/capsules.md b/codeocean/references/capsules.md index 0a67892..662cae5 100644 --- a/codeocean/references/capsules.md +++ b/codeocean/references/capsules.md @@ -1,26 +1,6 @@ # Capsules Reference -## Data Model - -`Capsule` fields in the local SDK model include: - -- `id` -- `created` -- `name` -- `status` -- `owner` -- `slug` -- `owner_email` -- `last_accessed` -- `article` -- `cloned_from_url` -- `description` -- `field` -- `tags` -- `original_capsule` -- `release_capsule` -- `submission` -- `versions` +For the full field-level Capsule data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/capsule) or inspect the installed SDK source. This file covers workflow patterns and gotchas. 
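
One way to inspect a model in the installed SDK is to list its fields at runtime. The sketch below assumes the SDK's models are standard Python dataclasses, which should be confirmed for the installed version. `list_model_fields` and `StandInCapsule` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional


def list_model_fields(model_cls) -> dict:
    """Return {field_name: declared_type} for a dataclass model class."""
    if not is_dataclass(model_cls):
        raise TypeError(f"{model_cls!r} is not a dataclass")
    return {f.name: f.type for f in fields(model_cls)}


# Stand-in model so the example runs without the SDK installed.
@dataclass(frozen=True)
class StandInCapsule:
    id: str
    name: str
    description: Optional[str] = None


print(list(list_model_fields(StandInCapsule)))

# With the SDK installed, the same helper could be pointed at a real model.
# The import path below is an assumption; check the installed package layout.
# from codeocean.capsule import Capsule
# print(list(list_model_fields(Capsule)))
```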
## Search @@ -40,27 +20,7 @@ results = client.capsules.search_capsules( ) ``` -Query fields from `CapsuleSearchParams`: - -- `id` -- `name` -- `doi` -- `tag` -- `field` -- `affiliation` -- `journal` -- `article` -- `author` - -Structured search params: - -- `sort_field`: `created`, `last_accessed`, `name` -- `sort_order`: `asc`, `desc` -- `ownership`: `private`, `created`, `shared` -- `status`: `release`, `non_release` -- `favorite` -- `archived` -- `filters` +Query supports `field:value` syntax (e.g., `name:`, `tag:`, `doi:`). Structured params like `ownership`, `status`, `favorite`, `archived`, `sort_field`, `sort_order`, and `filters` are separate from `query` — do not embed them in the query string. ## Get Capsule @@ -96,28 +56,7 @@ REST: curl -u "$TOKEN:" "$DOMAIN/api/v1/capsules/$CAPSULE_ID/app_panel" ``` -Relevant app-panel model sections: - -- `general` -- `data_assets` -- `categories` -- `parameters` -- `results` -- `processes` - -Relevant nested classes from the SDK: - -- `AppPanelGeneral` -- `AppPanelDataAsset` -- `AppPanelCategories` -- `AppPanelParameters` -- `AppPanelResult` -- `AppPanelProcess` - -App-panel enums: - -- `AppPanelDataAssetKind`: `internal`, `external`, `combined` -- `AppPanelParameterType`: `text`, `list`, `file` +The app panel describes the capsule's runnable interface: general info, data asset slots, parameter definitions, and result structure. Always check the app panel before running a capsule to understand required inputs. ## Computations for a Capsule @@ -141,7 +80,7 @@ list_computations(capsule_id) ## Attach and Detach Data Assets -Attach shape: +Attach expects objects with `id` (and optional `mount`). Detach expects plain ID strings. ```python from codeocean.data_asset import DataAssetAttachParams @@ -152,8 +91,6 @@ client.capsules.attach_data_assets( ) ``` -Detach shape: - ```python client.capsules.detach_data_assets(capsule_id, ["data-asset-uuid"]) ``` @@ -167,19 +104,6 @@ REST routes: Permissions are SDK/REST only, not MCP. 
-Imports: - -```python -from codeocean.models.components import ( - Permissions, - UserPermissions, - GroupPermissions, - UserRole, - GroupRole, - EveryoneRole, -) -``` - SDK methods: - `client.capsules.get_permissions(capsule_id)` @@ -190,6 +114,8 @@ REST routes: - `GET /capsules/{id}/permissions` - `POST /capsules/{id}/permissions` +For permission model types and roles, see [permissions.md](permissions.md). + ## Archive and Delete SDK: @@ -201,5 +127,3 @@ REST: - `PATCH /capsules/{id}/archive?archive=true` - `DELETE /capsules/{id}` - -The local SDK/doc sources confirm the methods and routes above. They do not establish an additional delete precondition beyond normal API permissions. diff --git a/codeocean/references/computations.md b/codeocean/references/computations.md index e177106..14159ec 100644 --- a/codeocean/references/computations.md +++ b/codeocean/references/computations.md @@ -1,49 +1,6 @@ # Computations Reference -## Data Model - -`Computation` fields: - -- `id` -- `created` -- `name` -- `owner` -- `run_time` -- `state` -- `owner_email` -- `cloud_workstation` -- `data_assets` -- `parameters` -- `nextflow_profile` -- `processes` -- `end_status` -- `exit_code` -- `has_results` - -Enums: - -- `ComputationState`: `initializing`, `running`, `finalizing`, `completed`, `failed` -- `ComputationEndStatus`: `succeeded`, `failed`, `stopped` - -## RunParams - -`RunParams` fields in the local SDK: - -- `capsule_id` -- `pipeline_id` -- `version` -- `resume_run_id` -- `nextflow_profile` -- `data_assets` -- `parameters` -- `named_parameters` -- `processes` - -Nested run models: - -- `DataAssetsRunParam`: `id`, `mount` -- `NamedRunParam`: `param_name`, `value` -- `PipelineProcessParams`: `name`, `parameters`, `named_parameters` +For the full field-level Computation data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/computation) or inspect the installed SDK source. This file covers workflow patterns and gotchas. 
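
Before inspecting SDK source, it can help to confirm which versions are actually installed. The following sketch uses only the standard library. `installed_version` is a hypothetical helper, and the distribution names are taken from this skill's prerequisites, so they may differ in a given environment.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as referenced in this skill (assumed, not verified).
for name in ("codeocean", "codeocean-mcp-server"):
    print(name, installed_version(name) or "not installed")
```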
## Running @@ -71,7 +28,7 @@ computation = client.computations.run_capsule( ) ``` -Pipeline runs also use `run_capsule(...)`, with `pipeline_id`. +Pipeline runs also use `run_capsule(...)`, with `pipeline_id` instead of `capsule_id`. ## Waiting for Completion @@ -91,15 +48,9 @@ completed = client.computations.wait_until_completed( ) ``` -SDK note: +**Gotcha — SDK vs MCP difference**: the SDK method takes a full `Computation` object; the MCP tool takes a `computation_id` (it internally fetches the object first). Minimum polling interval is `5`. -- the method takes a full `Computation` object -- minimum polling interval is `5` - -MCP note: - -- the tool takes `computation_id` -- it internally calls `get_computation(computation_id)` first, then the SDK polling method +**Gotcha — terminal does not mean successful**: after polling, always inspect `state`, `end_status`, and `exit_code`. See [errors-resource-states.md](errors-resource-states.md) for interpretation. ## Result Files @@ -119,21 +70,19 @@ Get file URLs: Read file content: - MCP: `download_and_read_a_file_from_computation(computation_id, file_path)` -- Current helper behavior: reads and decodes the first `50_000` bytes +- Reads and decodes the first `50_000` bytes only ## Rename and Delete -- SDK: `rename_computation(computation_id, name)` -- SDK: `delete_computation(computation_id)` -- REST: `PATCH /computations/{id}?name=...` -- REST: `DELETE /computations/{id}` +- SDK: `rename_computation(computation_id, name)`, `delete_computation(computation_id)` +- REST: `PATCH /computations/{id}?name=...`, `DELETE /computations/{id}` ## Cloud Workstation Data Assets Use computation-level attach/detach APIs for cloud workstation sessions: -- SDK: `client.computations.attach_data_assets(computation_id, attach_params)` -- SDK: `client.computations.detach_data_assets(computation_id, data_assets)` -- REST: `POST /computations/{id}/data_assets` -- REST: `DELETE /computations/{id}/data_assets` +- SDK: 
`client.computations.attach_data_assets(computation_id, attach_params)`, `client.computations.detach_data_assets(computation_id, data_assets)` +- REST: `POST /computations/{id}/data_assets`, `DELETE /computations/{id}/data_assets` - MCP: `attach_computation_data_assets(...)`, `detach_computation_data_assets(...)` + +**Gotcha**: attach expects objects with `{id, mount?}`. Detach expects plain ID strings. diff --git a/codeocean/references/custom-metadata.md b/codeocean/references/custom-metadata.md index 5191881..bfb2d77 100644 --- a/codeocean/references/custom-metadata.md +++ b/codeocean/references/custom-metadata.md @@ -1,40 +1,6 @@ # Custom Metadata Reference -## Schema Types - -Types from [`custom_metadata.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/custom_metadata.py:1): - -- `CustomMetadata` -- `CustomMetadataField` -- `CustomMetadataFieldType` -- `CustomMetadataFieldRange` - -`CustomMetadataFieldType` values: - -- `string` -- `number` -- `date` - -`CustomMetadataField` fields: - -- `name` -- `type` -- `range` -- `allowed_values` -- `multiple` -- `units` -- `category` -- `required` - -`allowed_values` can be either: - -- `list[str]` -- `list[float]` - -`CustomMetadata` fields: - -- `fields` -- `categories` +Custom metadata is a deployment-defined schema that governs which metadata fields are available on data assets. For exact type definitions and field names, consult the installed SDK source or the [user guide](https://docs.codeocean.com/user-guide). ## Get Schema @@ -56,19 +22,11 @@ REST: curl -u "$TOKEN:" "$DOMAIN/api/v1/custom_metadata" ``` -Route: `GET /custom_metadata` +Always fetch the schema first to understand what metadata fields are available in the current deployment. ## Using Custom Metadata on Data Assets -Custom metadata values are supplied inside the `custom_metadata` dict when creating or updating a data asset. 
- -Examples: - -- string field: `"species": "mouse"` -- multi-string field: `"species": ["mouse", "rat"]` -- number field: `"sample_count": 42` -- multi-number field: `"thresholds": [0.1, 0.2]` -- date field: `"experiment_date": 1700000000` +Custom metadata values are supplied inside the `custom_metadata` dict when creating or updating a data asset. Value types match the field definitions from the schema. Create example: diff --git a/codeocean/references/data-assets.md b/codeocean/references/data-assets.md index 51541e5..416bb51 100644 --- a/codeocean/references/data-assets.md +++ b/codeocean/references/data-assets.md @@ -1,44 +1,6 @@ # Data Assets Reference -## Data Model - -`DataAsset` fields in the local SDK: - -- `id` -- `created` -- `name` -- `mount` -- `last_used` -- `owner` -- `state` -- `type` -- `files` -- `size` -- `description` -- `owner_email` -- `tags` -- `provenance` -- `source_bucket` -- `custom_metadata` -- `app_parameters` -- `nextflow_profile` -- `contained_data_assets` -- `last_transferred` -- `transfer_error` -- `failure_reason` - -Enums: - -- `DataAssetState`: `draft`, `ready`, `failed` -- `DataAssetType`: `dataset`, `result`, `combined`, `model` -- `DataAssetSearchOrigin`: `internal`, `external` - -Important nested models: - -- `Provenance` -- `SourceBucket` -- `ResultsInfo` -- `ContainedDataAsset` +For the full field-level DataAsset data model, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/data-asset) or inspect the installed SDK source. This file covers workflow patterns and gotchas. ## Search @@ -58,31 +20,7 @@ results = client.data_assets.search_data_assets( ) ``` -Query fields: - -- `name` -- `tag` -- `run_script` -- `commit_id` -- `contained_data_id` - -Structured params: - -- `type` -- `origin` -- `ownership` -- `favorite` -- `archived` -- `sort_field` -- `sort_order` -- `filters` - -Sort fields: - -- `created` -- `type` -- `name` -- `size` +Query supports `field:value` syntax (e.g., `name:`, `tag:`). 
The `type`, `origin`, and `ownership` filters are structured params — do not embed them in the query string. ## Files and URLs @@ -108,36 +46,11 @@ Get file URLs: Read file content: - MCP: `download_and_read_a_file_from_data_asset(data_asset_id, file_path)` -- Current helper behavior: reads and decodes the first `50_000` bytes +- Reads and decodes the first `50_000` bytes only — use `get_data_asset_file_urls` when you need the complete file ## Creating Data Assets -`DataAssetParams` fields: - -- `name` -- `tags` -- `mount` -- `description` -- `source` -- `target` -- `custom_metadata` -- `data_asset_ids` -- `results_info` - -The local SDK model requires `name`, `tags`, and `mount`. - -Source models: - -- `Source` -- `AWSS3Source` -- `GCPCloudStorageSource` -- `ComputationSource` -- `CloudWorkstationSource` - -Target models: - -- `Target` -- `AWSS3Target` +The SDK requires `name`, `tags`, and `mount` at minimum. The `source` field specifies where data comes from (computation results, S3, GCS, cloud workstation). For exact field names, check the installed SDK version. Example from computation results: @@ -183,26 +96,15 @@ MCP: wait_until_ready(data_asset_object, polling_interval=5, timeout=None) ``` -Notes: - -- both SDK and MCP require the full `DataAsset` object -- minimum polling interval is `5` +**Gotcha**: both SDK and MCP require the full `DataAsset` object, not just the ID. Minimum polling interval is `5`. 
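
The polling behavior noted above can be sketched generically. This is not the SDK's implementation: the real `wait_until_ready` takes a full `DataAsset` object. `poll_until` is a hypothetical helper, and rejecting intervals below `5` with a `ValueError` is an assumption about how the minimum might be enforced. The state names in the demo are illustrative and should be checked against the installed SDK.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
MIN_POLLING_INTERVAL = 5  # seconds, matching the minimum noted above


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    polling_interval: float = 5,
    timeout: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() repeatedly until is_done(result) is true."""
    # Assumption: intervals below the minimum are rejected, not clamped.
    if polling_interval < MIN_POLLING_INTERVAL:
        raise ValueError(f"polling_interval must be >= {MIN_POLLING_INTERVAL}")
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        current = fetch()
        if is_done(current):
            return current
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("resource did not reach a terminal state in time")
        sleep(polling_interval)


# Demo with a fake resource and a no-op sleep so nothing actually waits.
fake_states = iter(["draft", "draft", "ready"])
final = poll_until(
    fetch=lambda: next(fake_states),
    is_done=lambda state: state in ("ready", "failed"),
    sleep=lambda _seconds: None,
)
print(final)
```

Reaching a terminal state is not the same as success, so the caller should still inspect the returned state before using the asset.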
 
-## Update, Permissions, Archive, Delete
-
-Update metadata:
+## Update Metadata
 
 - SDK: `client.data_assets.update_metadata(data_asset_id, update_params)`
 - REST: `PUT /data_assets/{id}`
 - MCP: `update_metadata(data_asset_id, update_params)`
 
-`DataAssetUpdateParams` fields:
-
-- `name`
-- `description`
-- `tags`
-- `mount`
-- `custom_metadata`
+## Permissions, Archive, Delete
 
 Permissions:
 
@@ -216,12 +118,7 @@ Archive/delete:
 
 ## Transfer
 
-Admin-only transfer method:
+Admin-only:
 
 - SDK: `client.data_assets.transfer_data_asset(data_asset_id, transfer_params)`
 - REST: `POST /data_assets/{id}/transfer`
-
-`TransferDataParams` fields:
-
-- `target`
-- `force`
diff --git a/codeocean/references/errors-http-and-sdk.md b/codeocean/references/errors-http-and-sdk.md
index c6dd2b7..4cb12e3 100644
--- a/codeocean/references/errors-http-and-sdk.md
+++ b/codeocean/references/errors-http-and-sdk.md
@@ -28,14 +28,7 @@ Sources:
 
 ## 2. What the Python SDK Raises
 
-The local SDK wraps HTTP failures in `codeocean.Error`.
-
-From [`error.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/error.py:1):
-
-- `Error.status_code`: HTTP status code
-- `Error.message`: derived from response JSON `message` if present, otherwise raw response text
-- `Error.data`: parsed JSON body if the response body is JSON, otherwise `None`
-- `Error.http_err`: underlying `requests.HTTPError`
+The SDK wraps HTTP failures in `codeocean.Error`. Key attributes include `status_code`, `message`, and `data` (parsed JSON body). Verify available attributes against the installed SDK version.
 
 Meaning for agents:
 
diff --git a/codeocean/references/errors-resource-states.md b/codeocean/references/errors-resource-states.md
index 6fcb5ed..cd856cb 100644
--- a/codeocean/references/errors-resource-states.md
+++ b/codeocean/references/errors-resource-states.md
@@ -9,17 +9,7 @@ This is different from HTTP/API errors:
 
 ## 1. Computation Failures
 
-Relevant fields from the local SDK `Computation` model:
-
-- `state`
-- `end_status`
-- `exit_code`
-- `has_results`
-
-Enums:
-
-- `state`: `initializing`, `running`, `finalizing`, `completed`, `failed`
-- `end_status`: `succeeded`, `failed`, `stopped`
+Key fields to inspect: `state`, `end_status`, `exit_code`, `has_results`. For exact enum values, check the installed SDK version or the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/computation).
 
 ### How to interpret them
 
@@ -86,16 +76,7 @@ Suggested wording:
 
 ## 2. Data Asset Failures
 
-Relevant fields from the local SDK `DataAsset` model:
-
-- `state`
-- `failure_reason`
-- `files`
-- `size`
-
-Enum:
-
-- `state`: `draft`, `ready`, `failed`
+Key fields to inspect: `state`, `failure_reason`, `files`, `size`. For exact enum values, check the installed SDK version or the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api/data-asset).
 
 ### How to interpret them
 
diff --git a/codeocean/references/mcp-guide.md b/codeocean/references/mcp-guide.md
index dd06144..5249073 100644
--- a/codeocean/references/mcp-guide.md
+++ b/codeocean/references/mcp-guide.md
@@ -2,19 +2,13 @@
 
 ## Overview
 
-The current local MCP server exposes **25** tools:
+The MCP server exposes its tool schemas at connection time — use the live schemas as the source of truth for exact tool names and parameters. The tool count and available operations may vary by server version. See [mcp-tools-catalog.md](mcp-tools-catalog.md) for tool groupings and behavioral notes.
 
-- Capsule tools: `search_capsules`, `get_capsule`, `list_computations`, `attach_data_assets`, `detach_data_assets`, `get_capsule_app_panel`
-- Pipeline search tool: `search_pipelines`
-- Computation tools: `get_computation`, `run_capsule`, `wait_until_completed`, `list_computation_results`, `get_result_file_urls`, `download_and_read_a_file_from_computation`, `rename_computation`, `delete_computation`, `attach_computation_data_assets`, `detach_computation_data_assets`
-- Data asset tools: `search_data_assets`, `get_data_asset`, `get_data_asset_file_urls`, `download_and_read_a_file_from_data_asset`, `list_data_asset_files`, `update_metadata`, `wait_until_ready`, `create_data_asset`
-- Custom metadata tool: `get_custom_metadata`
-
-Environment variables from [`server.py`](/Users/drorhilman/codeocean/codeocean-mcp-server/src/codeocean_mcp_server/server.py:1):
+Environment variables:
 
 - `CODEOCEAN_DOMAIN`
 - `CODEOCEAN_TOKEN`
-- `AGENT_ID` with default `"AI Agent"`
+- `AGENT_ID` (optional, defaults to `"AI Agent"`)
 
 ## Core Patterns
 
diff --git a/codeocean/references/mcp-tools-catalog.md b/codeocean/references/mcp-tools-catalog.md
index 504803f..a405977 100644
--- a/codeocean/references/mcp-tools-catalog.md
+++ b/codeocean/references/mcp-tools-catalog.md
@@ -1,210 +1,49 @@
 # MCP Tools Catalog
 
-Complete catalog of the **25** MCP tools exposed by the current local Code Ocean MCP server.
+The MCP server exposes its tool schemas at connection time. **Use the live schemas as the source of truth for exact parameter names and types.** This file documents tool groupings, behavioral notes, and anti-patterns that are not visible from the schemas alone.
 
-## 1. Capsule Search and Management
+## Tool Groups
 
-### `search_capsules`
+### Capsule Search and Management
 
-- Params:
-  - `search_params`
-  - `include_field_names`
-- `search_params` fields: `query`, `next_token`, `offset`, `limit`, `sort_field`, `sort_order`, `ownership`, `status`, `favorite`, `archived`, `filters`
-- Returns: `{items, has_more, next_token, item_count, field_names?}`
+`search_capsules`, `get_capsule`, `list_computations`, `attach_data_assets`, `detach_data_assets`, `get_capsule_app_panel`
 
-### `search_pipelines`
+### Pipeline Search
 
-- Params:
-  - `search_params`
-  - `include_field_names`
-- Uses the same `CapsuleSearchParams` shape as `search_capsules`
-- Returns the same compact envelope as `search_capsules`
+`search_pipelines` — uses the same search params shape as `search_capsules`.
 
-### `get_capsule`
+### Computations
 
-- Params:
-  - `capsule_id`
-- Returns: full `Capsule`
-- Current implementation calls `client.capsules.get_capsule(capsule_id)`
+`get_computation`, `run_capsule`, `wait_until_completed`, `list_computation_results`, `get_result_file_urls`, `download_and_read_a_file_from_computation`, `rename_computation`, `delete_computation`, `attach_computation_data_assets`, `detach_computation_data_assets`
 
-### `list_computations`
+### Data Assets
 
-- Params:
-  - `capsule_id`
-- Returns: `list[Computation]`
-- Current implementation calls `client.capsules.list_computations(capsule_id)`
+`search_data_assets`, `get_data_asset`, `get_data_asset_file_urls`, `download_and_read_a_file_from_data_asset`, `list_data_asset_files`, `update_metadata`, `wait_until_ready`, `create_data_asset`
 
-### `attach_data_assets`
+### Custom Metadata
 
-- Params:
-  - `capsule_id`
-  - `attach_params`
-- `attach_params`: `list[{id, mount?}]`
-- Returns: `list[DataAssetAttachResults]` with fields such as `id`, `mount_state`, `job_id`, `external`, `ready`, `mount`
+`get_custom_metadata`
 
-### `detach_data_assets`
+## Behavioral Notes
 
-- Params:
-  - `capsule_id`
-  - `data_assets`
-- `data_assets`: `list[str]`
-- Returns: `None`
+- `wait_until_completed` takes a `computation_id`, internally fetches the object, then polls via the SDK.
+- `wait_until_ready` takes the full `DataAsset` object (not just an ID).
+- `download_and_read_*` helpers read the first `50_000` bytes only.
+- Attach tools expect objects like `{id, mount?}`. Detach tools expect plain ID strings.
 
-### `get_capsule_app_panel`
+## Pipeline Caveat
 
-- Params:
-  - `capsule_id`
-  - `version`
-- Returns: `AppPanel`
-- Current implementation calls `client.capsules.get_capsule_app_panel(...)`
+The MCP server has only one pipeline-specific tool: `search_pipelines`.
 
-## 2. Computations
+- There is no separate MCP `get_pipeline` or `get_pipeline_app_panel`.
+- The capsule-named tools (`get_capsule`, `get_capsule_app_panel`, `list_computations`, `attach_data_assets`, `detach_data_assets`) route through `client.capsules.*`, not `client.pipelines.*`.
+- Use SDK or REST when you need guaranteed `/pipelines/...` routing.
 
-### `get_computation`
+## Anti-Patterns
 
-- Params:
-  - `computation_id`
-- Returns: `Computation`
-
-### `run_capsule`
-
-- Params:
-  - `run_params`
-- `run_params` fields: `capsule_id`, `pipeline_id`, `version`, `resume_run_id`, `nextflow_profile`, `data_assets`, `parameters`, `named_parameters`, `processes`
-- Returns: `Computation`
-
-### `wait_until_completed`
-
-- Params:
-  - `computation_id`
-- Returns: terminal-state `Computation`
-- Wrapper behavior: fetches the computation object first, then passes it to the SDK polling method
-
-### `list_computation_results`
-
-- Params:
-  - `computation_id`
-- Returns: `Folder`
-
-### `get_result_file_urls`
-
-- Params:
-  - `computation_id`
-  - `file_path`
-- Returns: `FileURLs {download_url, view_url}`
-
-### `download_and_read_a_file_from_computation`
-
-- Params:
-  - `computation_id`
-  - `file_path`
-- Returns: decoded file content string
-- Reads the first `50_000` bytes from the remote response
-
-### `rename_computation`
-
-- Params:
-  - `computation_id`
-  - `name`
-- Returns: `None`
-
-### `delete_computation`
-
-- Params:
-  - `computation_id`
-- Returns: `None`
-
-### `attach_computation_data_assets`
-
-- Params:
-  - `computation_id`
-  - `attach_params`
-- `attach_params`: `list[{id, mount?}]`
-- Returns: `list[DataAssetAttachResults]`
-
-### `detach_computation_data_assets`
-
-- Params:
-  - `computation_id`
-  - `data_assets`
-- `data_assets`: `list[str]`
-- Returns: `None`
-
-## 3. Data Assets
-
-### `search_data_assets`
-
-- Params:
-  - `search_params`
-  - `include_field_names`
-- `search_params` fields: `query`, `next_token`, `offset`, `limit`, `sort_field`, `sort_order`, `type`, `ownership`, `origin`, `favorite`, `archived`, `filters`
-- Returns: `{items, has_more, next_token, item_count, field_names?}`
-
-### `get_data_asset`
-
-- Params:
-  - `data_asset_id`
-- Returns: full `DataAsset`
-
-### `get_data_asset_file_urls`
-
-- Params:
-  - `data_asset_id`
-  - `file_path`
-- Returns: `FileURLs {download_url, view_url}`
-
-### `download_and_read_a_file_from_data_asset`
-
-- Params:
-  - `data_asset_id`
-  - `file_path`
-- Returns: decoded file content string
-- Reads the first `50_000` bytes from the remote response
-
-### `list_data_asset_files`
-
-- Params:
-  - `data_asset_id`
-  - `path`
-- Returns: `Folder`
-
-### `update_metadata`
-
-- Params:
-  - `data_asset_id`
-  - `update_params`
-- `update_params` fields: `name`, `description`, `tags`, `mount`, `custom_metadata`
-- Returns: `DataAsset`
-
-### `wait_until_ready`
-
-- Params:
-  - `data_asset`
-  - `polling_interval`
-  - `timeout`
-- Returns: terminal-state `DataAsset`
-- `data_asset` must be the full object, not an ID string
-
-### `create_data_asset`
-
-- Params:
-  - `data_asset_params`
-- `data_asset_params` fields: `name`, `tags`, `mount`, `description`, `source`, `target`, `custom_metadata`, `data_asset_ids`, `results_info`
-- Returns: `DataAsset`
-
-## 4. Custom Metadata
-
-### `get_custom_metadata`
-
-- Params: none
-- Returns: `CustomMetadata`
-
-## 5. Pipeline Caveat
-
-The MCP server has only one pipeline-specific tool name: `search_pipelines`.
-
-- There is no separate MCP `get_pipeline`
-- There is no separate MCP `get_pipeline_app_panel`
-- There are no separate MCP pipeline attach/detach tools
-
-For exact `/pipelines/...` operations, use the SDK or REST guides.
+- Calling `run_capsule` then immediately reading results without `wait_until_completed`
+- Passing a data asset ID string into `wait_until_ready` — it requires the full object
+- Passing plain ID strings into attach tools — attach expects objects
+- Treating `type`, `origin`, or `ownership` as data-asset query fields — they are structured search params
+- Assuming MCP has full pipeline management parity because it has `search_pipelines`
+- Assuming `download_and_read_*` returns whole files — it reads only the first `50_000` bytes
diff --git a/codeocean/references/permissions.md b/codeocean/references/permissions.md
index 6588160..d677b89 100644
--- a/codeocean/references/permissions.md
+++ b/codeocean/references/permissions.md
@@ -1,37 +1,6 @@
 # Permissions
 
-Permissions are available in the SDK and REST API, not in the MCP server.
-
-## Model
-
-Types from [`models/components.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/models/components.py:1):
-
-- `Permissions`
-- `UserPermissions`
-- `GroupPermissions`
-- `UserRole`
-- `GroupRole`
-- `EveryoneRole`
-
-`Permissions` fields:
-
-- `users`
-- `groups`
-- `everyone`
-- `share_assets`
-
-Field meanings:
-
-- `users`: list of `{email, role}`
-- `groups`: list of `{group, role}`
-- `everyone`: org-wide access level
-- `share_assets`: whether related assets are shared too
-
-Role enums:
-
-- `UserRole`: `owner`, `editor`, `viewer`
-- `GroupRole`: `owner`, `editor`, `viewer`, `discoverable`
-- `EveryoneRole`: `viewer`, `discoverable`, `none`
+Permissions are available in the SDK and REST API, not in the MCP server. For exact field names and role values, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api) or inspect the installed SDK source.
 
 ## SDK Methods
 
@@ -91,3 +60,5 @@ Equivalent JSON:
   "share_assets": true
 }
 ```
+
+**Note**: import paths and available role enums may vary by SDK version. Verify against the installed version.
diff --git a/codeocean/references/pipelines.md b/codeocean/references/pipelines.md
index 3157133..192fe63 100644
--- a/codeocean/references/pipelines.md
+++ b/codeocean/references/pipelines.md
@@ -1,24 +1,10 @@
 # Pipelines Reference
 
-## Overview
-
-The SDK has a dedicated `Pipelines` client in [`pipeline.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/pipeline.py:1).
+For the full pipeline data model and method list, consult the [user guide](https://docs.codeocean.com/user-guide/code-ocean-api) or inspect the installed SDK source. This file covers pipeline-specific workflow patterns and the MCP caveat.
 
-Supported SDK methods:
-
-- `get_pipeline`
-- `delete_pipeline`
-- `get_pipeline_app_panel`
-- `list_computations`
-- `get_permissions`
-- `update_permissions`
-- `attach_data_assets`
-- `detach_data_assets`
-- `archive_pipeline`
-- `search_pipelines`
-- `search_pipelines_iterator`
+## Overview
 
-These methods route through `/pipelines/...` via a `Capsules(..., _route="pipelines")` helper.
+The SDK has a dedicated `Pipelines` client that mirrors most capsule operations but routes through `/pipelines/...`.
 
 ## SDK and REST Operations
 
@@ -46,22 +32,7 @@ The MCP tools named `get_capsule`, `get_capsule_app_panel`, `list_computations`,
 
 ## Running Pipelines
 
-Pipelines run through `client.computations.run_capsule(...)` with `pipeline_id`.
-
-Relevant `RunParams` fields:
-
-- `pipeline_id`
-- `version`
-- `resume_run_id`
-- `nextflow_profile`
-- `data_assets`
-- `processes`
-
-`PipelineProcessParams` fields:
-
-- `name`
-- `parameters`
-- `named_parameters`
+Pipelines run through `client.computations.run_capsule(...)` with `pipeline_id` instead of `capsule_id`. Per-step configuration uses `processes`.
 
 SDK example:
 
diff --git a/codeocean/references/sdk-guide.md b/codeocean/references/sdk-guide.md
index 189170a..5f67488 100644
--- a/codeocean/references/sdk-guide.md
+++ b/codeocean/references/sdk-guide.md
@@ -1,16 +1,14 @@
 # Code Ocean Python SDK Guide
 
-## Installation
-
-Public docs: `pip install -U codeocean`
+For exact model fields, enum values, and method signatures, inspect the installed SDK version (`pip show codeocean`) or read the SDK source files directly. This guide covers setup patterns, common imports, and workflow examples.
 
-Compatibility note:
+## Installation
 
-- Public docs currently say Python `>=3.11` and Code Ocean `>=2.19`
-- Local `pyproject.toml` declares Python `>=3.9`
-- Local client code sends `Min-Server-Version: 4.3.0`
+```bash
+pip install -U codeocean
+```
 
-Document the split instead of collapsing it.
+Verify the installed version and check compatibility with your Code Ocean deployment before relying on specific features.
 
 ## Client Setup
 
@@ -25,83 +23,36 @@ client = CodeOcean(
 )
 ```
 
-Client properties from [`client.py`](/Users/drorhilman/codeocean/codeocean-sdk-python/src/codeocean/client.py:1):
+The client exposes sub-clients: `client.capsules`, `client.pipelines`, `client.computations`, `client.data_assets`, `client.custom_metadata`.
 
-- `client.capsules`
-- `client.pipelines`
-- `client.computations`
-- `client.data_assets`
-- `client.custom_metadata`
+The SDK does not auto-read env vars; pass them explicitly. For SDK exception handling and error interpretation, also load [errors-http-and-sdk.md](errors-http-and-sdk.md).
 
-## Imports Reference
+## Common Imports
 
 ```python
 from codeocean import CodeOcean
-```
-
-```python
 from codeocean.capsule import CapsuleSearchParams
-```
-
-```python
-from codeocean.computation import (
-    RunParams,
-    DataAssetsRunParam,
-    NamedRunParam,
-    PipelineProcessParams,
-)
-```
-
-```python
+from codeocean.computation import RunParams, DataAssetsRunParam, NamedRunParam, PipelineProcessParams
 from codeocean.data_asset import (
-    DataAssetParams,
-    DataAssetSearchParams,
-    DataAssetUpdateParams,
-    TransferDataParams,
-    Source,
-    AWSS3Source,
-    GCPCloudStorageSource,
-    ComputationSource,
-    CloudWorkstationSource,
-    Target,
-    AWSS3Target,
+    DataAssetParams, DataAssetSearchParams, DataAssetUpdateParams,
+    Source, ComputationSource, AWSS3Source, GCPCloudStorageSource, CloudWorkstationSource,
+    Target, AWSS3Target,
 )
-```
-
-Permissions imports come from `codeocean.models.components`:
-
-```python
 from codeocean.models.components import (
-    Permissions,
-    UserPermissions,
-    GroupPermissions,
-    UserRole,
-    GroupRole,
-    EveryoneRole,
+    Permissions, UserPermissions, GroupPermissions, UserRole, GroupRole, EveryoneRole,
 )
-```
-
-SDK error type:
-
-```python
 from codeocean import Error
 ```
 
-## Pagination
-
-Search result objects expose:
+**Note**: import paths and available classes may vary by SDK version. Verify against the installed version if an import fails.
 
-- `.results`
-- `.has_more`
-- `.next_token`
+## Pagination
 
-Iterator helpers:
+Search result objects expose `.results`, `.has_more`, and `.next_token`.
 
-- `client.capsules.search_capsules_iterator(...)`
-- `client.pipelines.search_pipelines_iterator(...)`
-- `client.data_assets.search_data_assets_iterator(...)`
+Iterator helpers are available for all search methods (e.g., `search_capsules_iterator`, `search_data_assets_iterator`).
 
 Manual pagination:
 
@@ -123,30 +74,17 @@ while results.has_more:
 
 ## Polling
 
-The SDK polling methods take full model objects, not IDs.
+The SDK polling methods take full model objects, not IDs. Both enforce a minimum `polling_interval` of `5`.
 
 ```python
-completed = client.computations.wait_until_completed(
-    computation,
-    polling_interval=5,
-    timeout=300,
-)
+completed = client.computations.wait_until_completed(computation, polling_interval=5, timeout=300)
 ```
 
 ```python
-ready_asset = client.data_assets.wait_until_ready(
-    data_asset,
-    polling_interval=5,
-    timeout=300,
-)
+ready_asset = client.data_assets.wait_until_ready(data_asset, polling_interval=5, timeout=300)
 ```
 
-Both methods enforce a minimum `polling_interval` of `5`.
-
-If polling returns a terminal object in a bad state, that is not the same as an HTTP exception. Use [errors-resource-states.md](errors-resource-states.md) to interpret:
-
-- `Computation.state`, `Computation.end_status`, `Computation.exit_code`
-- `DataAsset.state`, `DataAsset.failure_reason`
+**Gotcha**: a terminal object is not always successful. After polling, inspect `state`, `end_status`, and `exit_code` (computations) or `state` and `failure_reason` (data assets). See [errors-resource-states.md](errors-resource-states.md).
 
 ## Example: Run Capsule, Wait, Create Data Asset
 
diff --git a/codeocean/references/search-and-pagination.md b/codeocean/references/search-and-pagination.md
index b55dbca..fa6501f 100644
--- a/codeocean/references/search-and-pagination.md
+++ b/codeocean/references/search-and-pagination.md
@@ -2,11 +2,9 @@
 
 ## Query Syntax
 
-Free text matches weighted fields.
+Free text matches weighted fields. `field:value` filters are also supported.
 
-`field:value` filters are also supported.
-
-Rules from the local SDK model metadata:
+Rules:
 
 - same field repeated = OR
 - different fields = AND
@@ -15,96 +13,25 @@ Rules from the local SDK model metadata:
 - no wildcards
 - case insensitive
 
-Capsule query fields:
-
-- `id`
-- `name`
-- `doi`
-- `tag`
-- `field`
-- `affiliation`
-- `journal`
-- `article`
-- `author`
-
-Data asset query fields:
-
-- `name`
-- `tag`
-- `run_script`
-- `commit_id`
-- `contained_data_id`
-
-Important: `type`, `origin`, and `ownership` are structured params, not query fields.
-
-## CapsuleSearchParams
-
-Fields:
-
-- `query`
-- `next_token`
-- `offset`
-- `limit`
-- `sort_field`
-- `sort_order`
-- `ownership`
-- `status`
-- `favorite`
-- `archived`
-- `filters`
-
-Defaults and limits from model metadata:
-
-- default `limit`: `100`
-- max `limit`: `1000`
+**Gotcha**: `type`, `origin`, and `ownership` are structured params, not query fields. Do not embed them in the `query` string.
 
-## DataAssetSearchParams
+For the list of supported query fields per resource type, check the live MCP tool schemas or the SDK's search param classes.
 
-Fields:
+## Structured Search Params
 
-- all pagination fields above
-- `type`
-- `ownership`
-- `origin`
-- `favorite`
-- `archived`
-- `filters`
+Both capsule and data asset search accept structured filters like `sort_field`, `sort_order`, `ownership`, `favorite`, `archived`, and `filters`. Data assets additionally support `type` and `origin`.
 
-Data asset sort fields:
-
-- `created`
-- `type`
-- `name`
-- `size`
+Default limit is `100`, max is `1000`.
 
 ## SearchFilter
 
-`SearchFilter` fields:
-
-- `key`
-- `value`
-- `values`
-- `range`
-- `exclude`
-
-`range` uses `SearchFilterRange` with:
-
-- `min`
-- `max`
+The `filters` param accepts structured filter objects with `key`, `value`/`values`, `range` (min/max), and `exclude`. Use these for precise filtering beyond free-text queries.
 
 ## SDK Pagination
 
-Search result objects expose:
-
-- `.results`
-- `.has_more`
-- `.next_token`
+Search result objects expose `.results`, `.has_more`, and `.next_token`.
 
-Iterator helpers:
-
-- `search_capsules_iterator`
-- `search_pipelines_iterator`
-- `search_data_assets_iterator`
+Iterator helpers: `search_capsules_iterator`, `search_pipelines_iterator`, `search_data_assets_iterator`.
 
 Manual example:
 
@@ -126,30 +53,13 @@ while results.has_more:
 
 ## MCP Compact Search
 
-Search envelope:
-
-- `items`
-- `has_more`
-- `next_token`
-- `item_count`
-- optional `field_names`
-
-Capsule/pipeline item abbreviations:
-
-- `id`
-- `n` = `name`
-- `s` = `slug`
-- `d` = `description`
-- `t` = `tags`
+MCP search responses use abbreviated field names:
 
-Data asset item abbreviations:
+- `n` = name, `s` = slug, `d` = description, `t` = tags
 
-- `id`
-- `n` = `name`
-- `d` = `description`
-- `t` = `tags`
+Pass `include_field_names=true` to get the full mapping.
 
-Truncation behavior from `search.py`:
+Truncation behavior:
 
-- descriptions normalized and truncated to `200` chars with `"...(more)"`
-- tags limited to `10` entries, with truncation marker `"..more.."` if needed
+- descriptions truncated to `200` chars
+- tags limited to `10` entries
diff --git a/codeocean/references/setup-and-auth.md b/codeocean/references/setup-and-auth.md
index 4af40f7..806c41e 100644
--- a/codeocean/references/setup-and-auth.md
+++ b/codeocean/references/setup-and-auth.md
@@ -77,11 +77,4 @@ MCP:
 
 ## Compatibility
 
-The sources currently diverge:
-
-- Public Python SDK docs say Python `>=3.11` and Code Ocean `>=2.19`
-- Local SDK `pyproject.toml` says Python `>=3.9`
-- Local SDK client sets `Min-Server-Version: 4.3.0`
-- Local MCP `pyproject.toml` says Python `>=3.10`
-
-Keep this discrepancy explicit whenever you mention compatibility.
+Public docs, the installed SDK, and the MCP server may target different Code Ocean versions. When providing compatibility guidance, check `pip show codeocean` for the installed SDK version and the MCP server's live tool schemas for the actual available tools. Mention any mismatch explicitly rather than assuming one source is authoritative.