Skip to content

Conversation

@keugenek
Copy link
Contributor

Summary

  • Add GitHub Actions workflow for Apps-MCP continuous evaluations
  • Workflow triggers nightly (2am UTC) and on version releases (v* tags)
  • Builds CLI binary from current branch and uploads to UC Volume
  • Runs generation and evaluation jobs from neondatabase/eng-app-devex bundle

Related

Test plan

  • Create apps-mcp-evals environment in repo settings
  • Add DATABRICKS_HOST and DATABRICKS_TOKEN secrets
  • Trigger workflow manually via workflow_dispatch
  • Verify jobs complete successfully
  • Check MLflow experiment at /Shared/apps-mcp-evaluations-staging

🤖 Generated with Claude Code

Add GitHub Actions workflow that:
- Triggers nightly (2am UTC) and on version releases
- Builds CLI binary for Linux
- Uploads binary to UC Volume
- Deploys and runs eval jobs from neondatabase/eng-app-devex

Bundle located at: apps-mcp-evals/ subdirectory

Requires 'apps-mcp-evals' environment with DATABRICKS_HOST and
DATABRICKS_TOKEN secrets.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@keugenek keugenek force-pushed the feature/apps-mcp-evals-cicd branch from add5c11 to acd2449 Compare January 16, 2026 15:38
@eng-dev-ecosystem-bot
Copy link
Collaborator

eng-dev-ecosystem-bot commented Jan 16, 2026

Commit: acd2449

Run: 21072008716

Env ❌​FAIL 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 3 14 2 411 695 29:04
🟨​ aws windows 6 11 2 413 693 27:59
❌​ aws-ucws linux 3 3 14 2 583 568 89:25
❌​ aws-ucws windows 3 6 11 2 585 566 91:31
💚​ azure linux 12 3 413 694 36:16
🟨​ azure windows 4 8 3 415 692 30:51
❌​ azure-ucws linux 3 1 11 3 581 567 74:13
❌​ azure-ucws windows 3 4 2 8 3 581 565 78:10
💚​ gcp linux 12 3 402 700 33:43
🟨​ gcp windows 4 8 3 404 698 31:39
24 interesting tests: 12 RECOVERED, 6 KNOWN, 3 FAIL, 2 flaky, 1 SKIP
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 🟨​K 🟨​K 💚​R 🟨​K 🟨​K 🟨​K 💚​R 🟨​K
💚​ TestAccept/bundle/deployment/bind/alert 🙈​S 🙈​S 🙈​S 🙈​S 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🟨​ TestAccept/bundle/generate/alert 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
💚​ TestAccept/bundle/resources/alerts/basic 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 🟨​K 🟨​K 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 💚​R 💚​R 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R
❌​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​s 🙈​s ❌​F ❌​F 🙈​s 🙈​s ❌​F ❌​F 🙈​s 🙈​s
❌​ TestAccept/bundle/resources/synced_database_tables/basic/DATABRICKS_BUNDLE_ENGINE=direct ❌​F ❌​F ❌​F ❌​F
❌​ TestAccept/bundle/resources/synced_database_tables/basic/DATABRICKS_BUNDLE_ENGINE=terraform ❌​F ❌​F ❌​F ❌​F
🔄​ TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=no/READPLAN=1 🙈​s 🙈​s ✅​p ✅​p 🙈​s 🙈​s ✅​p 🔄​f 🙈​s 🙈​s
🔄​ TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=yes/READPLAN= 🙈​s 🙈​s ✅​p ✅​p 🙈​s 🙈​s ✅​p 🔄​f 🙈​s 🙈​s
💚​ TestAccept/ssh/connection 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
Top 50 slowest tests (at least 2 minutes):
duration env testname
6:43 azure linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
6:01 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
6:01 aws-ucws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
5:48 gcp windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
5:34 gcp windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:29 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
4:32 azure linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
4:03 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:52 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:46 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=yes/READPLAN=
3:35 azure-ucws windows TestAccept/bundle/resources/models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:32 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:31 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:24 aws-ucws linux TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:19 azure-ucws linux TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:08 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=no/READPLAN=1
3:06 azure-ucws linux TestAccept/bundle/resources/secret_scopes/permissions/DATABRICKS_BUNDLE_ENGINE=direct
3:04 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:03 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58 aws-ucws linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.12
2:57 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.13
2:57 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=no/READPLAN=
2:55 gcp windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=no/PY=yes/READPLAN=
2:55 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=no/READPLAN=
2:55 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:55 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
2:54 aws-ucws windows TestAccept/bundle/resources/quality_monitors/change_output_schema_name/DATABRICKS_BUNDLE_ENGINE=direct
2:51 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.11
2:50 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:49 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=no/PY=yes/READPLAN=
2:49 aws-ucws windows TestAccept/bundle/resources/experiments/basic/DATABRICKS_BUNDLE_ENGINE=terraform
2:48 aws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
2:47 azure-ucws windows TestAccept/bundle/resources/experiments/basic/DATABRICKS_BUNDLE_ENGINE=direct
2:47 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.10
2:45 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
2:43 aws-ucws linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.9
2:43 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:43 aws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.10
2:42 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=yes/READPLAN=
2:42 aws-ucws windows TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
2:42 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=no/READPLAN=
2:42 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=no/PY=no/READPLAN=1
2:42 aws-ucws windows TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=direct
2:39 azure-ucws linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.11
2:38 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:37 azure-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=no/READPLAN=
2:36 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=yes/PY=no/READPLAN=
2:35 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.9
2:35 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=no/READPLAN=

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants