🎭 Generative Media Skills for AI Agents

The Ultimate Multimodal Toolset for Claude Code, Cursor, and Gemini CLI.
A high-performance, schema-driven architecture for AI agents to generate, edit, and display professional-grade images, videos, and audio.

🚀 Get Started | 🎨 Expert Library | ⚙️ Core Primitives | 📖 Reference

✨ Key Features

🤖 Agent-Native Design — Standardized terminal scripts with clean JSON outputs for seamless integration into agentic workflows.
🧠 Expert Knowledge Layer — Domain-specific skills that bake in professional cinematography, atomic design, and branding logic.
⚡ Dynamic Schema-Driven — Powered by schema_data.json, scripts automatically resolve the latest models, endpoints, and valid parameters.
🖼️ Direct Media Display — Use the --view flag to automatically download and open generated media in your system viewer.
📁 Local File Support — Auto-upload images, videos, faces, and audio from your local machine to the CDN for processing.
🌈 100+ AI Models — One-click access to Midjourney v7, Flux Pro, Kling 3.0, Veo3, Suno V5, and more.

🏗️ Scalable Architecture

This repository uses a Core/Library split to ensure efficiency and high-signal discovery for LLMs:

⚙️ Core Primitives (`/core`)

The raw infrastructure for interacting with the muapi.ai engine.

core/media/ — High-fidelity Generation (Image, Video, Audio)
core/edit/ — Advanced Editing (Lipsync, Upscale, Effects)
core/platform/ — Setup & Polling Utilities

📚 Expert Library (`/library`)

High-value skills that translate creative intent into technical directives.

Cinema Director (/library/motion/cinema-director/) — Technical film direction & cinematography.
Nano-Banana (/library/visual/nano-banana/) — Reasoning-driven image generation (Gemini 3 Style).
UI Designer (/library/visual/ui-design/) — High-fidelity mobile/web mockups (Atomic Design).
Logo Creator (/library/visual/logo-creator/) — Minimalist vector branding (Geometric Primitives).

🧠 Self-Optimizing Skills

Every expert skill in the Library includes a Prompt Optimization Protocol. This allows LLMs (like Claude or Gemini) to use their own reasoning to expand simple user requests into high-fidelity technical briefs before calling the generation scripts.

🚀 Quick Start

1. Install the Skills

# Install all skills to your AI agent
npx skills add SamurAIGPT/Generative-Media-Skills --all

# Or install a specific skill
npx skills add SamurAIGPT/Generative-Media-Skills --skill muapi-media-generation

# List available skills
npx skills add SamurAIGPT/Generative-Media-Skills --list

# Install to specific agents
npx skills add SamurAIGPT/Generative-Media-Skills --all -a claude-code -a cursor

2. Configure Your API Key

# Get your key at https://muapi.ai/dashboard
bash core/platform/setup.sh --add-key "YOUR_MUAPI_KEY"

3. Run an Expert Skill with Direct Display

Generate a high-fidelity image and open it immediately using the --view flag.

# Use Nano-Banana reasoning to generate a 2K masterpiece from a local image
bash library/visual/nano-banana/scripts/generate-nano-art.sh \
  --file ./my-source-image.jpg \
  --subject "a glass hummingbird" \
  --style "macro photography" \
  --resolution "2k" \
  --view

4. Direct a Cinematic Scene

cd library/motion/cinema-director
# Create a 10-second 'epic' reveal without audio
bash scripts/generate-film.sh \
  --subject "a cybernetic dragon over Tokyo" \
  --intent "epic" \
  --model "kling-v3.0-pro" \
  --duration 10 \
  --no-audio \
  --view

📖 Schema Reference

This repository includes a streamlined schema_data.json that core scripts use at runtime to:

Validate Model IDs: Ensures the requested model exists.
Resolve Endpoints: Automatically maps model names to API endpoints.
Check Parameters: Validates supported aspect_ratio, resolution, and duration values.

🔧 Compatibility

Optimized for the next generation of AI development environments:

Claude Code: Direct terminal execution via tools.
Gemini CLI / Cursor / Windsurf: Seamless integration as local scripts.
MCP: Each skill is Model Context Protocol-ready for universal agent usage.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
core		core
library		library
media_outputs		media_outputs
.env		.env
LICENSE		LICENSE
README.md		README.md
schema_data.json		schema_data.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎭 Generative Media Skills for AI Agents

✨ Key Features

🏗️ Scalable Architecture

⚙️ Core Primitives (`/core`)

📚 Expert Library (`/library`)

🧠 Self-Optimizing Skills

🚀 Quick Start

1. Install the Skills

2. Configure Your API Key

3. Run an Expert Skill with Direct Display

4. Direct a Cinematic Scene

📖 Schema Reference

🔧 Compatibility

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors 9

Languages

License

SamurAIGPT/Generative-Media-Skills

Folders and files

Latest commit

History

Repository files navigation

🎭 Generative Media Skills for AI Agents

✨ Key Features

🏗️ Scalable Architecture

⚙️ Core Primitives (/core)

📚 Expert Library (/library)

🧠 Self-Optimizing Skills

🚀 Quick Start

1. Install the Skills

2. Configure Your API Key

3. Run an Expert Skill with Direct Display

4. Direct a Cinematic Scene

📖 Schema Reference

🔧 Compatibility

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors 9

Languages

⚙️ Core Primitives (`/core`)

📚 Expert Library (`/library`)

Packages