Skip to main content

AI Maintained Skills

Raysurfer provides AI maintained skills for vertical agents — retrieve proven code from prior runs instead of regenerating from scratch. Every snippet earns a reputation through execution results and community voting, so the best code rises to the top.
Without Raysurfer:  Task → Agent generates code from scratch → Runs it
With Raysurfer:     Task → Agent searches for proven code → Runs it (instant)
Queries don’t need to be identical — Raysurfer uses semantic matching to find related patterns. “Fetch user data from the API” and “Get customer info from the endpoint” return the same proven code.

The Ideal Pattern

The best way to understand Raysurfer’s value is through this pattern:
llm_input = params
out1 = query_customer_records_from_postgres_database(llm_input)
out2 = lookup_contact_details_in_local_csv_file(out1[2])
out3 = fetch_purchase_history_from_stripe_api(out2.files[-1])
out4 = calculate_loyalty_tier_from_transaction_count(out3.length)

return out4
This is a multi-step workflow where each step depends on the previous step’s output:
  1. First run: The agent generates this code, executes it, and Raysurfer caches the entire block
  2. Second run: Raysurfer retrieves the cached code and executes it directly — no LLM regeneration needed
Without Raysurfer, the agent iterates through the LLM multiple times: generating code, checking if it compiles, running it, handling errors, retrying. Each iteration costs tokens and time. With Raysurfer, the agent retrieves proven code and runs it immediately — eliminating the trial-and-error loop entirely.

The Loop

Every execution follows this cycle:
Search proven code → Execute → Score → Promote top patterns
Over time, agents build up a library of proven code. High-reputation snippets get prioritized, and the agent starts from high-signal code paths instead of improvising each time.

Under the Hood

The mechanism behind this is a semantic code cache. When your agent generates code that works, Raysurfer caches it with semantic embeddings. When a similar task comes up later — from you or from the community — the proven code is retrieved instead of regenerated. You don’t need to manage anything. Caching and retrieval happen automatically.

Why This Works

Production agents in 2026 are running longer and longer — multi-step workflows, complex tool chains, sprawling context windows. But here’s the thing: the median run has a typical shape. Despite all the complexity, most runs follow similar patterns. Raysurfer captures these common patterns as proven code that your agent reuses instead of regenerating. This gives you:
  • Consistent behavior — the agent starts from proven code instead of improvising each time
  • Better context management — intermediate outputs between API calls aren’t printed by default, keeping context clean
  • Faster execution — skip the generation, go straight to the result

Well-Documented Tool Schemas

For cached code to be reusable, the tool functions it calls must have well-documented input and output schemas. This lets agents extract specific properties from one tool’s response to pass into the next.
from dataclasses import dataclass

@dataclass
class SalesRecord:
    id: str
    amount: float
    quarter: str
    region: str

@dataclass
class Report:
    title: str
    records: list[SalesRecord]
    total: float

def generate_report(records: list[SalesRecord], template: str) -> Report:
    ...
When types are specific, the agent generates code that correctly accesses fields like record.amount instead of guessing with record["amount"] or record.get("amount", 0).

Quality Over Time

Raysurfer tracks which cached outputs work well:
  • Outputs that produce useful results get prioritized
  • Outputs that fail or produce unhelpful results get deprioritized
  • The system improves automatically
Reputation scores track whether outputs were genuinely helpful to users, not just whether they executed successfully. Code that technically runs but doesn’t solve the user’s problem gets deprioritized.

Verified Snippets

Every cached output is verified before reuse:
  • Only successful executions get cached
  • Failed outputs are automatically excluded
  • Your agent builds a library of proven code over time

Immutable Snippets

Code snippets in Raysurfer are immutable — they cannot be edited after creation. When you need a modified version of existing code, a new snippet is created instead. This means:
  • Edits create new snippets — modifying cached code doesn’t overwrite the original; it creates a new entry that surfaces for different types of requests
  • Original code stays intact — the original snippet remains available with its existing reputation scores
  • Version history is automatic — each iteration of a code pattern becomes its own snippet, and the best-performing version rises to the top through voting
You can view all your snippets in the dashboard.

Snippet Scoring

Every snippet has a reputation score that determines its priority in retrieval.

Automatic Voting

The system automatically votes in two scenarios:
  • On cache reuse — when cached code runs successfully and produces the expected output, it receives a thumbs up; failures receive a thumbs down
  • On upload — when new code is stored after a successful execution, Raysurfer’s AI evaluates and votes on it (controlled by use_raysurfer_ai_voting, on by default). You can also provide your own votes via user_vote / user_votes, which skips AI voting.
Both happen in the background after each execution — you don’t need to do anything. The SDK also captures execution logs (tool outputs) and sends them to the backend as context for voting.

Manual Voting

You can also manually adjust scores in the dashboard:
  • Add positive score to promote high-quality snippets
  • Add negative score to demote problematic code
  • Override automatic votes when you know better than the AI
The reputation score is calculated as:
reputation_score = thumbs_up / (thumbs_up + thumbs_down)
Snippets with higher reputation scores are prioritized during retrieval, ensuring your agent uses the most reliable code.

Per-Function Signals

Snippet-level votes tell you whether the overall script was useful. For iterative agents, you can also enable per-function reputation to surface diagnostics inside each function and track function-level telemetry. Use per_function_reputation / perFunctionReputation on upload and search. See Per-Function Reputation for copy-paste examples.

Public Snippets — The Community Registry

The skills library extends to a public registry of validated code patterns sourced from popular open-source GitHub repositories. These are community-contributed skills that any agent can search and use. Enable public snippets by setting public_snips in the SDK constructor:
from raysurfer import RaySurfer

client = RaySurfer(api_key="your_key", public_snips=True)
result = client.search(task="Parse CSV and generate chart")
Public snippets are read-only — they cannot be edited or deleted by users. They are curated by the Raysurfer team and ranked alongside your private snippets using the same voting system — the best answers rise to the top. Also available via the CLI (--public flag) and MCP server (public_snips parameter).