
Your AI answered. But did it miss the mark?
Measure task completion, agent workflow behavior, and output quality with evaluations connected to trace context so you can identify weak responses and improve AI performance.
Built for teams working in .NET, Python, and JavaScript. No credit card required. 5-minute setup. Free for small teams.
A response can look successful even if the agent skips a required step, uses the wrong tool, misses an escalation, or produces a weak final answer. Without evaluation signals, teams miss regressions in both output quality and workflow outcomes.
Connect evaluation scores to trace data so your teams can ship more confidently and optimize quickly.
"We use custom spans to track Tenant-level interactions, which helps us quickly understand issues, take corrective action, and continuously improve system performance."
Jeremy Schaab
Vice President Software Development, FYIsoft
Progress AI Observability fits into your existing agent workflows with lightweight SDKs for .NET, Python, and JavaScript. Start capturing execution data quickly, then use it to understand, debug, and improve agent behavior.
Instrument your AI agents with lightweight integrations that capture prompts, model calls, tool usage, retrieval steps, and state.
Observe agent behavior end to end using session- and trace-level views designed specifically for multi-step and multi-agent workflows.
Improve reliability, performance, and cost by debugging failures, running evaluations, and tuning orchestration and model choices using real production data.
Get Started in Minutes
// .NET - Install & Instrument
// 1. Install
dotnet add package Progress.Observability.Instrumentation
// 2. Instrument
chatClient = chatClient.AddObservability(options =>
{
    options.AppName = Environment.GetEnvironmentVariable("OBSERVABILITY_APP_NAME")!;
    options.ApiKey = Environment.GetEnvironmentVariable("OBSERVABILITY_API_KEY")!;
});
# Python - Install & Instrument
# 1. Install
pip install progress-observability
# 2. Instrument
import os
from progress_observability import Observability
Observability.instrument(
    app_name=os.getenv("OBSERVABILITY_APP_NAME"),
    api_key=os.getenv("OBSERVABILITY_API_KEY"),
)
// TypeScript - Install & Instrument
// 1. Install
npm install progress-observability
// 2. Instrument
import { Observability } from 'progress-observability';
Observability.instrument({
    appName: process.env.OBSERVABILITY_APP_NAME,
    apiKey: process.env.OBSERVABILITY_API_KEY
});
Works with Azure, OpenAI, Anthropic and other major LLM providers.
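All three snippets read OBSERVABILITY_APP_NAME and OBSERVABILITY_API_KEY from the environment. In Python, os.getenv returns None when a variable is unset, so a small guard can surface a missing setting before instrumentation runs. The sketch below is one way to do that; load_observability_settings is a hypothetical helper, not part of the SDK, and how the SDK itself handles missing values is not covered here.

```python
import os

# The two variable names used by the install snippets above.
REQUIRED_VARS = ("OBSERVABILITY_APP_NAME", "OBSERVABILITY_API_KEY")


def load_observability_settings(env=os.environ):
    """Hypothetical helper: return both settings or fail fast if one is missing.

    `env` is any mapping of variable names to values; it defaults to the
    process environment and can be replaced with a plain dict in tests.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "app_name": env["OBSERVABILITY_APP_NAME"],
        "api_key": env["OBSERVABILITY_API_KEY"],
    }
```

The returned keys match the keyword arguments in the Python snippet, so the result could be passed as Observability.instrument(**load_observability_settings()).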
Use evaluator models, built-in templates, and custom criteria to score outputs for relevance, helpfulness, groundedness, safety, task completion, and domain-specific quality.
Run evaluations on historical traces and new production activity to monitor quality trends, detect regressions, and understand how outputs change over time.
Create reusable datasets from production traces, weak outputs, and known failures, then compare prompts, models, tools, retrieval settings, and workflow changes before release.
Connect scores back to the traces, spans, prompts, retrieved context, tool calls, latency, token usage, and outputs that shaped the result.
Review verdicts, ratings, explanations, quality trends, hallucination risk, safety violations, escalation accuracy, and other reliability signals.
Measure task completion, correct tool use, successful handoffs, recovery from errors, multi-turn success, and robustness across real workflows.
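To make the workflow checks concrete, here is a minimal sketch of a rule-based evaluation over one trace. It is illustrative only: the Span shape, the evaluate_workflow function, and the step and tool names are assumptions made for this example and do not represent the product's data model or API.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified view of one recorded step in a trace."""
    name: str   # e.g. "search_kb"
    kind: str   # "step", "tool", or "model"


def evaluate_workflow(spans, required_steps, allowed_tools):
    """Check a trace for skipped required steps and unexpected tool calls."""
    seen = {span.name for span in spans}
    missing_steps = [step for step in required_steps if step not in seen]
    unexpected_tools = [
        span.name
        for span in spans
        if span.kind == "tool" and span.name not in allowed_tools
    ]
    return {
        "passed": not missing_steps and not unexpected_tools,
        "missing_steps": missing_steps,
        "unexpected_tools": unexpected_tools,
    }


# A run whose final answer might look fine, but which skipped the
# escalation step and called a tool outside the allowed set.
trace = [
    Span("classify_intent", "step"),
    Span("search_kb", "tool"),
    Span("send_email", "tool"),
]
result = evaluate_workflow(
    trace,
    required_steps=["classify_intent", "escalate_to_human"],
    allowed_tools={"search_kb", "create_ticket"},
)
print(result)
```

Deterministic checks like this complement evaluator-model scoring: they catch skipped steps and wrong tools, while judgments such as helpfulness or groundedness need a scored rubric.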
Evaluations show how outputs perform, so teams can improve what ships.
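As a rough illustration of comparing a change before release, the sketch below scores the same dataset cases under a baseline and a candidate configuration and flags cases that got worse. The compare_runs function, the 0-to-1 score scale, the case IDs, and the 0.05 tolerance are all assumptions for this example, not product behavior.

```python
from statistics import mean


def compare_runs(baseline, candidate, tolerance=0.05):
    """Compare per-case scores (case ID -> score in [0, 1]) from two runs.

    Only cases present in both runs are compared. A case counts as
    regressed when its candidate score drops more than `tolerance`
    below its baseline score.
    """
    shared = sorted(baseline.keys() & candidate.keys())
    if not shared:
        raise ValueError("The two runs have no cases in common")
    baseline_mean = mean(baseline[case] for case in shared)
    candidate_mean = mean(candidate[case] for case in shared)
    regressed = [
        case for case in shared if candidate[case] < baseline[case] - tolerance
    ]
    return {
        "baseline_mean": baseline_mean,
        "candidate_mean": candidate_mean,
        "delta": candidate_mean - baseline_mean,
        "regressed_cases": regressed,
    }


# Hypothetical scores for three dataset cases under two prompt versions.
baseline_scores = {"case-1": 0.9, "case-2": 0.8, "case-3": 0.5}
candidate_scores = {"case-1": 0.9, "case-2": 0.4, "case-3": 0.7}
report = compare_runs(baseline_scores, candidate_scores)
print(report["regressed_cases"], round(report["delta"], 3))
```

Reporting regressed cases alongside the mean matters because an average can hold steady, or even improve, while individual cases get noticeably worse.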
Trace and observe
Debug
Control costs
Evaluate and Improve
Connected Evidence
Progress AI Observability makes it easy to get started with flexible, affordable pricing that grows with your needs.
per month
Includes 10,000 units
Retention: 7 days
per month
Includes 200,000 units
Retention: 30 days
$8 USD per additional 100K units
per month
Includes 1,000,000 units
Retention: 60 days
$8 USD per additional 100K units
per month
Custom trace volume
Retention: Infinite
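For the two tiers that list $8 USD per additional 100K units, overage can be estimated with simple arithmetic. The sketch below assumes overage is billed per started block of 100,000 units; the page does not say whether partial blocks are prorated, so treat the rounding as an assumption. The function name overage_cost_usd is hypothetical, and the result excludes the base monthly price.

```python
import math


def overage_cost_usd(units_used, included_units, price_per_block=8, block_size=100_000):
    """Estimate monthly overage in USD, excluding the base plan price.

    Assumption: every started block of `block_size` units beyond the
    included allowance is billed in full (no proration).
    """
    extra_units = max(0, units_used - included_units)
    blocks = math.ceil(extra_units / block_size)
    return blocks * price_per_block


# 450,000 units on the 200,000-unit tier: 250,000 extra units -> 3 blocks.
print(overage_cost_usd(450_000, 200_000))      # 24
# 1,200,000 units on the 1,000,000-unit tier: 200,000 extra units -> 2 blocks.
print(overage_cost_usd(1_200_000, 1_000_000))  # 16
```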
The most common questions teams ask when evaluating AI observability for production agents.
Built for .NET, Python, and JavaScript.