Create and manage traces for LLM applications, capturing inputs, outputs, and intermediate steps
Log events, inputs, outputs, and metadata within a trace
Evaluate LLM outputs using various metrics like relevance, faithfulness, and context precision
Launch the Opik dashboard to visualize traces and evaluation results
Opik is a comprehensive platform for debugging, evaluating, and monitoring LLM applications, RAG systems, and agentic workflows. It combines detailed tracing, automated evaluations, and production-ready dashboards so you can understand how your AI systems behave, identify issues, and improve results.
To install Opik, you can use pip:
pip install opik
For JavaScript/TypeScript applications, you can install it via npm:
npm install @comet-ml/opik
After installation, initialize Opik in your application:

from opik import OpikTracer

# Initialize the tracer
tracer = OpikTracer()

# Create a trace for your LLM application; the context manager
# ends the trace automatically when the block exits
with tracer.trace("my_llm_app") as trace:
    # Your LLM application code here ("llm" stands in for your model client)
    response = llm.generate("What is the capital of France?")

    # Log the response
    trace.log(
        type="llm_response",
        content=response,
        metadata={"model": "gpt-4"}
    )
The same flow in JavaScript/TypeScript, where the trace must be ended explicitly:

import { OpikTracer } from '@comet-ml/opik';

// Initialize the tracer
const tracer = new OpikTracer();

// Create a trace for your LLM application
const trace = tracer.startTrace("my_llm_app");
try {
  // Your LLM application code here ("llm" stands in for your model client)
  const response = await llm.generate("What is the capital of France?");

  // Log the response
  trace.log({
    type: "llm_response",
    content: response,
    metadata: { model: "gpt-4" }
  });
} finally {
  // End the trace even if the call above throws
  await trace.end();
}
Opik provides detailed tracing capabilities that allow you to track the flow of information through your LLM applications. You can log inputs, outputs, and intermediate steps to understand how your system is processing data.
with tracer.trace("rag_system") as trace:
# Log user query
trace.log(type="user_query", content="What is quantum computing?")
# Log document retrieval
docs = retriever.get_relevant_documents("What is quantum computing?")
trace.log(type="retrieved_documents", content=docs)
# Log LLM response
response = llm.generate_with_context(docs, "What is quantum computing?")
trace.log(type="llm_response", content=response)
Opik includes automated evaluation tools to assess the quality of your LLM outputs:
from opik.evaluation import evaluate_rag

# Evaluate a RAG trace against three built-in metrics
results = evaluate_rag(
    trace_id="trace_123",
    metrics=["relevance", "faithfulness", "context_precision"]
)

print(f"Relevance score: {results['relevance']}")
print(f"Faithfulness score: {results['faithfulness']}")
Opik provides production-ready dashboards to visualize your system's performance. You can access these dashboards by running:
opik dashboard
This will start a local server where you can view your traces, evaluation results, and system metrics.
You can configure Opik using environment variables or by passing parameters to the OpikTracer constructor:
tracer = OpikTracer(
    api_key="your_api_key",   # Optional: for cloud storage
    project_name="my_llm_project",
    log_level="INFO"
)
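The environment-variable route keeps credentials out of code. A minimal sketch; the variable names below simply mirror the constructor parameters and are assumptions, so check them against your Opik version:

import os

# Assumed variable names mirroring the constructor parameters
os.environ["OPIK_API_KEY"] = "your_api_key"
os.environ["OPIK_PROJECT_NAME"] = "my_llm_project"

tracer = OpikTracer()  # picks up configuration from the environment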
For more advanced configuration, you can create a configuration file:
# opik.yaml
api_key: your_api_key
project_name: my_llm_project
log_level: INFO
storage:
  type: local   # or "cloud"
  path: ./opik_data
You can define custom evaluation metrics:
from opik.evaluation import register_metric

@register_metric
def my_custom_metric(trace, **kwargs):
    # Implement your custom metric logic; this toy example scores
    # non-empty responses as 1.0 (it assumes trace.output holds the logged response)
    score = 1.0 if trace.output else 0.0
    return score
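Once registered, the metric should be addressable by name in an evaluation run. A sketch, assuming registered names join the built-in metric list:

results = evaluate_rag(
    trace_id="trace_123",
    metrics=["relevance", "my_custom_metric"]
)
print(f"Custom score: {results['my_custom_metric']}")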
Opik integrates seamlessly with popular LLM frameworks like LangChain, LlamaIndex, and DSPy:
# LangChain integration
from langchain.chains import LLMChain
from opik.integrations.langchain import OpikCallbackHandler

# Attach the handler so chain and LLM calls are traced automatically;
# "llm" and "prompt" are your existing LangChain objects
callback = OpikCallbackHandler()
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[callback])
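With the callback attached, running the chain produces a trace without any explicit trace.log calls (chain.run is the classic LangChain invocation; newer versions prefer chain.invoke):

response = chain.run("What is the capital of France?")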
For more detailed information and advanced usage, refer to the official documentation.