Splunk Observability Update (Q1 2026): Deeper Insights for AI Agents and Digital Experiences

In our Q1 2026 update, we’re excited to highlight the latest innovations across Splunk Observability, with a strong focus on helping teams operate confidently in an increasingly AI-powered world.

As teams begin building applications powered by LLMs and AI agents, they are encountering a new set of observability challenges that go far beyond traditional applications. To address this, we’re introducing AI agent and infrastructure monitoring to help teams monitor the performance, quality, cost, and security risks of AI-powered applications and move from experimentation to production. These releases include:

AI is also transforming how teams troubleshoot and resolve incidents. With the new Troubleshooting Agent in Observability Cloud and support for the Splunk MCP Server, we’re modernizing what has traditionally been a manual and error-prone process. The Troubleshooting Agent automatically correlates signals across metrics, events, logs, and traces, surfaces likely root causes, and recommends next steps, helping teams resolve issues faster and with greater confidence. These capabilities include:

Finally, meaningful AI insights depend on unified observability data to provide full context. That is why we are announcing several new capabilities that deliver deeper, end-to-end visibility across the entire application stack, from business outcomes and digital experience to applications, infrastructure, and networks. Highlights include new Digital Experience Analytics capabilities that reveal how users truly interact with your applications, as well as an ITSI content pack for Cisco Data Center Networking that helps ensure network service health across network domains.

Observability for AI

AI is everywhere, reshaping software development, customer support, and business workflows. But it has also introduced “AI slop”—inauthentic, inaccurate, low-quality, and sometimes harmful outputs that are increasingly hard to detect at scale. As AI systems become more autonomous and complex, teams must ensure models, agents, and supporting infrastructure operate reliably, safely, and cost-effectively in line with business goals. This requires observability to evolve, providing deeper context and unified visibility that correlates business outcomes with performance, quality, security, and cost metrics across the AI stack.

Monitor the performance, quality, security, and cost of LLM and agentic applications with AI Agent Monitoring in Observability Cloud (GA)

AI Agent Monitoring in Observability Cloud ensures teams can pinpoint and correlate the root cause of unreliable or degraded AI agent and model performance. Teams can evaluate responses and track performance metrics such as latency and errors alongside quality and security metrics like hallucinations, bias, drift, and accuracy, as well as cost and token usage metrics.

By tracing and mapping dependencies to analyze LLM and agentic workflows, tool calls, and other service level objectives (SLOs), teams will be able to correlate and improve model quality and behavior with business impact to build trust and reliability.

Finally, Cisco AI Defense will be integrated with AI Agent Monitoring in Observability Cloud so that teams can comply with AI standards, and detect and mitigate AI (LLM, agents and tools) risks, misuse, drifts, leakage, and threats like personally identifiable information (PII) leakage, prompt injection, or policy violations in real time.

AI Agent Monitoring in Observability Cloud is now generally available (GA). It’s also available in AppDynamics for on-premises customers. Read the blog to learn more, get started today or explore the product with an interactive tour.

Monitor AI Infrastructure health, availability, and consumption in Observability Cloud, including Cisco AI PODs (GA)

Cisco AI PODs are pre-validated, full-stack data center infrastructure solutions designed to support the entire AI lifecycle, including training, fine-tuning, and inferencing workloads. AI Infrastructure Monitoring provides teams with data-dense dashboards and detectors for orchestration frameworks, agents, model providers, vector databases, GPUs, and more to surface trends in underlying AI infrastructure performance.

In November 2025, we added support to monitor Cisco AI PODs, as well as Nvidia NIMs, Milvus and Pinecone vector databases, LiteLLM proxy services, GCP VertexAI applications, and more. These views allow teams to track “tokenomics” metrics like time-to-first token and estimated token costs, and operational metrics such as throughput and GPU and memory utilization from AI apps and services to manage costs, optimize resources, and detect performance degradation. By visualizing the business impact of AI components, teams will be able to assess the utilization and efficiency of their Cisco AI PODs and other hosted AI infrastructure. AI Infrastructure Monitoring is generally available (GA). Set up AI Infrastructure Monitoring today and read this blog to learn more about AI Infrastructure Monitoring for Cisco AI PODs.

AI-powered Observability

Historically, incident response was reactive, forcing teams to troubleshoot across siloed systems and manually correlate overwhelming volumes of telemetry data. This made root cause analysis slow, labor-intensive, and hard to scale. As modern teams ship faster into increasingly complex, AI-powered environments, they need troubleshooting workflows that are faster, more accurate, and repeatable.

Investigate and troubleshoot faster with AI agents

AI troubleshooting agent and remediation plan in Observability Cloud

Rather than sifting through dashboards, searching for relevant application and infrastructure metrics or logs, or switching between multiple screens, tabs, and tools to piece together possible causes, the AI troubleshooting agent and remediation plan in Observability Cloud is a fellow SRE that pinpoints evidence-based root cause. When an alert triggers, the troubleshooting agent analyzes metrics, events, logs, and traces to generate suspected root causes, evidence-backed summaries of the immediate impact of the incident on affected services and user sessions, and a human-verified action plan for long-term resolution. This is presented in an interactive manner to ensure on-call engineers understand what happened and how to fix it—all in plain language.

Whether you’re a new engineer joining the team, a service owner who cares about business impact, or an on-call engineer, the interactive workflow will enable you to understand root causes and remediation in a timely manner.

AI troubleshooting agents in AppDynamics

When anomalies or health rule violations occur in an n-tier or hybrid environment, the AI troubleshooting agent for AppDynamics goes beyond simply flagging an issue. It explains what’s happening, why it’s happening, and how to fix it. Like the AI troubleshooting agent and remediation plan in Observability Cloud, teams can receive clear, concise root cause summaries in AppDynamics right where you need them, without bouncing between dashboards or guessing which metric matters most.

The result? Teams resolve issues faster and with greater accuracy, even if they aren’t system experts. It removes knowledge barriers, guides investigations with just one click, and presents actionable insights in the context of real-time data.

Read this blog to learn more.

Connect your AI Agents to Observability Cloud capabilities on Splunk MCP Server (Alpha)

AI agents now have one unified MCP server to access Splunk’s tool capabilities. Splunk MCP Server provides a secure and scalable interface for connecting AI assistants, agents, and other intelligent systems to Splunk data, and now specifically to Observability Cloud capabilities. This provides teams with the flexibility to seamlessly integrate their local AI agents, LLMs, and tools, and data with Observability Cloud data to build custom AI workflows and debug performance issues in production without leaving their environment. Observability Cloud capabilities on Splunk MCP Server are in Alpha. Learn more about Splunk MCP Server in our documentation.

Unified Observability Experience

Modern systems span applications, infrastructure, networks, and cloud services, but observability data often remains fragmented across tools and teams. These silos make troubleshooting slow and manual, forcing engineers to piece together partial signals without a complete picture. At the same time, effective AI-driven insights depend on unified, high-quality data with full context. A unified observability experience brings these signals together, enabling faster root cause analysis and powering AI that can deliver accurate, actionable guidance.

Unify user behavior and observability data with Digital Experience Analytics in Observability Cloud (GA)

To deliver great digital experiences, teams need visibility from every angle, combining traditional observability signals like latency, JavaScript errors, and load times with insights into user behavior, including feature adoption, friction points, and drivers of drop-offs or conversions.

Digital Experience Analytics in Splunk Observability Cloud provides deep, intuitive insight into user behavior across digital products. By unifying behavioral data with observability signals from real user monitoring (RUM) and application performance monitoring (APM), Splunk helps Product, UX, and Engineering teams to pinpoint friction, increase conversion rates, and prioritize engineering work more effectively. All of this is powered through a single, lightweight OpenTelemetry instrumentation agent for easy setup and minimal overhead.

Digital Experience Analytics will be generally available in Observability Cloud in March. Read the blog to learn more or explore the documentation to get started.

Ship safer applications faster with Secure Application in Splunk Observability Cloud (GA)

Secure Application in Splunk Observability Cloud helps engineering teams improve SLA compliance and reduce application security risks without the need to deploy and manage additional agents. By embedding runtime application security directly into observability data, teams can detect vulnerabilities within code in real time and map them directly to application services, including open-source frameworks and third-party libraries, to prioritize and quickly respond to critical risks. This means you’re always protected against the latest risks, without manually checking every package or waiting for the next scan.

Secure Application is now generally available for Observability Cloud.  Learn more about Secure Application in the Stop Chasing Ghosts: Prioritize Real Risks in the AI Era blog or explore the product with an interactive tour.

Ensure network service health across domains with the Splunk IT Service Intelligence Content Pack for Cisco Data Center Networking (GA)

The Splunk IT Service Intelligence Content Pack for Cisco Data Center Networking helps ITOps teams transition to cross-domain network service assurance. It enables centralized problem identification, isolation, and guided troubleshooting in a single place.

With out-of-the-box connectors and service templates that include prebuilt KPIs and entity filters, the content pack streamlines ingestion of Cisco Nexus Dashboard telemetry into IT Service Intelligence. This allows teams to quickly build service topologies and gain actionable insights into how network issues impact business services, helping ensure seamless operations.

By consolidating and correlating network alerts, the content pack simplifies incident management and reduces alert noise. It automatically normalizes networking data to establish ML-powered, dynamic network health thresholds that surface critical incidents and accelerate root cause analysis, empowering teams to deliver the always-on experiences your business demands.

The Splunk IT Service Intelligence Content Pack for Cisco Data Center Networking is generally available.

Gain full mobile visibility with new React Native and Flutter support in Observability Cloud (Alpha)

React Native and Flutter are two of the most widely used hybrid mobile development frameworks in the market. With this release, Real User Monitoring in Observability Cloud now provides first-class support for both, delivering the same level of visibility and depth of insights previously available only for native iOS and Android apps.

Engineering teams can now monitor crashes, slow load times, network failures, and other performance issues across their React Native and Flutter apps, while also capturing detailed user session data to accelerate troubleshooting and root cause analysis. This capability is currently in Alpha. Sign up for the Alpha program at the Splunk Voice of the Customer site.

Connect application performance to business impact with Business Insights in Observability Cloud (Alpha)

In a sea of alerts and metrics, understanding what truly matters to the business can feel impossible. Business Insights helps engineering and business teams focus on bottom-line impact by showing, in real time, how application performance affects critical business processes. With this release, Business Insights enables teams to visualize long-running processes, such as loan or credit approvals, and connect technical performance to business KPIs in a single, unified journey view. Going beyond traditional SLA metrics, it provides a clear view of revenue impact so teams can focus on the issues that matter most. Business Insights will be available in Alpha in March. Request access at the Splunk Voice of the Customer site.

Troubleshoot Kubernetes environments faster with native Kubernetes alering in AppDynamics (GA)

Previously, monitoring Kubernetes entities in AppDynamics required manual metric tracking and offered limited alerting, making issue detection slow and inefficient. AppDynamics now ingests metrics for key Kubernetes entities - clusters, namespaces, workloads, and pods - out of the box. Teams can define health rules directly on these entities and receive real-time alerts when violations occur, improving visibility into Kubernetes health. Event exploration and metric visualization accelerate troubleshooting, while upcoming dashboard and drill-down enhancements will enable deeper root cause analysis across entity hierarchies.

With native Kubernetes monitoring, AppDynamics delivers streamlined health monitoring, proactive alerting, and simplified troubleshooting in a single platform. For more information on Kubernetes Monitoring with AppDynamics, check the documentation.

Conclusion

As AI-powered applications and modern systems continue to grow in complexity, observability must deliver deeper context, intelligent automation, and unified visibility. This quarter's updates bring powerful new capabilities across AI monitoring, troubleshooting, digital experience, and full-stack observability, helping teams operate with greater confidence and speed.

Start a free trial or book a demo today to see how Splunk Observability can help you move faster, troubleshoot smarter, and deliver better digital experiences.

Many of the products and features described herein remain in varying stages of development and will be offered on a when-and-if-available basis. The delivery timeline of these products and features is subject to change at the sole discretion of Cisco, and Cisco will have no liability for delay in the delivery or failure to deliver any of the products or features set forth in this document.

Related Articles

Monitoring Amazon EC2 with Splunk Infrastructure Monitoring
Observability
7 Minute Read

Monitoring Amazon EC2 with Splunk Infrastructure Monitoring

Explore the top 12 challenges of monitoring Amazon EC2 when dealing with larger scale production deployments in part one of this two-part blog series.
Application Performance Redefined: Meet the New SignalFx Microservices APM
Observability
6 Minute Read

Application Performance Redefined: Meet the New SignalFx Microservices APM

Splunk's newest release of the SignalFx Microservices APM introduces innovations like Full Fidelity tracing, AI-Driven Directed Troubleshooting, and open framework instrumentation
Leverage contact center monitoring for world-class customer experiences
Observability
3 Minute Read

Leverage contact center monitoring for world-class customer experiences

When a customer support call goes well, life is good. With Cisco UCCE and AppDynamics, every support call can be a great customer experience.