Introduction to the OpenTelemetry Sum Connector

When you have a piece of data tucked into your logs or span tags, how do you dig for that bounty of insight today? Commonly this sort of data will be numeric, like a purchase total or number of units. Wouldn’t it be nice to easily turn that data into a metric timeseries? The Sum Connector in OpenTelemetry does just that, allowing you to create sums from attributes attached to logs, spans, span events, and even data points!

In this blog post we’ll run down how to sum attribute values inside the OpenTelemetry Collector by going over a use case built on purchase data carried in span attributes.

To sum up (these puns won’t stop): in this case we’ll be using data from our trace span tag attributes to create metrics. From those metrics, we’ll be able to derive a number of useful business metrics we didn’t have access to previously. In other words, we’re uncovering buried treasure in our telemetry! Why would you re-ingest this business data into your observability system? Read on to find out!

How does the Sum Connector work?

Before we get to an example OpenTelemetry configuration, let's quickly go over what the Sum Connector does and how it should fit into your telemetry pipeline. At its most basic, this connector transforms telemetry from one type to another: for example, it turns spans, span events, logs, or data points into metrics.

This is done by leveraging the attributes attached to these types of telemetry. A source_attribute designates which attribute the numerical value for a new metric comes from. In our use case, we’ll be using source attributes called order.total and discount.total from our spans to denote the total purchase before discount and the discount applied to the purchase. Additionally, we’ll use the promo.code attribute to keep track of which discount was applied for any given time series by attaching it as an attribute to the new metrics derived from each source_attribute.

Figure 1-1. Our newly created metrics for purchase.order.total and purchase.discount.total, along with a dimension attribute for promo.code.
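
To make the idea concrete, here is a minimal sketch in plain Python of the summing behavior described above. It is not the connector's actual implementation: spans are modeled as simple dictionaries of attributes, the function name sum_source_attribute and the sample values are invented for illustration, and skipping spans that lack the source attribute is a simplifying assumption.

```python
from collections import defaultdict


def sum_source_attribute(spans, source_attribute, dimension_key, default_value="none"):
    """Sum a numeric span attribute, grouped by one dimension attribute."""
    totals = defaultdict(float)
    for attributes in spans:
        # Assumption: spans without the source attribute contribute nothing.
        if source_attribute not in attributes:
            continue
        # Fall back to the default when the dimension attribute is missing.
        dimension = attributes.get(dimension_key, default_value)
        totals[dimension] += float(attributes[source_attribute])
    return dict(totals)


# Hypothetical span attributes, invented for illustration.
spans = [
    {"order.total": 100.0, "discount.total": 10.0, "promo.code": "SPRING10"},
    {"order.total": 50.0},
    {"order.total": 25.0, "discount.total": 5.0, "promo.code": "SPRING10"},
    {"http.method": "GET"},
]

order_totals = sum_source_attribute(spans, "order.total", "promo.code")
discount_totals = sum_source_attribute(spans, "discount.total", "promo.code")
print(order_totals)
print(discount_totals)
```

Each key in the resulting dictionaries stands in for one metric time series, with promo.code as its dimension.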

Connecting the pieces with an example

With that quick recap in place, let's walk through an example OpenTelemetry configuration for what was described above:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "${SPLUNK_LISTEN_INTERFACE}:4317"
      http:
        endpoint: "${SPLUNK_LISTEN_INTERFACE}:4318"

connectors:
  sum/totals:
    spans:
      purchase.order.total:
        source_attribute: order.total
        conditions:
          - attributes["order.total"] != "NULL"
        attributes:
          - key: promo.code
            default_value: none

  sum/discounts:
    spans:
      purchase.discount.total:
        source_attribute: discount.total
        conditions:
          - attributes["discount.total"] != "NULL"
        attributes:
          - key: promo.code
            default_value: none

exporters:
  # Traces
  sapm:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    endpoint: "${SPLUNK_TRACE_URL}"
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    api_url: "${SPLUNK_API_URL}"
    ingest_url: "https://ingest.us1.signalfx.com"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [sapm, sum/totals, sum/discounts]
    metrics:
      receivers: [otlp, sum/totals, sum/discounts]
      exporters: [signalfx]

  1. Our receivers are set up to receive OTLP data

  2. We’ve created two connectors, sum/totals and sum/discounts. They work very similarly, so we’ll focus on sum/totals, which looks at our spans with the intent of creating a metric named purchase.order.total.

    • Whenever a span contains a span tag attribute that matches source_attribute, the connector pulls the numerical value from that attribute (e.g. order.total) and uses it as the metric value for the time series, as long as the value is not NULL
    • The attributes section allows us to pass along the span tag attribute for promo.code as an attribute of our metric time series. If there is no attribute to pull a value from, it will default to none
  3. Our traces pipeline receives the otlp data and sends it into a couple of different exporters

    • The sapm exporter sends the trace to our monitoring backend (Splunk Observability Cloud)
    • The sum/totals and sum/discounts exporters in that pipeline are actually connectors defined earlier in the configuration and perform as noted above
  4. Our metrics pipeline receives data from otlp and our connectors (sum/totals and sum/discounts) and sends it to the signalfx exporter
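
The wiring in steps 3 and 4 is the part that is easiest to get wrong: a connector needs to appear as an exporter in one pipeline and as a receiver in another. As a rough illustration, the sketch below checks that pairing on a trimmed-down dictionary version of the configuration above. The function name unpaired_connectors is hypothetical, and this is a simplified check rather than the Collector's own validation.

```python
def unpaired_connectors(config):
    """Return connectors that are not used as both an exporter and a receiver."""
    pipelines = config["service"]["pipelines"].values()
    exported = {name for p in pipelines for name in p.get("exporters", [])}
    received = {name for p in pipelines for name in p.get("receivers", [])}
    return sorted(
        name
        for name in config.get("connectors", {})
        if name not in exported or name not in received
    )


# Trimmed-down dictionary form of the example configuration.
config = {
    "connectors": {"sum/totals": {}, "sum/discounts": {}},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "exporters": ["sapm", "sum/totals", "sum/discounts"],
            },
            "metrics": {
                "receivers": ["otlp", "sum/totals", "sum/discounts"],
                "exporters": ["signalfx"],
            },
        }
    },
}

problems = unpaired_connectors(config)
print(problems)
```

An empty result means every connector is wired on both sides; any name that comes back is missing from either an exporters list or a receivers list.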

And with that, we’ll start to see our new metrics in Splunk Observability Cloud and can operationalize them for trending and reporting, or even business-level alerting.

We can now track total revenue against our application and infrastructure metrics and quickly correlate dips in revenue to incidents or changes in the environment, tying infrastructure and application performance data to critical business metrics. As a business-level example, if our friends in marketing wanted to know, in real time, what percentage of total sales uses a given promotion code, we can quickly chart that. With that same charting method we could create an alert to let us know when promoted or non-promoted sales reach an unusually low level. In Figure 1-2 you can see what this sort of calculation might look like in practice.

Figure 1-2. With our newly created metrics for purchase.order.total along with a dimension attribute for promo.code we can quickly see the percentage of purchases using any given promotion.
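
The arithmetic behind a chart like this is simple: divide each promotion code's summed order total by the grand total. Here is a small Python sketch of that calculation, assuming the per-code sums are already available as a dictionary. The function name promo_share and the sample figures are invented for illustration and do not come from a real system.

```python
def promo_share(order_totals):
    """Return each promo code's share of total sales as a percentage."""
    grand_total = sum(order_totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when there were no sales in the window.
        return {code: 0.0 for code in order_totals}
    return {
        code: 100.0 * total / grand_total
        for code, total in order_totals.items()
    }


# Hypothetical summed purchase.order.total values per promo.code.
order_totals = {"SPRING10": 125.0, "FREESHIP": 75.0, "none": 300.0}
shares = promo_share(order_totals)
print(shares)
```

The "none" entry represents sales with no promotion code, so the same calculation covers both promoted and non-promoted sales.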

Where else is summing attributes useful?

Ultimately, your uses for summing and the Sum Connector are specific to your business and architecture. Our example above is fairly intuitive and common, but there are countless use cases! Here are a couple of questions to get your brain going:

What sort of data do you have hiding in the telemetry you’re already using for monitoring? What sort of logging or tracing data would you like to more simply chart in various ways? Using the sum connector, you can uncover and operationalize entirely new observability data from traditional sources like applications and infrastructure, but also non-traditional sources like mainframes, business processes, and generally anything else that emits logging data.

Next steps

If you’re interested in uncovering the buried treasures in your existing observability data, you can leverage the OpenTelemetry Sum Connector along with the vast powers of Splunk Observability Cloud to dig deeper than ever before! Sign up to start a free trial of Splunk Observability Cloud and you’ll be uncovering sum-thing incredible in no time!

For additional help turning telemetry into count metrics, see the previous post on counting telemetry attributes with OpenTelemetry!

This blog post was authored by Jeremy Hicks, Staff Observability Strategist Engineer at Splunk with special thanks to: Curtis Robert and Sam Halpern
