By Peter Chittum · November 22, 2023 · 2 min read
Since the inception of computer science, software engineers have attempted to understand the state and health of software systems. In modern programming, this practice is called observability. Observability systems use instrumentation, code that reports data about the system, to produce telemetry data. This telemetry data is then aggregated and analyzed to understand how the system is operating. In this article you’ll learn about different types of instrumentation and observability data, and how you can use instrumentation in your Salesforce environment to emit observability data.
Generally speaking, software observability deals with the ability to understand the internal state of software systems. Observability systems rely on the timely collection of telemetry information from the system. This information (sometimes called “signals”) comes in the form of metrics, logs and traces. A metric is a point-in-time measurement of something that happened (a method started or ended, a transaction took n milliseconds). A trace tracks the flow of a transaction through the system over time and is often made up of many metrics. Logs report text-based information about what is happening.
Metrics and traces are structured data, emitted continuously regardless of the status of the system. Logs, on the other hand, are typically unstructured or semi-structured text-based data. Log messages are reported when some error or exception occurs and are local to the problem. Generally, logs are good at diagnosing problems a developer anticipated, while metrics and traces help diagnose problems that are unexpected, intermittent or not predicted by the developer.
Observability systems take these structured and unstructured data sources and turn them into usable information. For instance, a trace may contain a measurement of how long a piece of logic took to do its work. The observability tool could then aggregate that into the mean or median of all invocations of the same logic, as well as high and low values. Such data could then be used to assess the overall health of that piece of logic.
NOTE: The OpenTelemetry project exists to standardise the structure and APIs of telemetry data. This standardisation makes it easier for software vendors to build observability into their products, which can then be detected and used by any of a number of observability tools.
But what must you do in order to get a system to report out this data? This is where instrumentation comes into view.
The OpenTelemetry project defines instrumentation as the emitting of traces, metrics and logs from a system’s components. This happens in two ways:
Code-based instrumentation is exactly what it sounds like—a bit of code or logic that the developer manually adds to a piece of software for the purposes of producing some runtime execution data.
Agent-based (sometimes called non-code) instrumentation involves altering the software at compilation, deployment or runtime to emit signals. For instance, the Java Virtual Machine Tool Interface (JVMTI) supports bytecode instrumentation, a means of injecting instructions at runtime that send execution measurements.
As you might imagine, making additional code calls within an existing transaction can have an impact. Care must be taken to ensure that instrumentation incurs minimal performance overhead. Past instrumentation projects have failed where performance was too greatly impacted; a good example was Java prior to the launch of JVMTI in Java 5.
As non-code instrumentation does not exist in Apex for Salesforce customers, the remainder of this article addresses code-based instrumentation.
The native tool developers have to introspect live Apex execution is the debug log (sometimes called the Apex log). As such, the first and most obvious way that developers who work with Salesforce may have encountered instrumentation is the `System.debug()` method.
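As a minimal sketch, this kind of code-based instrumentation might look something like the following (the class, method and message text here are illustrative, not from any real project):

public with sharing class InvoiceService {
    public static void generateInvoices(List<Opportunity> closedOpps) {
        // Emit a signal when the logic starts.
        Long startedAt = System.currentTimeMillis();
        System.debug(LoggingLevel.INFO, 'generateInvoices started: ' + closedOpps.size() + ' records');

        // ... business logic ...

        // Emit a signal when the logic ends, including a simple duration metric.
        Long elapsed = System.currentTimeMillis() - startedAt;
        System.debug(LoggingLevel.INFO, 'generateInvoices finished in ' + elapsed + ' ms');
    }
}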
Because of some of the limitations of the debug log, such as lack of storage, the limited time window of a log session, and log truncation, developers often opt for a more feature-rich logging solution like Nebula Logger.
Nebula Logger is a tool that requires instrumentation to be added to an application before the developer begins receiving signals right in the Salesforce org. Customers adopting Nebula Logger need to carefully plan out each flow of logic where they want signals captured in the Nebula Logger custom objects.
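To give a flavour of what that instrumentation looks like, here is a sketch of a service class using Nebula Logger (`Logger.info()`, `Logger.error()` and `Logger.saveLog()` are part of Nebula Logger’s public API; the surrounding class and messages are illustrative):

public with sharing class ShipmentService {
    public static void dispatch(List<Id> shipmentIds) {
        Logger.info('dispatch started for ' + shipmentIds.size() + ' shipments');
        try {
            // ... business logic ...
        } catch (Exception e) {
            Logger.error('dispatch failed: ' + e.getMessage());
            throw e;
        } finally {
            // Persist the buffered entries to Nebula Logger's custom objects.
            Logger.saveLog();
        }
    }
}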
That’s right: believe it or not, if you have ever written a `System.debug()` statement, you have already done some form of instrumentation in your software project.
Processity built Audicity as a tool to create the most complete and accurate historical log of Salesforce data changes. To do this, Audicity needs to know when a transaction starts and when it ends. It will then follow the transaction flow through each logic branch and, where changes to records occur, capture them.
To flag these start and end points, we instrument Apex triggers, queueable classes and flows.
Here’s an example of what this looks like in its simplest form:
trigger CampaignTrigger on Campaign (
    before insert,
    before update,
    before delete,
    after insert,
    after update,
    after delete,
    after undelete
) {
    mantra.AudicityApex.track();
}
In the case that there is no existing Apex trigger on an object, a simple Apex trigger with a single call to `mantra.AudicityApex.track()` will ensure the instrumentation call happens in both the `before` and `after` trigger contexts.
Where there is already a trigger or trigger framework, the `mantra.AudicityApex.track()` call is made at separate points to ensure it is the first and last call of the transaction:
trigger PropertyTrigger on Property__c (
    before insert,
    before update,
    before delete,
    after insert,
    after update,
    after delete,
    after undelete
) {
    if (Trigger.isBefore) {
        mantra.AudicityApex.track();
    }

    PropertyTriggerHandler.handleTrigger(
        Trigger.new,
        Trigger.newMap,
        Trigger.oldMap,
        Trigger.operationType
    );

    if (Trigger.isAfter) {
        mantra.AudicityApex.track();
    }
}
Audicity has Flow instrumentation as well. The Audicity after-save Flow actions can also help ensure the details of a transaction are recorded correctly. This is particularly useful when it is unclear which Apex executes last in a transaction. Since after-save Flows always execute after Apex, they are an effective way to bookend the tail end of a transaction.
Finally, there is instrumentation for queueable Apex entry points. Here’s an example of a transaction with an asynchronous Apex call as visualized by the Audicity transaction viewer.
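In code, a queueable entry point might be instrumented along these lines (a hypothetical sketch: the class is illustrative, and the exact placement Audicity requires may differ):

public class EnrichAccountsJob implements Queueable {
    public void execute(QueueableContext context) {
        // Flag the start of the asynchronous transaction for tracking.
        mantra.AudicityApex.track();

        // ... business logic that changes records ...
    }
}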
Once you have a good set of telemetry metrics being emitted from your system, good observability is possible. A tool like Audicity can help with that. But observability is not a one-dimensional practice. Having a combination of structured data (traces and metrics) and unstructured data (logs) can paint a more complete picture of your system health.
Although Audicity is a great complement to your observability tools, observability is not our end goal. Processity’s goal is to build the industry standard for business process mining. For this, we required a more complete and accurate history of data changes. It turns out that, by applying the principles of observability, we have created just such a data source. It is a happy side effect that it can also be used for observability.
I lied. There is one more item to address around non-code instrumentation: platform events. Specifically, change data capture (CDC) events.
CDC does present an opportunity to regularly output data changes. You could call this a form of instrumentation, and it might have some limited applications. But if someone wanted to understand how the logic and implementation of the code were impacting an org, CDC would not help.
As far as observability goes, there are other obstacles to using CDC, such as additional license costs, incomplete support for standard objects, and the inability to discern whether a change was initiated by a user or an automation.
Instrumentation is a time-tested technique for obtaining critical operational data from software systems. Whilst adding code to track your code requires thoughtful planning and implementation, the small investment can yield significant benefits in the form of more accurate and more complete telemetry data.