0/ This is a thread about why tracing will gradually replace most logging, at least where distributed or cloud-native architectures are concerned. And we’re going to explore this through the lens of a relational data model.

It’s going to be fun!

Thread: 👇
1/ The best logging is always *structured* logging. That is, logging statements are most useful if they encode key:value pairs which can then be queried and *analyzed* in the aggregate.

(Even for plain, textual logs, NLP and stats can extract basic structure.)
2/ A structured log implicitly defines a *relational table*, with the keys for each attribute defining the columns, and the values for each log line defining rows in this (theoretical) table.

Like this:
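
A minimal sketch of that mapping, assuming JSON-formatted log lines. The field names (event, customer_id, latency_ms, status) are invented for illustration and do not come from any real system:

```python
import json

# Hypothetical structured log lines, one JSON object per line.
LOG_LINES = [
    '{"event": "checkout", "customer_id": "c-17", "latency_ms": 212, "status": 200}',
    '{"event": "checkout", "customer_id": "c-42", "latency_ms": 987, "status": 500}',
    '{"event": "checkout", "customer_id": "c-17", "latency_ms": 198, "status": 200}',
]


def to_table(lines):
    """Keys become columns; each log line becomes one row."""
    records = [json.loads(line) for line in lines]
    columns = sorted({key for record in records for key in record})
    rows = [tuple(record.get(col) for col in columns) for record in records]
    return columns, rows


columns, rows = to_table(LOG_LINES)
print(columns)
for row in rows:
    print(row)
```

A line that omits a key simply gets None in that column, which is roughly how a NULL would behave in the theoretical table.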
3/ And, naturally, there are a number of implicit columns in our table as well. Things like host, timestamp, etc:
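
As a rough illustration of those implicit columns, here is a hypothetical log_line helper (not part of any logging library) that stamps every record with a host and a timestamp the call site never mentions:

```python
import json
import socket
import time


def log_line(**fields):
    """Render one structured log line, adding the implicit columns."""
    record = {
        "timestamp": time.time(),      # implicit: when the line was emitted
        "host": socket.gethostname(),  # implicit: where it was emitted
        **fields,                      # explicit: key:value pairs from the call site
    }
    return json.dumps(record, sort_keys=True)


line = log_line(event="checkout", customer_id="c-17", latency_ms=212)
print(line)
```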
4/ Now, to be clear, we’re talking about the “abstract idea” of relational tables here, and not actually inserting every log line into MySQL or similar – that would be a disaster at scale. :)

Just think of each line of logging instrumentation as a “table schema.”
5/ Once we realize this, we can write queries with *most* SQL niceties (WHERE filters, GROUP BY aggregations, etc).
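
To make the WHERE and GROUP BY idea concrete, here is a toy sketch that loads a handful of invented log rows into an in-memory SQLite table. This is purely for illustration at tiny scale; the table name and columns are assumptions, and a real logging backend would not work this way:

```python
import sqlite3

# Hypothetical rows: (event, customer_id, latency_ms, status)
ROWS = [
    ("checkout", "c-17", 212, 200),
    ("checkout", "c-42", 987, 500),
    ("checkout", "c-17", 198, 200),
    ("search", "c-42", 45, 200),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (event TEXT, customer_id TEXT, latency_ms INTEGER, status INTEGER)"
)
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", ROWS)

# WHERE filter: only the failing requests.
errors = conn.execute(
    "SELECT customer_id, latency_ms FROM logs WHERE status >= 500"
).fetchall()

# GROUP BY aggregation: request count and mean latency per customer.
per_customer = conn.execute(
    "SELECT customer_id, COUNT(*), AVG(latency_ms) "
    "FROM logs GROUP BY customer_id ORDER BY customer_id"
).fetchall()

print(errors)
print(per_customer)
```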

But what about “JOIN”? How does *that* work in logging systems? The long answer won’t fit here.

The short answer? “Poorly.” Bummer. :-/
6/ Why is it a bummer? Well, because when we’re instrumenting a microservice, by definition *we only have access to data from that microservice!*

What about version numbers of peer services? Or request customer_ids? Or downstream feature flags? Surely those could be relevant…
7/ But relevant or not, that data lives *in other services.* Which means it’s not there to log. What’s an eng to do??

Faced with this conundrum, engineers stuck with logs will inevitably/sadly hack something together rather than address the underlying structural issue. (😭)
8/ E.g., have you ever seen a customer_id painstakingly propagated across function and *process* boundaries just so someone can add it to instrumentation?

That’s an error-prone *and* expensive way of implementing log JOINs via app code (rather than automatically via tracing).
9/ When we implement JOIN manually in this way, we are taking on *literally the hardest part of distributed tracing instrumentation* (namely, “context propagation”) and trying to manage it via one-off hacks. It doesn’t end well. (TL;DR “use @opentelemetry instead”)
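
For a sense of what “context propagation” involves, here is a deliberately tiny, hand-rolled sketch. Everything in it (the x-ctx- header prefix, start_request, inject, extract, downstream_handler) is hypothetical and is not OpenTelemetry's API; it only illustrates the shape of the problem a tracing library handles for you:

```python
import contextvars
import uuid

# Hypothetical per-request context, held in a context variable.
_context = contextvars.ContextVar("request_context", default=None)


def current():
    """Return the active request context, or an empty dict if none is set."""
    return _context.get() or {}


def start_request(customer_id):
    """Create a fresh context at the edge of the system."""
    ctx = {"trace_id": uuid.uuid4().hex, "customer_id": customer_id}
    _context.set(ctx)
    return ctx


def inject(headers):
    """Copy the current context into outgoing request headers."""
    for key, value in current().items():
        headers["x-ctx-" + key] = value
    return headers


def extract(headers):
    """Rebuild the context on the receiving side from incoming headers."""
    prefix = "x-ctx-"
    ctx = {k[len(prefix):]: v for k, v in headers.items() if k.startswith(prefix)}
    _context.set(ctx)
    return ctx


def downstream_handler(headers):
    """A peer service: it never takes customer_id as a parameter."""
    extract(headers)
    return {"service": "billing", **current()}


upstream = start_request("c-17")
outgoing = inject({})
log_record = downstream_handler(outgoing)
print(log_record)
```

Even this toy version has to get injection and extraction right at every hop, which is why doing it ad hoc for each new field tends to go badly.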
10/ So again, “that’s wasteful.” And ineffective.

The right way to solve this problem is to leverage distributed tracing to perform a much (much) more powerful JOIN.

Let’s imagine that your system looks like this:
11/ Now, when a truly modern observability solution “assembles a trace,” it’s *really* executing a JOIN across the entire *distributed* transaction, and thus populating a wider and more powerful table: one with columns from every Span that participates in the trace.

Like this:
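
A rough sketch of that JOIN, assuming spans arrive as plain dictionaries keyed by a shared trace_id. The service names, attributes, and the simplification that each service appears once per trace are all invented for illustration:

```python
from collections import defaultdict

# Hypothetical spans emitted by three services for two requests.
SPANS = [
    {"trace_id": "t1", "service": "frontend", "customer_id": "c-17", "latency_ms": 240},
    {"trace_id": "t1", "service": "checkout", "version": "v2.3", "latency_ms": 180},
    {"trace_id": "t1", "service": "payments", "feature_flag": "new_ledger", "latency_ms": 150},
    {"trace_id": "t2", "service": "frontend", "customer_id": "c-42", "latency_ms": 910},
    {"trace_id": "t2", "service": "checkout", "version": "v2.4", "latency_ms": 860},
]


def assemble_traces(spans):
    """JOIN spans on trace_id: one wide row per trace, columns prefixed by service."""
    rows = defaultdict(dict)
    for span in spans:
        row = rows[span["trace_id"]]
        for key, value in span.items():
            if key not in ("trace_id", "service"):
                row[span["service"] + "." + key] = value
    return dict(rows)


wide = assemble_traces(SPANS)
for trace_id, row in wide.items():
    print(trace_id, row)
```

The row for t1 now carries the frontend's customer_id next to the payments feature flag, which no single service could have logged on its own.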
12/ Now, when people think about tracing, they tend to think about this giant table “one trace (or row) at a time.”

Imagine restricting a logging system to display only one log-line at a time. This is just as bad… perhaps worse. And yet it passes for “tracing.” :-/
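
By contrast, here is a sketch of treating many traces as rows of one table and aggregating across them. The wide rows and column names are hypothetical, and mean_by is an illustrative helper rather than any product's query API:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical wide rows, one per assembled trace.
WIDE_ROWS = [
    {"trace_id": "t1", "frontend.latency_ms": 240, "checkout.version": "v2.3"},
    {"trace_id": "t2", "frontend.latency_ms": 910, "checkout.version": "v2.4"},
    {"trace_id": "t3", "frontend.latency_ms": 260, "checkout.version": "v2.3"},
    {"trace_id": "t4", "frontend.latency_ms": 870, "checkout.version": "v2.4"},
]


def mean_by(rows, group_col, value_col):
    """GROUP BY group_col and average value_col across all traces."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {key: mean(values) for key, values in groups.items()}


latency_by_version = mean_by(WIDE_ROWS, "checkout.version", "frontend.latency_ms")
print(latency_by_version)
```

Here frontend latency is grouped by a *downstream* service's version, a question that only makes sense once the columns from different services sit in the same row.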
13/ It’s really only in the past few years that observability technology has developed to the point that these massive, *distributed* tables can be hydrated both dynamically and in real time.
14/ And all of that data engineering is worth it! Because when the relational tables are as wide as your distributed system is deep, amazing things are possible – and I don’t see how logging will ever be able to catch up.
PS/ For example applications of these sorts of dynamic, relational tables, see any of the following (or play with http://lightstep.com/sandbox )

https://twitter.com/el_bhs/status/1364282343196827650
https://twitter.com/el_bhs/status/1227358990968877056
https://lightstep.com/blog/announcing-lightsteps-change-intelligence/
You can follow @el_bhs.