Sophisticated Android apps are all but impossible to write without some level of multi-threading, and many threads frequently cooperate in any given process. Tools such as coroutines and thread pools make this threading easier to manage. BugSnag’s Real User Monitoring solution includes waterfall diagrams, which show the breakdown of an operation into its dependent operations. Multi-threading presents a challenge here because the dependencies can cross threads. We can’t simply track things with a ThreadLocal, as the program flow crosses threads and a thread that has been used will likely be reused for something else afterwards.
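Both failure modes can be reproduced with nothing but the JDK. The sketch below is only an illustration: `currentSpanName` and the two functions are hypothetical names, and a plain string stands in for a real span.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for "the span that is currently open on this thread".
val currentSpanName = ThreadLocal<String?>()

/** Sets the value on the calling thread, then reads it from a task on a pool thread. */
fun valueSeenOnWorkerThread(): String? {
    val executor = Executors.newSingleThreadExecutor()
    try {
        currentSpanName.set("Login")
        // The task runs on the executor's thread, which has its own, empty slot.
        return executor.submit(Callable { currentSpanName.get() }).get()
    } finally {
        currentSpanName.remove()
        executor.shutdown()
    }
}

/** Runs two unrelated tasks on the same pool thread; the first never cleans up. */
fun valueLeakedToNextTask(): String? {
    val executor = Executors.newSingleThreadExecutor()
    try {
        // First task sets a value and forgets to clear it.
        executor.submit(Runnable { currentSpanName.set("RenderRow") }).get()
        // Second task is scheduled on the same thread and inherits the stale value.
        return executor.submit(Callable { currentSpanName.get() }).get()
    } finally {
        executor.shutdown()
    }
}
```

The first function returns null because the value never followed the work onto the pool thread. The second returns "RenderRow" because the reused thread still holds what an earlier task left behind.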
Spans are the primitives of performance measurements. Fundamentally, a span is a tagged start and end time. Each span belongs to a trace and can optionally have a “parent” span. The two fields identifying these relationships (parent span id and trace id) are a span’s “context”.
These two relationships allow the spans to be presented as a waterfall diagram. For example, here is a simple waterfall diagram of a trivial Activity being created:
This example trace includes 4 spans:
The Activity - MainActivity span is the root for this trace and encompasses the entire activity load.
ActivityCreate, ActivityStart and ActivityResume are sub-spans that measure the time taken for the onCreate, onStart, and onResume invocations respectively.
The “context” of the subspans was inherited from the parent: the trace id is the same for the subspans and the parent, and the parent span id of each subspan is the span id of MainActivity.
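To make the relationship between context and waterfall concrete, here is a minimal sketch of how spans with a trace id and parent span id can be turned into an indented listing. `SpanSketch`, `renderWaterfall`, the ids and the timings are all invented for illustration; they are not BugSnag’s data model or real measurements.

```kotlin
// Hypothetical model: a tagged start and end time plus the two "context" fields.
data class SpanSketch(
    val name: String,
    val traceId: String,
    val spanId: String,
    val parentSpanId: String?,
    val startMs: Long,
    val endMs: Long
)

/** Renders one line per span, indented by depth, with children ordered by start time. */
fun renderWaterfall(spans: List<SpanSketch>): List<String> {
    val childrenByParent = spans.groupBy { it.parentSpanId }
    val lines = mutableListOf<String>()
    fun visit(span: SpanSketch, depth: Int) {
        lines.add("  ".repeat(depth) + "${span.name} (${span.endMs - span.startMs} ms)")
        childrenByParent[span.spanId].orEmpty()
            .sortedBy { it.startMs }
            .forEach { visit(it, depth + 1) }
    }
    // Root spans are the ones without a parent span id.
    childrenByParent[null].orEmpty().sortedBy { it.startMs }.forEach { visit(it, 0) }
    return lines
}

/** Illustrative data shaped like the Activity example; the numbers are made up. */
fun exampleActivityTrace(): List<SpanSketch> = listOf(
    SpanSketch("Activity - MainActivity", "trace-1", "span-1", null, 0, 120),
    SpanSketch("ActivityCreate", "trace-1", "span-2", "span-1", 5, 80),
    SpanSketch("ActivityStart", "trace-1", "span-3", "span-1", 82, 95),
    SpanSketch("ActivityResume", "trace-1", "span-4", "span-1", 97, 118)
)
```

Calling `renderWaterfall(exampleActivityTrace())` yields the root line followed by the three sub-spans indented beneath it, which is the same shape as the diagram.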
BugSnag’s Real User Monitoring solution automatically tracks the local context of a span so, as a developer, you don’t need to think about it too much. This is done by keeping a stack of contexts as a ThreadLocal. This works perfectly for single-threaded processes (such as creating an Activity), but anything non-trivial in Android requires multiple threads and, typically, coroutines. This introduces complexity because a trace might measure a single coroutine sequence that hops between several different threads, any of which may generate its own subspans.
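A thread-local stack of contexts can be pictured roughly as follows. This is a simplified sketch rather than BugSnag’s implementation: `ContextStackSketch` is a hypothetical name and span ids are reduced to plain strings.

```kotlin
// Hypothetical sketch: the "context" is just the id on top of a per-thread stack.
object ContextStackSketch {
    private val stack = object : ThreadLocal<MutableList<String>>() {
        override fun initialValue(): MutableList<String> = mutableListOf()
    }

    /** The span a new span would be parented to on this thread, or null if none is open. */
    val current: String?
        get() = stack.get().lastOrNull()

    /** Pushes a span for the duration of [block], then pops it again. */
    fun <T> withSpan(spanId: String, block: () -> T): T {
        val frames = stack.get()
        frames.add(spanId)
        try {
            return block()
        } finally {
            frames.removeAt(frames.lastIndex)
        }
    }
}

/** Records the current parent at each step of a single-threaded activity load. */
fun parentsObservedDuringActivityLoad(): List<String?> {
    val observed = mutableListOf<String?>()
    observed.add(ContextStackSketch.current)            // nothing open yet
    ContextStackSketch.withSpan("MainActivity") {
        observed.add(ContextStackSketch.current)        // the root span
        ContextStackSketch.withSpan("ActivityCreate") {
            observed.add(ContextStackSketch.current)    // the nested span
        }
        observed.add(ContextStackSketch.current)        // back to the root span
    }
    return observed
}
```

On one thread the stack always knows the right parent. The moment work moves to another thread, that thread starts with an empty stack, which is the gap the rest of this post addresses.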
We decided to take a bottom-up approach. Instead of trying to automate every possible thread model and toolset (Flow, Rx, Coroutines, Thread pools, etc.) used, we would instead start by building a way to manually control the context and then build out utility libraries on top of it.
Our SpanContext represents the context of a span: it identifies the span’s trace and its parent span (if any). The SpanContext.current static method returns the ThreadLocal SpanContext, which is the default for any new Span created using BugsnagPerformance.startSpan or measureSpan. The SpanContext object can be passed between threads safely, and a new Span can then be started “within” whatever context you like. A simple but contrived example:
    val parentContext = SpanContext.current
    // other things happen that may change SpanContext.current
    // the new span will be a child of the span that was open when parentContext was captured
    BugsnagPerformance.startSpan("child operation", SpanOptions.DEFAULTS.within(parentContext)).use { childSpan ->
        // code to be measured goes here
    }
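The reason a captured context can be handed to another thread is that it is just a value. The sketch below illustrates that idea with hypothetical types (`ContextSketch`, `ChildSketch`, `startWithin`); it does not use or describe the BugSnag API.

```kotlin
import kotlin.concurrent.thread

// Hypothetical immutable stand-in for a span's context; safe to share because it never changes.
data class ContextSketch(val traceId: String, val spanId: String)

// Hypothetical record of a started child span.
data class ChildSketch(
    val name: String,
    val traceId: String,
    val parentSpanId: String,
    val startedOn: String
)

/** Starts a child "within" an explicitly supplied context, ignoring any thread-local state. */
fun startWithin(name: String, parent: ContextSketch): ChildSketch =
    ChildSketch(name, parent.traceId, parent.spanId, Thread.currentThread().name)

/** Captures a context on the calling thread and uses it on a different thread. */
fun childStartedOnAnotherThread(): ChildSketch {
    val parentContext = ContextSketch(traceId = "trace-1", spanId = "span-login")
    var child: ChildSketch? = null
    // The captured value is handed to the worker; nothing thread-local is involved.
    thread(name = "worker") {
        child = startWithin("child operation", parentContext)
    }.join()
    return checkNotNull(child)
}
```

The child reports "worker" as the thread it started on, yet it carries the trace id and parent span id that were captured on the original thread.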
In theory, you can use this toolbox to construct span hierarchies any way you like. What you really want, though, is for the hierarchy to mirror the flow of the program, so here’s how we help with that.
With coroutines, flow can hop around threads, through arbitrarily complex nested layers of launch, async and withContext. We provide a BugsnagPerformanceScope class to keep your coroutine spans organized. When a class implements CoroutineScope by BugsnagPerformanceScope(), the scope will marshal the SpanContext across threads, and the Spans created from within the scope will automatically nest as you expect:
    class DashboardViewModel : ViewModel(), CoroutineScope by BugsnagPerformanceScope() {
        // ...
        override fun onCleared() {
            super.onCleared()
            cancel()
        }
        private fun attemptLogin(email: String, password: String) {
            val loginSpan = BugsnagPerformance.startSpan("Login")
            launch {
                loginSpan.use {
                    val loginResult = withContext(Dispatchers.IO) {
                        LoginApi.login(email, password)
                    }
                    // ...
                }
            }
        }
    }
The above code will automatically nest any spans logged by the LoginApi (such as the HTTP requests) as children of the Login span:
To allow fork/join coroutine behavior with async / awaitAll, we made SpanContext compatible with being added to a coroutine context. Just pass the current SpanContext manually when you “fork” the new tasks using async:
    val rows = data
        .map { row ->
            async(SpanContext.current + Dispatchers.Default) {
                renderRow(row)
            }
        }
        .awaitAll()
This will yield the following waterfall:
It’s worth marking the custom nested spans as not “first-class” (so that they don’t clutter your project dashboard):
    val NESTED_SPAN = SpanOptions.DEFAULTS
        .setFirstClass(false)
    private fun renderRow(row: RowInput): RowRenderResult {
        BugsnagPerformance.startSpan("RenderRow", NESTED_SPAN).use {
            // ...
        }
    }
Coroutines are the async / multithreaded tool of choice in Kotlin. But sometimes you have to use a thread pool instead, especially when using Java libraries that are not focused on Android. Don’t worry – we’ve got you covered!
We have a couple of ThreadPoolExecutor implementations (ContextAwareThreadPoolExecutor and ContextAwareScheduledThreadPoolExecutor) which automatically carry the context onto the tasks that are submitted for execution.
    val parentSpan = BugsnagPerformance.startSpan("parentSpan")
    val executor = ContextAwareThreadPoolExecutor(2, 4, 10, TimeUnit.SECONDS, ArrayBlockingQueue(10))
    executor.submit {
        val childSpanOnAnotherThread = BugsnagPerformance.startSpan("childSpanOnAnotherThread")
    }
    val childSpanOnOriginalThread = BugsnagPerformance.startSpan("childSpanOnOriginalThread")
In the above example, both childSpanOnAnotherThread and childSpanOnOriginalThread will be children of parentSpan.
You can also wrap any Runnable and Callable objects in the SpanContext you want them to run in:
    val parentSpan = BugsnagPerformance.startSpan("parentSpan")
    mainHandler.post(SpanContext.current.wrap(Runnable {
        // some work to do on the main thread
        val childSpan = BugsnagPerformance.startSpan("childSpan")
    }))
Here, childSpan will be a child of parentSpan, despite running on a different thread.
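Wrapping a task and using a context-aware executor can both be pictured as the same capture-and-restore step. The sketch below shows one way that step could look using only the JDK. Every name in it (`currentParentSketch`, `wrapWithContext`, `ContextCarryingExecutorSketch`) is hypothetical, and it is not a description of how BugSnag’s classes are implemented.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for the thread-local "current context": the id of the open parent span.
val currentParentSketch = ThreadLocal<String?>()

/** Returns a Runnable that installs [captured] while [task] runs, then restores the old value. */
fun wrapWithContext(captured: String?, task: Runnable): Runnable = Runnable {
    val previous = currentParentSketch.get()
    currentParentSketch.set(captured)
    try {
        task.run()
    } finally {
        currentParentSketch.set(previous)
    }
}

/** A pool that captures the submitter's context for every task it accepts. */
class ContextCarryingExecutorSketch(threads: Int) :
    ThreadPoolExecutor(threads, threads, 0L, TimeUnit.SECONDS, LinkedBlockingQueue<Runnable>()) {
    override fun execute(command: Runnable) {
        // execute() runs on the submitting thread, so this reads the submitter's context.
        super.execute(wrapWithContext(currentParentSketch.get(), command))
    }
}

/** Submits one task with a parent set and one without, and reports what each task saw. */
fun parentsSeenOnPoolThread(): Pair<String?, String?> {
    val executor = ContextCarryingExecutorSketch(1)
    try {
        currentParentSketch.set("parentSpan")
        val withParent = executor.submit(Callable { currentParentSketch.get() }).get()
        currentParentSketch.remove()
        val withoutParent = executor.submit(Callable { currentParentSketch.get() }).get()
        return withParent to withoutParent
    } finally {
        currentParentSketch.remove()
        executor.shutdown()
    }
}
```

The first task sees "parentSpan" even though it runs on a pool thread, and the second task sees null even though it reuses that same thread. Capturing at submission time and restoring afterwards is what keeps one task’s context from leaking into the next.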
By starting with a lightweight notion of “context” we can easily and safely allow the propagation of span parenthood through traces, providing an intuitive representation of an often-complex, multi-threaded program flow in our waterfall diagrams.
This, coupled with helpers for the various standard ways operations can flow across threads in an Android app, gives a good balance of flexibility and ease of use. For the full documentation on the ability to maintain context in Android traces, see our docs.