1.0.13.BUILD-SNAPSHOT

Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud.

Terminology

Spring Cloud Sleuth borrows Dapper’s terminology.

Span: The basic unit of work. For example, sending an RPC is a new span, as is sending a response to an RPC. Span’s are identified by a unique 64-bit ID for the span and another 64-bit ID for the trace the span is a part of. Spans also have other data, such as descriptions, timestamped events, key-value annotations (tags), the ID of the span that caused them, and process ID’s (normally IP address).

Spans are started and stopped, and they keep track of their timing information. Once you create a span, you must stop it at some point in the future.

Tip
The initial span that starts a trace is called a root span. The value of span id of that span is equal to trace id.

Trace: A set of spans forming a tree-like structure. For example, if you are running a distributed big-data store, a trace might be formed by a put request.

Annotation: is used to record existence of an event in time. Some of the core annotations used to define the start and stop of a request are:

  • cs - Client Sent - The client has made a request. This annotation depicts the start of the span.

  • sr - Server Received - The server side got the request and will start processing it. If one subtracts the cs timestamp from this timestamp one will receive the network latency.

  • ss - Server Sent - Annotated upon completion of request processing (when the response got sent back to the client). If one subtracts the sr timestamp from this timestamp one will receive the time needed by the server side to process the request.

  • cr - Client Received - Signifies the end of the span. The client has successfully received the response from the server side. If one subtracts the cs timestamp from this timestamp one will receive the whole time needed by the client to receive the response from the server.

Visualization of what Span and Trace will look in a system together with the Zipkin annotations:

Trace Info propagation

Each color of a note signifies a span (7 spans - from A to G). If you have such information in the note:

Trace Id = X
Span Id = D
Client Sent

That means that the current span has Trace-Id set to X, Span-Id set to D. It also has emitted Client Sent event.

This is how the visualization of the parent / child relationship of spans would look like:

Parent child relationship

Purpose

In the following sections the example from the image above will be taken into consideration.

Distributed tracing with Zipkin

Altogether there are 7 spans . If you go to traces in Zipkin you will see this number in the second trace:

Traces

However if you pick a particular trace then you will see 4 spans:

Traces Info propagation
Note
When picking a particular trace you will see merged spans. That means that if there were 2 spans sent to Zipkin with Server Received and Server Sent / Client Received and Client Sent annotations then they will presented as a single span.

Why is there a difference between the 7 and 4 spans in this case?

  • 2 spans come from http:/start span. It has the Server Received (SR) and Server Sent (SS) annotations.

  • 2 spans come from the RPC call from service1 to service2 to the http:/foo endpoint. It has the Client Sent (CS) and Client Received (CR) annotations on service1 side. It also has Server Received (SR) and Server Sent (SS) annotations on the service2 side. Physically there are 2 spans but they form 1 logical span related to an RPC call.

  • 2 spans come from the RPC call from service2 to service3 to the http:/bar endpoint. It has the Client Sent (CS) and Client Received (CR) annotations on service2 side. It also has Server Received (SR) and Server Sent (SS) annotations on the service3 side. Physically there are 2 spans but they form 1 logical span related to an RPC call.

  • 2 spans come from the RPC call from service2 to service4 to the http:/baz endpoint. It has the Client Sent (CS) and Client Received (CR) annotations on service2 side. It also has Server Received (SR) and Server Sent (SS) annotations on the service4 side. Physically there are 2 spans but they form 1 logical span related to an RPC call.

So if we count the physical spans we have 1 from http:/start, 2 from service1 calling service2, 2 form service2 calling service3 and 2 from service2 calling service4. Altogether 7 spans.

Logically we see the information of Total Spans: 4 because we have 1 span related to the incoming request to service1 and 3 spans related to RPC calls.

Visualizing errors

Zipkin allows you to visualize errors in your trace. When an exception was thrown and wasn’t caught then we’re setting proper tags on the span which Zipkin can properly colorize. You could see in the list of traces one trace that was in red color. That’s because there was an exception thrown.

If you click that trace then you’ll see a similar picture

Error Traces

Then if you click on one of the spans you’ll see the following

Error Traces Info propagation

As you can see you can easily see the reason for an error and the whole stacktrace related to it.

Live examples

Zipkin deployed on Pivotal Web Services
Click Pivotal Web Services icon to see it live!Click Pivotal Web Services icon to see it live!

The dependency graph in Zipkin would look like this:

Dependencies
Zipkin deployed on Pivotal Web Services
Click Pivotal Web Services icon to see it live!Click Pivotal Web Services icon to see it live!

Log correlation

When grepping the logs of those four applications by trace id equal to e.g. 2485ec27856c56f4 one would get the following:

service1.log:2016-02-26 11:15:47.561  INFO [service1,2485ec27856c56f4,2485ec27856c56f4,true] 68058 --- [nio-8081-exec-1] i.s.c.sleuth.docs.service1.Application   : Hello from service1. Calling service2
service2.log:2016-02-26 11:15:47.710  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Hello from service2. Calling service3 and then service4
service3.log:2016-02-26 11:15:47.895  INFO [service3,2485ec27856c56f4,1210be13194bfe5,true] 68060 --- [nio-8083-exec-1] i.s.c.sleuth.docs.service3.Application   : Hello from service3
service2.log:2016-02-26 11:15:47.924  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Got response from service3 [Hello from service3]
service4.log:2016-02-26 11:15:48.134  INFO [service4,2485ec27856c56f4,1b1845262ffba49d,true] 68061 --- [nio-8084-exec-1] i.s.c.sleuth.docs.service4.Application   : Hello from service4
service2.log:2016-02-26 11:15:48.156  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Got response from service4 [Hello from service4]
service1.log:2016-02-26 11:15:48.182  INFO [service1,2485ec27856c56f4,2485ec27856c56f4,true] 68058 --- [nio-8081-exec-1] i.s.c.sleuth.docs.service1.Application   : Got response from service2 [Hello from service2, response from service3 [Hello from service3] and from service4 [Hello from service4]]

If you’re using a log aggregating tool like Kibana, Splunk etc. you can order the events that took place. An example of Kibana would look like this:

Log correlation with Kibana

If you want to use Logstash here is the Grok pattern for Logstash:

filter {
       # pattern matching logback pattern
       grok {
              match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
       }
}
Note
If you want to use Grok together with the logs from Cloud Foundry you have to use this pattern:
filter {
       # pattern matching logback pattern
       grok {
              match => { "message" => "(?m)OUT\s+%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
       }
}
JSON Logback with Logstash

Often you do not want to store your logs in a text file but in a JSON file that Logstash can immediately pick. To do that you have to do the following (for readability we’re passing the dependencies in the groupId:artifactId:version notation.

Dependencies setup

  • Ensure that Logback is on the classpath (ch.qos.logback:logback-core)

  • Add Logstash Logback encode - example for version 4.6 : net.logstash.logback:logstash-logback-encoder:4.6

Logback setup

Below you can find an example of a Logback configuration (file named logback-spring.xml) that:

  • logs information from the application in a JSON format to a build/${spring.application.name}.json file

  • has commented out two additional appenders - console and standard log file

  • has the same logging pattern as the one presented in the previous section

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
	<include resource="org/springframework/boot/logging/logback/defaults.xml"/>
	​
	<springProperty scope="context" name="springAppName" source="spring.application.name"/>
	<!-- Example for logging into the build folder of your project -->
	<property name="LOG_FILE" value="${BUILD_FOLDER:-build}/${springAppName}"/>​

	<!-- You can override this to have a custom pattern -->
	<property name="CONSOLE_LOG_PATTERN"
			  value="%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>

	<!-- Appender to log to console -->
	<appender name="console" class="ch.qos.logback.core.ConsoleAppender">
		<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
			<!-- Minimum logging level to be presented in the console logs-->
			<level>DEBUG</level>
		</filter>
		<encoder>
			<pattern>${CONSOLE_LOG_PATTERN}</pattern>
			<charset>utf8</charset>
		</encoder>
	</appender>

	<!-- Appender to log to file -->​
	<appender name="flatfile" class="ch.qos.logback.core.rolling.RollingFileAppender">
		<file>${LOG_FILE}</file>
		<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
			<fileNamePattern>${LOG_FILE}.%d{yyyy-MM-dd}.gz</fileNamePattern>
			<maxHistory>7</maxHistory>
		</rollingPolicy>
		<encoder>
			<pattern>${CONSOLE_LOG_PATTERN}</pattern>
			<charset>utf8</charset>
		</encoder>
	</appender>
	​
	<!-- Appender to log to file in a JSON format -->
	<appender name="logstash" class="ch.qos.logback.core.rolling.RollingFileAppender">
		<file>${LOG_FILE}.json</file>
		<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
			<fileNamePattern>${LOG_FILE}.json.%d{yyyy-MM-dd}.gz</fileNamePattern>
			<maxHistory>7</maxHistory>
		</rollingPolicy>
		<encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
			<providers>
				<timestamp>
					<timeZone>UTC</timeZone>
				</timestamp>
				<pattern>
					<pattern>
						{
						"severity": "%level",
						"service": "${springAppName:-}",
						"trace": "%X{X-B3-TraceId:-}",
						"span": "%X{X-B3-SpanId:-}",
						"parent": "%X{X-B3-ParentSpanId:-}",
						"exportable": "%X{X-Span-Export:-}",
						"pid": "${PID:-}",
						"thread": "%thread",
						"class": "%logger{40}",
						"rest": "%message"
						}
					</pattern>
				</pattern>
			</providers>
		</encoder>
	</appender>
	​
	<root level="INFO">
		<appender-ref ref="console"/>
		<!-- uncomment this to have also JSON logs -->
		<!--<appender-ref ref="logstash"/>-->
		<!--<appender-ref ref="flatfile"/>-->
	</root>
</configuration>
Note
If you’re using a custom logback-spring.xml then you have to pass the spring.application.name in bootstrap instead of application property file. Otherwise your custom logback file won’t read the property properly.

Adding to the project

Only Sleuth (log correlation)

If you want to profit only from Spring Cloud Sleuth without the Zipkin integration just add the spring-cloud-starter-sleuth module to your project.

Maven
<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-starter-sleuth

Gradle
dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies { (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth"
}
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-starter-sleuth

Sleuth with Zipkin via HTTP

If you want both Sleuth and Zipkin just add the spring-cloud-starter-zipkin dependency.

Maven
<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-zipkin</artifactId>
   </dependency>
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-starter-zipkin

Gradle
dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies { (2)
    compile "org.springframework.cloud:spring-cloud-starter-zipkin"
}
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-starter-zipkin

Sleuth with Zipkin via Spring Cloud Stream

If you want both Sleuth and Zipkin just add the spring-cloud-sleuth-stream dependency.

Maven
<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-sleuth-stream</artifactId>
   </dependency>
   <dependency> (3)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>
   <!-- EXAMPLE FOR RABBIT BINDING -->
   <dependency> (4)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-stream-binder-rabbit</artifactId>
   </dependency>
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-sleuth-stream

  3. Add the dependency to spring-cloud-starter-sleuth - that way all dependant dependencies will be downloaded

  4. Add a binder (e.g. Rabbit binder) to tell Spring Cloud Stream what it should bind to

Gradle
dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies {
    compile "org.springframework.cloud:spring-cloud-sleuth-stream" (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth" (3)
    // Example for Rabbit binding
    compile "org.springframework.cloud:spring-cloud-stream-binder-rabbit" (4)
}
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-sleuth-stream

  3. Add the dependency to spring-cloud-starter-sleuth - that way all dependant dependencies will be downloaded

  4. Add a binder (e.g. Rabbit binder) to tell Spring Cloud Stream what it should bind to

Spring Cloud Sleuth Stream Zipkin Collector

If you want to start a Spring Cloud Sleuth Stream Zipkin collector just add the spring-cloud-sleuth-zipkin-stream dependency

Maven
<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-sleuth-zipkin-stream</artifactId>
   </dependency>
   <dependency> (3)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>
   <!-- EXAMPLE FOR RABBIT BINDING -->
   <dependency> (4)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-stream-binder-rabbit</artifactId>
   </dependency>
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-sleuth-zipkin-stream

  3. Add the dependency to spring-cloud-starter-sleuth - that way all dependant dependencies will be downloaded

  4. Add a binder (e.g. Rabbit binder) to tell Spring Cloud Stream what it should bind to

Gradle
dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies {
    compile "org.springframework.cloud:spring-cloud-sleuth-zipkin-stream" (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth" (3)
    // Example for Rabbit binding
    compile "org.springframework.cloud:spring-cloud-stream-binder-rabbit" (4)
}
  1. In order not to pick versions by yourself it’s much better if you add the dependency management via the Spring BOM

  2. Add the dependency to spring-cloud-sleuth-zipkin-stream

  3. Add the dependency to spring-cloud-starter-sleuth - that way all dependant dependencies will be downloaded

  4. Add a binder (e.g. Rabbit binder) to tell Spring Cloud Stream what it should bind to

and then just annotate your main class with @EnableZipkinStreamServer annotation:

package example;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.sleuth.zipkin.stream.EnableZipkinStreamServer;

@SpringBootApplication
@EnableZipkinStreamServer
public class ZipkinStreamServerApplication {

	public static void main(String[] args) throws Exception {
		SpringApplication.run(ZipkinStreamServerApplication.class, args);
	}

}

Additional resources

Marcin Grzejszczak talking about Spring Cloud Sleuth and Zipkin

Features

  • Adds trace and span ids to the Slf4J MDC, so you can extract all the logs from a given trace or span in a log aggregator. Example logs:

    2016-02-02 15:30:57.902  INFO [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
    2016-02-02 15:30:58.372 ERROR [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
    2016-02-02 15:31:01.936  INFO [bar,46ab0d418373cbc9,46ab0d418373cbc9,false] 23030 --- [nio-8081-exec-4] ...

    notice the [appname,traceId,spanId,exportable] entries from the MDC:

    • spanId - the id of a specific operation that took place

    • appname - the name of the application that logged the span

    • traceId - the id of the latency graph that contains the span

    • exportable - whether the log should be exported to Zipkin or not. When would you like the span not to be exportable? In the case in which you want to wrap some operation in a Span and have it written to the logs only.

  • Provides an abstraction over common distributed tracing data models: traces, spans (forming a DAG), annotations, key-value annotations. Loosely based on HTrace, but Zipkin (Dapper) compatible.

  • Sleuth records timing information to aid in latency analysis. Using sleuth, you can pinpoint causes of latency in your applications. Sleuth is written to not log too much, and to not cause your production application to crash.

    • propagates structural data about your call-graph in-band, and the rest out-of-band.

    • includes opinionated instrumentation of layers such as HTTP

    • includes sampling policy to manage volume

    • can report to a Zipkin system for query and visualization

  • Instruments common ingress and egress points from Spring applications (servlet filter, async endpoints, rest template, scheduled actions, message channels, zuul filters, feign client).

  • Sleuth includes default logic to join a trace across http or messaging boundaries. For example, http propagation works via Zipkin-compatible request headers. This propagation logic is defined and customized via SpanInjector and SpanExtractor implementations.

  • Provides simple metrics of accepted / dropped spans.

  • If spring-cloud-sleuth-zipkin then the app will generate and collect Zipkin-compatible traces. By default it sends them via HTTP to a Zipkin server on localhost (port 9411). Configure the location of the service using spring.zipkin.baseUrl.

  • If spring-cloud-sleuth-stream then the app will generate and collect traces via Spring Cloud Stream. Your app automatically becomes a producer of tracer messages that are sent over your broker of choice (e.g. RabbitMQ, Apache Kafka, Redis).

Important
If using Zipkin or Stream, configure the percentage of spans exported using spring.sleuth.sampler.percentage (default 0.1, i.e. 10%). Otherwise you might think that Sleuth is not working cause it’s omitting some spans.
Note
the SLF4J MDC is always set and logback users will immediately see the trace and span ids in logs per the example above. Other logging systems have to configure their own formatter to get the same result. The default is logging.pattern.level set to %clr(%5p) %clr([${spring.application.name:},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow} (this is a Spring Boot feature for logback users). This means that if you’re not using SLF4J this pattern WILL NOT be automatically applied.

Sampling

In distributed tracing the data volumes can be very high so sampling can be important (you usually don’t need to export all spans to get a good picture of what is happening). Spring Cloud Sleuth has a Sampler strategy that you can implement to take control of the sampling algorithm. Samplers do not stop span (correlation) ids from being generated, but they do prevent the tags and events being attached and exported. By default you get a strategy that continues to trace if a span is already active, but new ones are always marked as non-exportable. If all your apps run with this sampler you will see traces in logs, but not in any remote store. For testing the default is often enough, and it probably is all you need if you are only using the logs (e.g. with an ELK aggregator). If you are exporting span data to Zipkin or Spring Cloud Stream, there is also an AlwaysSampler that exports everything and a PercentageBasedSampler that samples a fixed fraction of spans.

Note
the PercentageBasedSampler is the default if you are using spring-cloud-sleuth-zipkin or spring-cloud-sleuth-stream. You can configure the exports using spring.sleuth.sampler.percentage. The passed value needs to be a double from 0.0 to 1.0 so it’s not a percentage. For backwards compatibility reasons we’re not changing the property name.

A sampler can be installed just by creating a bean definition, e.g:

@Bean
public Sampler defaultSampler() {
	return new AlwaysSampler();
}

Instrumentation

Spring Cloud Sleuth instruments all your Spring application automatically, so you shouldn’t have to do anything to activate it. The instrumentation is added using a variety of technologies according to the stack that is available, e.g. for a servlet web application we use a Filter, and for Spring Integration we use ChannelInterceptors.

You can customize the keys used in span tags. To limit the volume of span data, by default an HTTP request will be tagged only with a handful of metadata like the status code, host and URL. You can add request headers by configuring spring.sleuth.keys.http.headers (a list of header names).

Note
Remember that tags are only collected and exported if there is a Sampler that allows it (by default there is not, so there is no danger of accidentally collecting too much data without configuring something).
Note
Currently the instrumentation in Spring Cloud Sleuth is eager - it means that we’re actively trying to pass the tracing context between threads. Also timing events are captured even when sleuth isn’t exporting data to a tracing system. This approach may change in the future towards being lazy on this matter.

Span lifecycle

You can do the following operations on the Span by means of org.springframework.cloud.sleuth.Tracer interface:

  • start - when you start a span its name is assigned and start timestamp is recorded.

  • close - the span gets finished (the end time of the span is recorded) and if the span is exportable then it will be eligible for collection to Zipkin. The span is also removed from the current thread.

  • continue - a new instance of span will be created whereas it will be a copy of the one that it continues.

  • detach - the span doesn’t get stopped or closed. It only gets removed from the current thread.

  • create with explicit parent - you can create a new span and set an explicit parent to it

Tip
Spring creates the instance of Tracer for you. In order to use it all you need is to just autowire it.

Creating and closing spans

You can manually create spans by using the Tracer interface.

// Start a span. If there was a span present in this thread it will become
// the `newSpan`'s parent.
Span newSpan = this.tracer.createSpan("calculateTax");
try {
	// ...
	// You can tag a span
	this.tracer.addTag("taxValue", taxValue);
	// ...
	// You can log an event on a span
	newSpan.logEvent("taxCalculated");
} finally {
	// Once done remember to close the span. This will allow collecting
	// the span to send it to Zipkin
	this.tracer.close(newSpan);
}

In this example we could see how to create a new instance of span. Assuming that there already was a span present in this thread then it would become the parent of that span.

Important
Always clean after you create a span! Don’t forget to close a span if you want to send it to Zipkin.

Continuing spans

Sometimes you don’t want to create a new span but you want to continue one. Example of such a situation might be (of course it all depends on the use-case):

  • AOP - If there was already a span created before an aspect was reached then you might not want to create a new span.

  • Hystrix - executing a Hystrix command is most likely a logical part of the current processing. It’s in fact only a technical implementation detail that you wouldn’t necessarily want to reflect in tracing as a separate being.

The continued instance of span is equal to the one that it continues:

Span continuedSpan = this.tracer.continueSpan(spanToContinue);
assertThat(continuedSpan).isEqualTo(spanToContinue);

To continue a span you can use the Tracer interface.

// let's assume that we're in a thread Y and we've received
// the `initialSpan` from thread X
Span continuedSpan = this.tracer.continueSpan(initialSpan);
try {
	// ...
	// You can tag a span
	this.tracer.addTag("taxValue", taxValue);
	// ...
	// You can log an event on a span
	continuedSpan.logEvent("taxCalculated");
} finally {
	// Once done remember to detach the span. That way you'll
	// safely remove it from the current thread without closing it
	this.tracer.detach(continuedSpan);
}
Important
Always clean after you create a span! Don’t forget to detach a span if some work was done started in one thread (e.g. thread X) and it’s waiting for other threads (e.g. Y, Z) to finish. Then the spans in the threads Y, Z should be detached at the end of their work. When the results are collected the span in thread X should be closed.

Creating spans with an explicit parent

There is a possibility that you want to start a new span and provide an explicit parent of that span. Let’s assume that the parent of a span is in one thread and you want to start a new span in another thread. The startSpan method of the Tracer interface is the method you are looking for.

// let's assume that we're in a thread Y and we've received
// the `initialSpan` from thread X. `initialSpan` will be the parent
// of the `newSpan`
Span newSpan = this.tracer.createSpan("calculateCommission", initialSpan);
try {
	// ...
	// You can tag a span
	this.tracer.addTag("commissionValue", commissionValue);
	// ...
	// You can log an event on a span
	newSpan.logEvent("commissionCalculated");
} finally {
	// Once done remember to close the span. This will allow collecting
	// the span to send it to Zipkin. The tags and events set on the
	// newSpan will not be present on the parent
	this.tracer.close(newSpan);
}
Important
After having created such a span remember to close it. Otherwise you will see a lot of warnings in your logs related to the fact that you have a span present in the current thread other than the one you’re trying to close. What’s worse your spans won’t get closed properly thus will not get collected to Zipkin.

Naming spans

Picking a span name is not a trivial task. Span name should depict an operation name. The name should be low cardinality (e.g. not include identifiers).

Since there is a lot of instrumentation going on some of the span names will be artificial like:

  • controller-method-name when received by a Controller with a method name conrollerMethodName

  • async for asynchronous operations done via wrapped Callable and Runnable.

  • @Scheduled annotated methods will return the simple name of the class.

Fortunately, for the asynchronous processing you can provide explicit naming.

@SpanName annotation

You can do name the span explicitly via the @SpanName annotation.

@SpanName("calculateTax")
class TaxCountingRunnable implements Runnable {

	@Override public void run() {
		// perform logic
	}
}

In this case, when processed in the following manner:

Runnable runnable = new TraceRunnable(tracer, spanNamer, new TaxCountingRunnable());
Future<?> future = executorService.submit(runnable);
// ... some additional logic ...
future.get();

The span will be named calculateTax.

toString() method

It’s pretty rare to create separate classes for Runnable or Callable. Typically one creates an anonymous instance of those classes. You can’t annotate such classes thus to override that, if there is no @SpanName annotation present, we’re checking if the class has a custom implementation of the toString() method.

So executing such code:

Runnable runnable = new TraceRunnable(tracer, spanNamer, new Runnable() {
	@Override public void run() {
		// perform logic
	}

	@Override public String toString() {
		return "calculateTax";
	}
});
Future<?> future = executorService.submit(runnable);
// ... some additional logic ...
future.get();

will lead in creating a span named calculateTax.

Customizations

Thanks to the SpanInjector and SpanExtractor you can customize the way spans are created and propagated.

There are currently two built-in ways to pass tracing information between processes:

  • via Spring Integration

  • via HTTP

Span ids are extracted from Zipkin-compatible (B3) headers (either Message or HTTP headers), to start or join an existing trace. Trace information is injected into any outbound requests so the next hop can extract them.

Spring Integration

For Spring Integration these are the beans responsible for creation of a Span from a Message and filling in the MessageBuilder with tracing information.

@Bean
public SpanExtractor<Message> messagingSpanExtractor() {
    ...
}

@Bean
public SpanInjector<MessageBuilder> messagingSpanInjector() {
    ...
}

You can override them by providing your own implementation and by adding a @Primary annotation to your bean definition.

HTTP

For HTTP these are the beans responsible for creation of a Span from a HttpServletRequest.

@Bean
public SpanExtractor<HttpServletRequest> httpServletRequestSpanExtractor() {
    ...
}

You can override them by providing your own implementation and by adding a @Primary annotation to your bean definition.

Example

Let’s assume that instead of the standard Zipkin compatible tracing HTTP header names you have

  • for trace id - correlationId

  • for span id - mySpanId

This is a an example of a SpanExtractor

static class CustomHttpServletRequestSpanExtractor
		implements SpanExtractor<HttpServletRequest> {

	@Override
	public Span joinTrace(HttpServletRequest carrier) {
		long traceId = Span.hexToId(carrier.getHeader("correlationId"));
		long spanId = Span.hexToId(carrier.getHeader("mySpanId"));
		// extract all necessary headers
		Span.SpanBuilder builder = Span.builder().traceId(traceId).spanId(spanId);
		// build rest of the Span
		return builder.build();
	}
}

And you could register it like this:

@Bean
@Primary
SpanExtractor<HttpServletRequest> customHttpServletRequestSpanExtractor() {
	return new CustomHttpServletRequestSpanExtractor();
}

Spring Cloud Sleuth does not add trace/span related headers to the Http Response for security reasons. If you need the headers then a custom SpanInjector that injects the headers into the Http Response and a Servlet filter which makes use of this can be added the following way:

static class CustomHttpServletResponseSpanInjector
		implements SpanInjector<HttpServletResponse> {

	@Override
	public void inject(Span span, HttpServletResponse carrier) {
		carrier.addHeader(Span.TRACE_ID_NAME, span.traceIdString());
		carrier.addHeader(Span.SPAN_ID_NAME, Span.idToHex(span.getSpanId()));
	}
}

static class HttpResponseInjectingTraceFilter extends GenericFilterBean {

	private final Tracer tracer;
	private final SpanInjector<HttpServletResponse> spanInjector;

	public HttpResponseInjectingTraceFilter(Tracer tracer, SpanInjector<HttpServletResponse> spanInjector) {
		this.tracer = tracer;
		this.spanInjector = spanInjector;
	}

	@Override
	public void doFilter(ServletRequest request, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
		HttpServletResponse response = (HttpServletResponse) servletResponse;
		Span currentSpan = this.tracer.getCurrentSpan();
		this.spanInjector.inject(currentSpan, response);
		filterChain.doFilter(request, response);
	}
}

And you could register them like this:

@Bean
SpanInjector<HttpServletResponse> customHttpServletResponseSpanInjector() {
	return new CustomHttpServletResponseSpanInjector();
}

@Bean
HttpResponseInjectingTraceFilter responseInjectingTraceFilter(Tracer tracer) {
	return new HttpResponseInjectingTraceFilter(tracer, customHttpServletResponseSpanInjector());
}

Custom SA tag in Zipkin

Sometimes you want to create a manual Span that will wrap a call to an external service which is not instrumented. What you can do is to create a span with the peer.service tag that will contain a value of the service that you want to call. Below you can see an example of a call to Redis that is wrapped in such a span.

org.springframework.cloud.sleuth.Span newSpan = tracer.createSpan("redis");
try {
	newSpan.tag("redis.op", "get");
	newSpan.tag("lc", "redis");
	newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_SEND);
	// call redis service e.g
	// return (SomeObj) redisTemplate.opsForHash().get("MYHASH", someObjKey);
} finally {
	newSpan.tag("peer.service", "redisService");
	newSpan.tag("peer.ipv4", "1.2.3.4");
	newSpan.tag("peer.port", "1234");
	newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_RECV);
	tracer.close(newSpan);
}
Important
Remember not to add both peer.service tag and the SA tag! You have to add only peer.service.

Custom service name

By default Sleuth assumes that when you send a span to Zipkin, you want the span’s service name to be equal to spring.application.name value. That’s not always the case though. There are situations in which you want to explicitly provide a different service name for all spans coming from your application. To achieve that it’s enough to just pass the following property to your application to override that value (example for foo service name):

spring.zipkin.service.name: foo

Host locator

In order to define the host that is corresponding to a particular span we need to resolve the host name and port. The default approach is to take it from server properties. If those for some reason are not set then we’re trying to retrieve the host name from the network interfaces.

If you have the discovery client enabled and prefer to retrieve the host address from the registered instance in a service registry then you have to set the property (it’s applicable for both HTTP and Stream based span reporting).

spring.zipkin.locator.discovery.enabled: true

Span Data as Messages

You can accumulate and send span data over Spring Cloud Stream by including the spring-cloud-sleuth-stream jar as a dependency, and adding a Channel Binder implementation (e.g. spring-cloud-starter-stream-rabbit for RabbitMQ or spring-cloud-starter-stream-kafka for Kafka). This will automatically turn your app into a producer of messages with payload type Spans.

Zipkin Consumer

There is a special convenience annotation for setting up a message consumer for the Span data and pushing it into a Zipkin SpanStore. This application

@SpringBootApplication
@EnableZipkinStreamServer
public class Consumer {
	public static void main(String[] args) {
		SpringApplication.run(Consumer.class, args);
	}
}

will listen for the Span data on whatever transport you provide via a Spring Cloud Stream Binder (e.g. include spring-cloud-starter-stream-rabbit for RabbitMQ, and similar starters exist for Redis and Kafka). If you add the following UI dependency

<groupId>io.zipkin.java</groupId>
<artifactId>zipkin-autoconfigure-ui</artifactId>

Then you’ll have your app a Zipkin server, which hosts the UI and api on port 9411.

The default SpanStore is in-memory (good for demos and getting started quickly). For a more robust solution you can add MySQL and spring-boot-starter-jdbc to your classpath and enable the JDBC SpanStore via configuration, e.g.:

spring:
  rabbitmq:
    host: ${RABBIT_HOST:localhost}
  datasource:
    schema: classpath:/mysql.sql
    url: jdbc:mysql://${MYSQL_HOST:localhost}/test
    username: root
    password: root
# Switch this on to create the schema on startup:
    initialize: true
    continueOnError: true
  sleuth:
    enabled: false
zipkin:
  storage:
    type: mysql
Note
The @EnableZipkinStreamServer is also annotated with @EnableZipkinServer so the process will also expose the standard Zipkin server endpoints for collecting spans over HTTP, and for querying in the Zipkin Web UI.

Custom Consumer

A custom consumer can also easily be implemented using spring-cloud-sleuth-stream and binding to the SleuthSink. Example:

@EnableBinding(SleuthSink.class)
@SpringBootApplication(exclude = SleuthStreamAutoConfiguration.class)
@MessageEndpoint
public class Consumer {

    @ServiceActivator(inputChannel = SleuthSink.INPUT)
    public void sink(Spans input) throws Exception {
        // ... process spans
    }
}
Note
the sample consumer application above explicitly excludes SleuthStreamAutoConfiguration so it doesn’t send messages to itself, but this is optional (you might actually want to trace requests into the consumer app).

Metrics

Currently Spring Cloud Sleuth registers very simple metrics related to spans. It’s using the Spring Boot’s metrics support to calculate the number of accepted and dropped spans. Each time a span gets sent to Zipkin the number of accepted spans will increase. If there’s an error then the number of dropped spans will get increased.

Integrations

Runnable and Callable

If you’re wrapping your logic in Runnable or Callable it’s enough to wrap those classes in their Sleuth representative.

Example for Runnable:

Runnable runnable = new Runnable() {
	@Override
	public void run() {
		// do some work
	}

	@Override
	public String toString() {
		return "spanNameFromToStringMethod";
	}
};
// Manual `TraceRunnable` creation with explicit "calculateTax" Span name
Runnable traceRunnable = new TraceRunnable(tracer, spanNamer, runnable, "calculateTax");
// Wrapping `Runnable` with `Tracer`. The Span name will be taken either from the
// `@SpanName` annotation or from `toString` method
Runnable traceRunnableFromTracer = tracer.wrap(runnable);

Example for Callable:

Callable<String> callable = new Callable<String>() {
	@Override
	public String call() throws Exception {
		return someLogic();
	}

	@Override
	public String toString() {
		return "spanNameFromToStringMethod";
	}
};
// Manual `TraceCallable` creation with explicit "calculateTax" Span name
Callable<String> traceCallable = new TraceCallable<>(tracer, spanNamer, callable, "calculateTax");
// Wrapping `Callable` with `Tracer`. The Span name will be taken either from the
// `@SpanName` annotation or from `toString` method
Callable<String> traceCallableFromTracer = tracer.wrap(callable);

That way you will ensure that a new Span is created and closed for each execution.

Hystrix

Custom Concurrency Strategy

We’re registering a custom HystrixConcurrencyStrategy that wraps all Callable instances into their Sleuth representative - the TraceCallable. The strategy either starts or continues a span depending on the fact whether tracing was already going on before the Hystrix command was called. To disable the custom Hystrix Concurrency Strategy set the spring.sleuth.hystrix.strategy.enabled to false.

Manual Command setting

Assuming that you have the following HystrixCommand:

HystrixCommand<String> hystrixCommand = new HystrixCommand<String>(setter) {
	@Override
	protected String run() throws Exception {
		return someLogic();
	}
};

In order to pass the tracing information you have to wrap the same logic in the Sleuth version of the HystrixCommand which is the TraceCommand:

TraceCommand<String> traceCommand = new TraceCommand<String>(tracer, traceKeys, setter) {
	@Override
	public String doRun() throws Exception {
		return someLogic();
	}
};

RxJava

We’re registering a custom RxJavaSchedulersHook that wraps all Action0 instances into their Sleuth representative - the TraceAction. The hook either starts or continues a span depending on the fact whether tracing was already going on before the Action was scheduled. To disable the custom RxJavaSchedulersHook set the spring.sleuth.rxjava.schedulers.hook.enabled to false.

You can define a list of regular expressions for thread names, for which you don’t want a Span to be created. Just provide a comma separated list of regular expressions in the spring.sleuth.rxjava.schedulers.ignoredthreads property.

HTTP integration

Features from this section can be disabled by providing the spring.sleuth.web.enabled property with value equal to false.

HTTP Filter

Via the TraceFilter all sampled incoming requests result in creation of a Span. That Span’s name is http: + the path to which the request was sent. E.g. if the request was sent to /foo/bar then the name will be http:/foo/bar. You can configure which URIs you would like to skip via the spring.sleuth.web.skipPattern property. If you have ManagementServerProperties on classpath then its value of contextPath gets appended to the provided skip pattern.

HandlerInterceptor

Since we want the span names to be precise we’re using a TraceHandlerInterceptor that either wraps an existing HandlerInterceptor or is added directly to the list of existing HandlerInterceptors. The TraceHandlerInterceptor adds a special request attribute to the given HttpServletRequest. If the the TraceFilter doesn’t see this attribute set it will create a "fallback" span which is an additional span created on the server side so that the trace is presented properly in the UI. Seeing that most likely signifies that there is a missing instrumentation. In that case please file an issue in Spring Cloud Sleuth.

Async Servlet support

If your controller returns a Callable or a WebAsyncTask Spring Cloud Sleuth will continue the existing span instead of creating a new one.

HTTP client integration

Synchronous Rest Template

We’re injecting a RestTemplate interceptor that ensures that all the tracing information is passed to the requests. Each time a call is made a new Span is created. It gets closed upon receiving the response. In order to block the synchronous RestTemplate features just set spring.sleuth.web.client.enabled to false.

Important
You have to register RestTemplate as a bean so that the interceptors will get injected. If you create a RestTemplate instance with a new keyword then the instrumentation WILL NOT work.

Asynchronous Rest Template

Important
A traced version of an AsyncRestTemplate bean is registered for you out of the box. If you have your own bean you have to wrap it in a TraceAsyncRestTemplate representation. The best solution is to only customize the ClientHttpRequestFactory and / or AsyncClientHttpRequestFactory. If you have your own AsyncRestTemplate and you don’t wrap it your calls WILL NOT GET TRACED.

Custom instrumentation is set to create and close Spans upon sending and receiving requests. You can customize the ClientHttpRequestFactory and the AsyncClientHttpRequestFactory by registering your beans. Remember to use tracing compatible implementations (e.g. don’t forget to wrap ThreadPoolTaskScheduler in a TraceAsyncListenableTaskExecutor). Example of custom request factories:

@EnableAutoConfiguration
@Configuration
public static class TestConfiguration {

	@Bean
	ClientHttpRequestFactory mySyncClientFactory() {
		return new MySyncClientHttpRequestFactory();
	}

	@Bean
	AsyncClientHttpRequestFactory myAsyncClientFactory() {
		return new MyAsyncClientHttpRequestFactory();
	}
}

To block the AsyncRestTemplate features set spring.sleuth.web.async.client.enabled to false. To disable creation of the default TraceAsyncClientHttpRequestFactoryWrapper set spring.sleuth.web.async.client.factory.enabled to false. If you don’t want to create AsyncRestClient at all set spring.sleuth.web.async.client.template.enabled to false.

Feign

By default Spring Cloud Sleuth provides integration with feign via the TraceFeignClientAutoConfiguration. You can disable it entirely by setting spring.sleuth.feign.enabled to false. If you do so then no Feign related instrumentation will take place.

Part of Feign instrumentation is done via a FeignBeanPostProcessor. You can disable it by providing the spring.sleuth.feign.processor.enabled equal to false. If you set it like this then Spring Cloud Sleuth will not instrument any of your custom Feign components. All the default instrumentation however will be still there.

Asynchronous communication

@Async annotated methods

In Spring Cloud Sleuth we’re instrumenting async related components so that the tracing information is passed between threads. You can disable this behaviour by setting the value of spring.sleuth.async.enabled to false.

If you annotate your method with @Async then we’ll automatically create a new Span with the following characteristics:

  • the Span name will be the annotated method name

  • the Span will be tagged with that method’s class name and the method name too

@Scheduled annotated methods

In Spring Cloud Sleuth we’re instrumenting scheduled method execution so that the tracing information is passed between threads. You can disable this behaviour by setting the value of spring.sleuth.scheduled.enabled to false.

If you annotate your method with @Scheduled then we’ll automatically create a new Span with the following characteristics:

  • the Span name will be the annotated method name

  • the Span will be tagged with that method’s class name and the method name too

If you want to skip Span creation for some @Scheduled annotated classes you can set the spring.sleuth.scheduled.skipPattern with a regular expression that will match the fully qualified name of the @Scheduled annotated class.

Tip
If you are using spring-cloud-sleuth-stream and spring-cloud-netflix-hystrix-stream together, Span will be created for each Hystrix metrics and sent to Zipkin. This may be annoying. You can prevent this by setting spring.sleuth.scheduled.skipPattern=org.springframework.cloud.netflix.hystrix.stream.HystrixStreamTask

Executor, ExecutorService and ScheduledExecutorService

We’re providing LazyTraceExecutor, TraceableExecutorService and TraceableScheduledExecutorService. Those implementations are creating Spans each time a new task is submitted, invoked or scheduled.

Here you can see an example of how to pass tracing information with TraceableExecutorService when working with CompletableFuture:

CompletableFuture<Long> completableFuture = CompletableFuture.supplyAsync(() -> {
	// perform some logic
	return 1_000_000L;
}, new TraceableExecutorService(executorService,
		// 'calculateTax' explicitly names the span - this param is optional
		tracer, traceKeys, spanNamer, "calculateTax"));

Messaging

Spring Cloud Sleuth integrates with Spring Integration. It creates spans for publish and subscribe events. To disable Spring Integration instrumentation, set spring.sleuth.integration.enabled to false.

Spring Cloud Sleuth up till version 1.0.4 is sending invalid tracing headers when using messaging. Those headers are actually the same as the ones sent in HTTP (they contain a -) in its name. For the sake of backwards compatibility in 1.0.4 we’ve started sending both valid and invalid headers. Please upgrade to 1.0.4 because in Spring Cloud Sleuth 1.1 we will remove the support for the deprecated headers.

Since 1.0.4 you can provide the spring.sleuth.integration.patterns pattern to explicitly provide the names of channels that you want to include for tracing. By default all channels are included.

Zuul

We’re registering Zuul filters to propagate the tracing information (the request header is enriched with tracing data). To disable Zuul support set the spring.sleuth.zuul.enabled property to false.

Running examples

You can find the running examples deployed in the Pivotal Web Services. Check them out in the following links: