Sleuth--Link Tracking
2024-07-12
1 Introduction to Link Tracking
In a large system built on microservices, the system is split into many modules. Each module is responsible for a different function, and combined they provide the system's full functionality. In such an architecture, a single request often involves multiple services. Internet applications are built from many software modules, which may be developed by different teams, implemented in different programming languages, and deployed on thousands of servers across multiple data centers. This kind of architecture therefore raises some questions:
- How to find problems quickly?
- How to determine the impact scope of the fault?
- How to sort out service dependencies and the rationality of dependencies?
- How to analyze link performance issues and plan capacity in real time?

Distributed tracing restores a distributed request to a complete call link so that it can be logged, its performance monitored, and its call status centrally displayed: for example, the time consumed at each service node, which machine the request reached, and the request status at each service node.
Common link tracking technologies include the following:
- cat: Open-sourced by Dianping, a real-time application monitoring platform developed in Java that covers both application and business monitoring. Monitoring is implemented through code instrumentation (interceptors, filters, etc.), which is very invasive to the code; integration cost and risk are both high.
- zipkin: An open-source distributed tracing system from Twitter that collects service timing data to solve latency problems in microservice architectures, covering data collection, storage, search, and presentation. Combined with spring-cloud-sleuth it is easy to use and integrate, but its features are simple.
- pinpoint: A Korean open-source call-chain analysis and application monitoring tool based on bytecode injection. It supports many plugins, has a powerful UI, and requires no code intrusion on the instrumented side.
- skywalking: A Chinese open-source call-chain analysis and application monitoring tool based on bytecode injection. It supports plugins, has a powerful UI, requires no code intrusion, and is an Apache incubator project.
- Sleuth: Spring Cloud's link tracking solution for distributed systems.
Note: the Spring Cloud Alibaba technology stack does not provide its own link tracking technology; we can use Sleuth + Zipkin to provide a link tracking solution.
2 Sleuth Getting Started
2.1 Sleuth Introduction
The main function of Spring Cloud Sleuth is to provide a tracing solution in distributed systems. Its design borrows heavily from Google's Dapper. Let's first look at the terms and related concepts in Sleuth.
- Trace: A group of Spans with the same Trace Id, connected in series to form a tree structure. To implement request tracking, when a request reaches the entry endpoint of a distributed system, the tracing framework creates a unique identifier for the request (the TraceId) and keeps passing that identifier along as the request flows through the system, until the whole request returns. That identifier then connects all the requests in series into a complete call chain.
- Span: Represents a basic unit of work. To measure the latency of each processing unit, when the request reaches a service component, its beginning, intermediate steps, and end are marked with a unique identifier (the SpanId). From the start and end timestamps of a SpanId we can compute the time spent in that span, and we can also obtain metadata such as the event name and request information.
- Annotation: Used to record events over a period of time. Important internal annotations:
cs (Client Send): the client sends a request, starting the life of the request.
sr (Server Received): the server receives the request and starts processing it; sr - cs = network latency (time to reach the service).
ss (Server Send): the server has finished processing and is ready to respond to the client; ss - sr = request processing time on the server.
cr (Client Received): the client receives the server's response and the request ends; cr - cs = total time of the request.
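The arithmetic above can be checked with a tiny example. The four timestamps here are made up for illustration; in real traces they are recorded by the framework:

```java
public class AnnotationTiming {
    public static void main(String[] args) {
        // Hypothetical timestamps (ms) for one request through one service
        long cs = 1000;  // Client Send: request starts
        long sr = 1030;  // Server Received: server begins processing
        long ss = 1100;  // Server Send: server finished processing
        long cr = 1130;  // Client Received: request ends

        long networkLatency   = sr - cs; // time on the wire to reach the service
        long serverProcessing = ss - sr; // time spent processing on the server
        long totalTime        = cr - cs; // whole round trip as seen by the client

        System.out.println("network=" + networkLatency
                + "ms server=" + serverProcessing
                + "ms total=" + totalTime + "ms");
    }
}
```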

2.2 Sleuth Getting Started
Sleuth's log output format is [microservice name, traceId, spanId, whether to export the trace to a third-party platform], for example:
[api-gateway,3977125f73391553,3977125f73391553,false]
[service-order,3977125f73391553,57547b5bf71f8242,false]
[service-product,3977125f73391553,449f5b3f3ef8d5c5,false]
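The three log lines above share one TraceId while each carries its own SpanId. That relationship can be sketched as a tiny model; the class and field names below are our own for illustration, not Sleuth's actual classes:

```java
import java.util.List;

// Minimal model of the Trace/Span tree: every span shares the trace's id,
// and each child span records its parent's spanId, forming a tree.
class Span {
    final String traceId;   // shared by every span in the trace
    final String spanId;    // unique per unit of work
    final String parentId;  // null for the root span
    final String service;

    Span(String traceId, String spanId, String parentId, String service) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
        this.service = service;
    }
}

public class TraceDemo {
    public static void main(String[] args) {
        // gateway -> order -> product, all strung together by one TraceId
        Span root    = new Span("3977125f73391553", "3977125f73391553", null, "api-gateway");
        Span order   = new Span(root.traceId, "57547b5bf71f8242", root.spanId, "service-order");
        Span product = new Span(root.traceId, "449f5b3f3ef8d5c5", order.spanId, "service-product");

        for (Span s : List.of(root, order, product)) {
            System.out.println(s.service + " trace=" + s.traceId + " span=" + s.spanId);
        }
    }
}
```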
Next, we will integrate Sleuth into the previous project and complete the introductory case.

Modify the parent project to introduce the Sleuth dependency:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
After starting the microservices and calling an interface, we can observe Sleuth's log output in the console, where c61d4a753370cbeb is the TraceId and 0e06445e055ed94f is a SpanId. The calls in one sequence share a global TraceId, which strings the call link together. By carefully analyzing the logs of each microservice, it is not hard to see the specific flow of the request.
Viewing log files is not a good method. As the number of microservices increases, the number of log files will also increase. Zipkin can be used to aggregate logs and perform visual display and full-text retrieval.
3 Zipkin Integration
3.1 Zipkin Introduction
Zipkin is an open-source project from Twitter, based on Google Dapper. It is dedicated to collecting service timing data to solve latency problems in microservice architectures, covering data collection, storage, search, and display.
We can use it to collect the tracking data of the request links on each server, and use the REST API interface it provides to assist us in querying the tracking data to implement the monitoring program of the distributed system, so as to promptly discover the delay increase problem in the system and find the root cause of the system performance bottleneck.
In addition to the development-oriented API interface, it also provides a convenient UI component to help us intuitively search tracking information and analyze request link details, for example querying the processing time of each user request within a certain period.
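Such a query goes through Zipkin's v2 REST API, e.g. GET /api/v2/traces. A minimal sketch of building the query URL follows; the endpoint and parameter names (serviceName, endTs, lookback, limit) are Zipkin's documented query parameters, while the service name and window size are illustrative:

```java
public class ZipkinQueryUrl {
    public static void main(String[] args) {
        String base = "http://localhost:9411/api/v2/traces";
        long endTs = System.currentTimeMillis(); // end of the query window, epoch millis
        long lookbackMs = 15 * 60 * 1000;        // look back 15 minutes from endTs
        String url = base
                + "?serviceName=service-order"   // whose traces to fetch
                + "&endTs=" + endTs
                + "&lookback=" + lookbackMs
                + "&limit=10";                   // return at most 10 traces
        System.out.println(url);
    }
}
```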
Zipkin provides pluggable data storage: In-Memory, MySQL, Cassandra, and Elasticsearch.

The picture above shows the Zipkin infrastructure, which consists mainly of 4 core components:
- Collector: the collector component, which processes tracing information sent from external systems and converts it into Zipkin's internal Span format to support subsequent storage, analysis, and display.
- Storage: the storage component, which processes the tracing information received by the collector and stores it in memory by default. We can also change this storage strategy and persist the tracing information to a database through other storage components.
- RESTful API: the API component, which provides external access interfaces, for example to display tracing information to clients, or to let external systems access it for monitoring.
- Web UI: the UI component, an upper-layer application built on the API component, through which users can conveniently and intuitively query and analyze tracing information.
Zipkin is divided into two parts: the Zipkin server and the Zipkin client, where the client is the microservice application. The client is configured with the server's URL address; once a call occurs between services, Sleuth's listener detects it, generates the corresponding Trace and Span information, and sends it to the server.
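Under the hood, the client reports spans to the collector as JSON, typically via POST to /api/v2/spans. A hand-built payload for a single span might look like the following sketch; the field names (traceId, id, parentId, name, timestamp, duration, localEndpoint) come from Zipkin's v2 span model, while the ids and timing values are made up:

```java
public class SpanPayload {
    public static void main(String[] args) {
        // One span in Zipkin's v2 JSON format; timestamp/duration are in microseconds
        String span = "{"
                + "\"traceId\":\"3977125f73391553\","
                + "\"id\":\"57547b5bf71f8242\","
                + "\"parentId\":\"3977125f73391553\","
                + "\"name\":\"get /order\","
                + "\"timestamp\":1589178000000000,"
                + "\"duration\":150000,"
                + "\"localEndpoint\":{\"serviceName\":\"service-order\"}"
                + "}";
        // The client POSTs a JSON array of such spans to http://localhost:9411/api/v2/spans
        System.out.println("[" + span + "]");
    }
}
```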
3.2 Zipkin Server Installation
Step 1: Download the Zipkin jar package:
https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec
Visiting the URL above downloads a jar package: the Zipkin server jar.
Step 2: Start the Zipkin server from the command line:
java -jar zipkin-server-2.12.9-exec.jar
Step 3: Access http://localhost:9411 through a browser.

3.3 Zipkin Client Integration
Integrating the Zipkin client with Sleuth is very simple: just add its dependency and configuration in each microservice.
Step 1: Add the dependency to each microservice:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
Step 2: Add the configuration (these properties live under spring.zipkin):

spring:
  zipkin:
    base-url: http://127.0.0.1:9411/ # request address of the Zipkin server
    discoveryClientEnabled: false # let Nacos treat this as a URL, not a service name
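A fuller configuration sketch is shown below. The spring.sleuth.sampler.probability key is standard Spring Cloud Sleuth; the value 1.0 is illustrative (it reports every request, which is convenient for learning but usually too much for production):

```yaml
spring:
  zipkin:
    base-url: http://127.0.0.1:9411/  # request address of the Zipkin server
    discovery-client-enabled: false   # treat base-url as a plain URL, not a service name
  sleuth:
    sampler:
      probability: 1.0                # sample 100% of requests; lower this in production
```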
Step 3: Access the microservice interface:
http://localhost:7000/order-serv/order/prod/1
Step 4: Access the Zipkin UI and observe the effect.

Step 5: Click a record to view the detailed route of a call.

4 Zipkin Data Persistence
Zipkin Server saves tracing data in memory by default, but this approach is not suitable for production environments. Zipkin supports persisting tracing data to a MySQL database or to Elasticsearch.
4.1 Using MySQL for Data Persistence
Step 1: Create the MySQL database environment:
CREATE TABLE IF NOT EXISTS zipkin_spans (
  `trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` BIGINT NOT NULL,
  `id` BIGINT NOT NULL,
  `name` VARCHAR(255) NOT NULL,
  `parent_id` BIGINT,
  `debug` BIT(1),
  `start_ts` BIGINT COMMENT 'Span.timestamp(): epoch micros used for endTs query and to implement TTL',
  `duration` BIGINT COMMENT 'Span.duration(): micros used for minDuration and maxDuration query'
) ENGINE = INNODB ROW_FORMAT = COMPRESSED CHARACTER SET = utf8 COLLATE utf8_general_ci;

ALTER TABLE zipkin_spans ADD UNIQUE KEY (`trace_id_high`, `trace_id`, `id`) COMMENT 'ignore insert on duplicate';
ALTER TABLE zipkin_spans ADD INDEX (`trace_id_high`, `trace_id`, `id`) COMMENT 'for joining with zipkin_annotations';
ALTER TABLE zipkin_spans ADD INDEX (`trace_id_high`, `trace_id`) COMMENT 'for getTracesByIds';
ALTER TABLE zipkin_spans ADD INDEX (`name`) COMMENT 'for getTraces and getSpanNames';
ALTER TABLE zipkin_spans ADD INDEX (`start_ts`) COMMENT 'for getTraces ordering and range';

CREATE TABLE IF NOT EXISTS zipkin_annotations (
  `trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.trace_id',
  `span_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.id',
  `a_key` VARCHAR(255) NOT NULL COMMENT 'BinaryAnnotation.key or Annotation.value if type == -1',
  `a_value` BLOB COMMENT 'BinaryAnnotation.value(), which must be smaller than 64KB',
  `a_type` INT NOT NULL COMMENT 'BinaryAnnotation.type() or -1 if Annotation',
  `a_timestamp` BIGINT COMMENT 'Used to implement TTL; Annotation.timestamp or zipkin_spans.timestamp',
  `endpoint_ipv4` INT COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_ipv6` BINARY(16) COMMENT 'Null when Binary/Annotation.endpoint is null, or no IPv6 address',
  `endpoint_port` SMALLINT COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_service_name` VARCHAR(255) COMMENT 'Null when Binary/Annotation.endpoint is null'
) ENGINE = INNODB ROW_FORMAT = COMPRESSED CHARACTER SET = utf8 COLLATE utf8_general_ci;

ALTER TABLE zipkin_annotations ADD UNIQUE KEY (`trace_id_high`, `trace_id`, `span_id`, `a_key`, `a_timestamp`) COMMENT 'Ignore insert on duplicate';
ALTER TABLE zipkin_annotations ADD INDEX (`trace_id_high`, `trace_id`, `span_id`) COMMENT 'for joining with zipkin_spans';
ALTER TABLE zipkin_annotations ADD INDEX (`trace_id_high`, `trace_id`) COMMENT 'for getTraces/ByIds';
ALTER TABLE zipkin_annotations ADD INDEX (`endpoint_service_name`) COMMENT 'for getTraces and getServiceNames';
ALTER TABLE zipkin_annotations ADD INDEX (`a_type`) COMMENT 'for getTraces';
ALTER TABLE zipkin_annotations ADD INDEX (`a_key`) COMMENT 'for getTraces';
ALTER TABLE zipkin_annotations ADD INDEX (`trace_id`, `span_id`, `a_key`) COMMENT 'for dependencies job';

CREATE TABLE IF NOT EXISTS zipkin_dependencies (
  `day` DATE NOT NULL,
  `parent` VARCHAR(255) NOT NULL,
  `child` VARCHAR(255) NOT NULL,
  `call_count` BIGINT
) ENGINE = INNODB ROW_FORMAT = COMPRESSED CHARACTER SET = utf8 COLLATE utf8_general_ci;

ALTER TABLE zipkin_dependencies ADD UNIQUE KEY (`day`, `parent`, `child`);

Step 2: When starting the Zipkin server, specify the MySQL connection information (replace the user and password with your own):

java -jar zipkin-server-2.12.9-exec.jar --STORAGE_TYPE=mysql --MYSQL_HOST=127.0.0.1 --MYSQL_TCP_PORT=3306 --MYSQL_DB=zipkin --MYSQL_USER=root --MYSQL_PASS=<password>
4.2 Using Elasticsearch for Data Persistence
Step 1: Download Elasticsearch:
https://www.elastic.co/cn/downloads/past-releases/elasticsearch-6-8-4
Step 2: Start Elasticsearch and visit localhost:9200. If you need visualization, you can install Kibana.
Step 3: When starting the Zipkin server, specify the Elasticsearch connection information:

java -jar zipkin-server-2.12.9-exec.jar --STORAGE_TYPE=elasticsearch --ES_HOSTS=localhost:9200
For Elasticsearch visualization, you can refer to the CSDN blog post on installing and using Elasticsearch and Kibana under Windows.