Since Adopting SkyWalking, I Sleep Soundly!
· 2020-12-30
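Author: 废物大师兄
Source: www.cnblogs.com/cjsblog/p/14075486.html
SkyWalking is an application performance monitoring (APM) system designed especially for microservice, cloud-native, and container-based (Docker, Kubernetes, Mesos) architectures.
Besides monitoring application metrics, it can also trace distributed call chains. Components with similar functionality include Zipkin, Pinpoint, and CAT.
Here are a few screenshots to show what it looks like; after that we will set it up and use it step by step.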
![](https://filescdn.proginn.com/e9b08c4973f254f93338740ee8b53513/650b317224b90b580bc4d00cbe845454.webp)
![](https://filescdn.proginn.com/73b10bdf407fb69effdcb8399cf1ceea/bbe373fb9473e441de95c233fbc6956a.webp)
![](https://filescdn.proginn.com/c2a3f121be3d7a1d3629cccfed676f94/9011812987f5ab4ff1b289906db8c60f.webp)
![](https://filescdn.proginn.com/47d228896b8bd836b5972fe67199fa4b/00654171d8b3126e52f6f2a5630f6eb0.webp)
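1. Concepts and Architecture
SkyWalking is an open-source monitoring platform for collecting, analyzing, aggregating, and visualizing data from services and cloud-native infrastructure.
SkyWalking provides a simple way to maintain a clear view of a distributed system, even across clouds. It is a modern APM designed specifically for cloud-native, container-based distributed systems.
SkyWalking monitors applications along three dimensions: service, service instance, and endpoint.
Services and instances need little explanation; an endpoint is a path within a service, in other words a URI.
SkyWalking allows users to understand the topology relationship between Services and Endpoints, to view the metrics of every Service/Service Instance/Endpoint and to set alarm rules.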
1.1. Architecture
![](https://filescdn.proginn.com/cbf346ea635bfc36e3075380d0b1d662/e6cce0c589668e790c86127ecdfe4e2b.webp)
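SkyWalking is logically divided into four parts: Probes, Platform backend, Storage, and UI.
The structure is straightforward: the probe is the agent, which collects data and reports it to the backend; the backend processes and stores the data; and the UI displays it.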
![](https://filescdn.proginn.com/ebc18ee20289509c24315e2c498be973/15d94402c1a4697e12f52b0a99342cb0.webp)
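To make the division of labor concrete, here is a toy sketch of the probe → backend → storage → UI flow. It is not SkyWalking code: the `Sample` type, the in-memory map standing in for storage, and the endpoint paths are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the probe -> backend -> storage -> UI flow; not SkyWalking code. */
final class ApmPipelineSketch {

    /** One measurement reported by a probe (hypothetical shape). */
    static final class Sample {
        final String service;
        final String endpoint;
        final long durationMs;

        Sample(String service, String endpoint, long durationMs) {
            this.service = service;
            this.endpoint = endpoint;
            this.durationMs = durationMs;
        }
    }

    /** Backend step: aggregate raw samples into average latency per service endpoint. */
    static Map<String, Double> aggregate(List<Sample> samples) {
        Map<String, long[]> sums = new LinkedHashMap<>(); // key -> {total ms, count}
        for (Sample s : samples) {
            long[] acc = sums.computeIfAbsent(s.service + " " + s.endpoint, k -> new long[2]);
            acc[0] += s.durationMs;
            acc[1]++;
        }
        Map<String, Double> storage = new LinkedHashMap<>(); // stands in for the storage layer
        sums.forEach((key, acc) -> storage.put(key, (double) acc[0] / acc[1]));
        return storage;
    }

    /** UI step: render the stored metrics as plain text. */
    static String render(Map<String, Double> storage) {
        StringBuilder sb = new StringBuilder();
        storage.forEach((key, avg) -> sb.append(key).append(" avg=").append(avg).append("ms\n"));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Probe step: in reality the agent collects these inside the monitored JVM.
        List<Sample> reported = new ArrayList<>();
        reported.add(new Sample("user-center", "/users", 120));
        reported.add(new Sample("user-center", "/users", 80));
        reported.add(new Sample("user-center", "/login", 40));
        System.out.print(render(aggregate(reported)));
    }
}
```

The real backend does far more (topology, traces, alarms), but the shape is the same: probes only report, and everything else happens on the server side.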
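2. Download and Installation
SkyWalking comes in two distributions: an ES version and a non-ES version.
If you decide to use ElasticSearch as the storage, download the ES version.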
https://skywalking.apache.org/downloads/
https://archive.apache.org/dist/skywalking/
![](https://filescdn.proginn.com/2ce93efaf8d3807e06630ca846667a3f/355cf63a0f11dba714c3208c4cc702f1.webp)
![](https://filescdn.proginn.com/450217e3d7fd5dcb95b9809e3c0ae080/0041db60fba6f525e44edd0b1afd5b27.webp)
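The agent directory will later be copied to the machines where the services run, to act as the probe
The bin directory holds the service startup scripts
The config directory holds the configuration files
The oap-libs directory holds the jars the OAP service needs at runtime
The webapp directory holds the jars the web service needs at runtime
Next, choose a storage. The supported storages are:
H2
ElasticSearch 6, 7
MySQL
TiDB
InfluxDB
For a monitoring system, H2 and MySQL can be ruled out first. InfluxDB is recommended here: it is a time-series database, which fits this scenario well.
However, I am not very familiar with InfluxDB, so ElasticSearch 7 is used here for now.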
https://github.com/apache/skywalking/blob/master/docs/en/setup/backend/backend-storage.md
2.1. Installing ElasticSearch
https://www.elastic.co/guide/en/elasticsearch/reference/7.10/targz.html
# Start
./bin/elasticsearch -d -p pid
# Stop
pkill -F pid
![](https://filescdn.proginn.com/5a40eec51edb863127cc3a25b6f4e50f/2180e936617437f63cffa6f2d8864333.webp)
ElasticSearch 7.x requires Java 11 or later, but if you have set the JAVA_HOME environment variable, it will use your own Java version.
Startup typically fails with the following three errors:
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[3]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
Fixes:
Append the following to /etc/security/limits.conf:
* soft nofile 65536
* hard nofile 65536
* soft nproc 4096
* hard nproc 4096
Log in again, then check the result with the following four commands:
ulimit -Hn
ulimit -Sn
ulimit -Hu
ulimit -Su
Edit /etc/sysctl.conf and append the following (run sysctl -p as root to apply it without rebooting):
vm.max_map_count=262144
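If you would rather verify the kernel setting from code than by eye, a minimal sketch follows. It assumes a Linux host where `/proc/sys/vm/max_map_count` is readable; the class and method names are my own, and 262144 is simply the threshold quoted in error [2] above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Checks vm.max_map_count against the minimum ElasticSearch asks for in error [2]. */
final class MaxMapCountCheck {

    static final long REQUIRED = 262144L; // threshold taken from the error message

    /** Parses the single number stored in /proc/sys/vm/max_map_count. */
    static long parse(String procFileContent) {
        return Long.parseLong(procFileContent.trim());
    }

    static boolean isSufficient(long current) {
        return current >= REQUIRED;
    }

    public static void main(String[] args) throws IOException {
        Path proc = Paths.get("/proc/sys/vm/max_map_count"); // Linux only
        if (!Files.isReadable(proc)) {
            System.out.println("Cannot read " + proc + " (not Linux?)");
            return;
        }
        long current = parse(new String(Files.readAllBytes(proc), StandardCharsets.UTF_8));
        System.out.println("vm.max_map_count=" + current
                + (isSufficient(current) ? " (ok)" : " (too low, need " + REQUIRED + ")"));
    }
}
```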
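In the ES configuration file elasticsearch.yml, uncomment the following line and keep a single node (the name must match this node's node.name):
cluster.initial_master_nodes: ["node-1"]
To make ES reachable at ip:port, also change the network setting:
network.host: 0.0.0.0
After these changes the file looks like this: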
![](https://filescdn.proginn.com/bbbf4462adf785970fb3e5c3f11c1aec/dded6b1b7933e0496d2514bce8e17cde.webp)
![](https://filescdn.proginn.com/cd6fb15839b03fb131fe3dc22738ee34/26d1858c0994052234fc551a7e7a53e0.webp)
At this point, ElasticSearch has started successfully.
One node is not enough, so here three nodes are used to build a cluster.
192.168.100.14 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-1
network.host: 192.168.100.14
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
192.168.100.15 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-2
network.host: 192.168.100.15
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
192.168.100.19 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-3
network.host: 192.168.100.19
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
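The three files above differ only in node.name and network.host, so they are easy to generate rather than hand-edit. The sketch below renders the same settings from a host list; the class and its `render` method are hypothetical helpers of mine, not part of ElasticSearch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Renders the per-node elasticsearch.yml shown above from a list of hosts. */
final class EsNodeConfigSketch {

    static String render(String clusterName, String nodeName, String host,
                         List<String> allHosts, String initialMaster) {
        // Every node lists all transport addresses (port 9300) as seed hosts.
        String seeds = allHosts.stream()
                .map(h -> "\"" + h + ":9300\"")
                .collect(Collectors.joining(", "));
        return "cluster.name: " + clusterName + "\n"
                + "node.name: " + nodeName + "\n"
                + "network.host: " + host + "\n"
                + "http.port: 9200\n"
                + "discovery.seed_hosts: [" + seeds + "]\n"
                + "cluster.initial_master_nodes: [\"" + initialMaster + "\"]\n";
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("192.168.100.14", "192.168.100.15", "192.168.100.19");
        for (int i = 0; i < hosts.size(); i++) {
            System.out.println("# " + hosts.get(i) + " config/elasticsearch.yml");
            System.out.println(render("my-monitor", "node-" + (i + 1), hosts.get(i), hosts, "node-1"));
        }
    }
}
```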
It is also recommended to adjust config/jvm.options on all three nodes:
-Xms2g
-Xmx2g
Restart the three nodes one by one (stop, then start):
pkill -F pid
./bin/elasticsearch -d -p pid
![](https://filescdn.proginn.com/cef70ccd0213c63e12c0f25861e27864/98099ec2eed42642af18f28120840cf8.webp)
![](https://filescdn.proginn.com/c6e8294540ffe5bbe878a09f76a6252c/bb68e81ab97268bc2eeab4083766e015.webp)
![](https://filescdn.proginn.com/89d92bfe3f7d743c07011758eaf6b5f0/a61f74e5911bc732ae8879988e3580c9.webp)
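Besides looking at the screenshots, you can ask the cluster itself whether all three nodes joined. The sketch below calls the `_cluster/health` endpoint with the JDK 11 HTTP client and pulls out two fields with a regex; the regex parsing is a shortcut of mine to stay dependency-free, and the default address is just node-1 from the configuration above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Queries GET /_cluster/health and extracts the status and node count. */
final class EsClusterHealthCheck {

    private static final Pattern STATUS = Pattern.compile("\"status\"\\s*:\\s*\"(\\w+)\"");
    private static final Pattern NODES = Pattern.compile("\"number_of_nodes\"\\s*:\\s*(\\d+)");

    /** Returns green/yellow/red, or "unknown" if the field is missing. */
    static String status(String json) {
        Matcher m = STATUS.matcher(json);
        return m.find() ? m.group(1) : "unknown";
    }

    /** Returns the node count, or -1 if the field is missing. */
    static int nodeCount(String json) {
        Matcher m = NODES.matcher(json);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) throws Exception {
        String base = args.length > 0 ? args[0] : "http://192.168.100.14:9200";
        HttpClient client = HttpClient.newBuilder().connectTimeout(Duration.ofSeconds(3)).build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(base + "/_cluster/health")).GET().build();
        String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        System.out.println("status=" + status(body) + ", nodes=" + nodeCount(body));
    }
}
```

For the setup in this article you would hope to see a node count of 3.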
Next, set the ES addresses in config/application.yml under the skywalking directory:
storage:
  selector: ${SW_STORAGE:elasticsearch7}
  elasticsearch7:
    nameSpace: ${SW_NAMESPACE:""}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.100.14:9200,192.168.100.15:9200,192.168.100.19:9200}
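The `${NAME:default}` values in this file read as "use the environment variable NAME if it is set, otherwise fall back to the default after the first colon". The sketch below imitates that rule so you can see what a line resolves to; it is my own approximation for illustration, not SkyWalking's actual loader.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Imitates the ${NAME:default} placeholder rule used in application.yml (illustrative only). */
final class PlaceholderSketch {

    // Name up to the first ':', then everything up to the closing '}' as the default.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_]+):([^}]*)\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String override = env.get(m.group(1));
            String replacement = override != null ? override : m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "${SW_STORAGE_ES_CLUSTER_NODES:192.168.100.14:9200,192.168.100.15:9200,192.168.100.19:9200}";
        System.out.println("with current env: " + resolve(line, System.getenv()));
        System.out.println("with empty env:   " + resolve(line, Collections.<String, String>emptyMap()));
    }
}
```

In practice this means you can leave application.yml untouched and export SW_STORAGE and SW_STORAGE_ES_CLUSTER_NODES before starting the OAP server instead.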
2.2. Installing the Agent
https://github.com/apache/skywalking/blob/v8.2.0/docs/en/setup/service-agent/java-agent/README.md
Copy the agent directory to each machine where a service runs:
scp -r ./agent chengjs@192.168.100.12:~/
Here, I copied it into each service's own directory.
![](https://filescdn.proginn.com/f30a657c4009de70c290967cda6f294c/1394caf4d89024c7509acc26932b9aff.webp)
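The plugins directory holds the plugins used by the probe. SkyWalking plugins are plug-and-play: you can move a plugin from optional-plugins into plugins to enable it.
Edit the agent/config/agent.config configuration file; the same settings can also be given as command-line arguments.
The main settings are the service name and the backend service address: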
agent.service_name=${SW_AGENT_NAME:user-center}
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.100.17:11800}
These can, of course, also be set through environment variables or system properties, for example:
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800
Finally, point to the probe with the -javaagent command-line argument when starting the service:
java -javaagent:/path/to/skywalking-agent/skywalking-agent.jar -jar yourApp.jar
For example:
java -javaagent:./agent/skywalking-agent.jar -Dspring.profiles.active=dev -Xms512m -Xmx1024m -jar demo-0.0.1-SNAPSHOT.jar
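One detail that is easy to get wrong: `-javaagent` and the other JVM options have to come before `-jar`, because anything after the jar name is passed to the application as an ordinary argument. The sketch below assembles the same command with `ProcessBuilder` and sets the two environment variables from agent.config; the class and helper names are my own, and the process is only printed, not started.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds the agent-enabled launch command from this section (illustrative helper). */
final class AgentLaunchSketch {

    /** JVM options, including -javaagent, must precede -jar. */
    static List<String> command(String agentJar, List<String> jvmOptions, String appJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-javaagent:" + agentJar);
        cmd.addAll(jvmOptions);
        cmd.add("-jar");
        cmd.add(appJar);
        return cmd;
    }

    /** Passes the service name and backend address through the environment variables used in agent.config. */
    static ProcessBuilder launcher(String serviceName, String backend, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("SW_AGENT_NAME", serviceName);
        pb.environment().put("SW_AGENT_COLLECTOR_BACKEND_SERVICES", backend);
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) {
        List<String> cmd = command("./agent/skywalking-agent.jar",
                Arrays.asList("-Dspring.profiles.active=dev", "-Xms512m", "-Xmx1024m"),
                "demo-0.0.1-SNAPSHOT.jar");
        ProcessBuilder pb = launcher("user-center", "192.168.100.17:11800", cmd);
        // Call pb.start() to actually launch; here we only show the command line.
        System.out.println(String.join(" ", pb.command()));
    }
}
```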
3. Starting the Services
Edit webapp/webapp.yml to change the port and the backend service address:
server:
  port: 9000
collector:
  path: /graphql
  ribbon:
    ReadTimeout: 10000
    # Point to all backend's restHost:restPort, split by ,
    listOfServers: 127.0.0.1:12800
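Start the services:
bin/startup.sh
Or start them one at a time:
bin/oapService.sh
bin/webappService.sh
Check the log files under the logs directory to see whether startup succeeded.
Then open http://127.0.0.1:9000 in a browser.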
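If the page does not load, a quick way to narrow things down is to check which ports are actually listening. The sketch below probes the three ports that appear in this article (11800 for agents, 12800 for the backend REST address in webapp.yml, 9000 for the UI); the class name and the one-second timeout are arbitrary choices of mine.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Checks whether the ports used in this article accept TCP connections. */
final class PortCheckSketch {

    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unreachable
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int[] ports = {11800, 12800, 9000}; // agent reporting, backend REST, web UI
        for (int port : ports) {
            System.out.println(host + ":" + port + " " + (isOpen(host, port, 1000) ? "open" : "closed"));
        }
    }
}
```

An open port only shows that something is listening; the log files remain the place to confirm that startup really succeeded.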
4. Alerting
![](https://filescdn.proginn.com/52ac78e82019599928ae89d56f67cac4/832e5f0cc100202f23b7a474d02d1225.webp)
Edit alarm-settings.yml to set the alarm rules and notifications:
https://github.com/apache/skywalking/blob/v8.2.0/docs/en/setup/backend/backend-alarm.md
The focus here is on alarm notifications.
![](https://filescdn.proginn.com/9d1f82b471b6eecfa439f2b4ee07b2ba/65491e8be5f37c6556159f6c0b40afa8.webp)
![](https://filescdn.proginn.com/72dbaa21c9ff3aea21b7f246d663e362/390cb89db7a8c9cced377083290601eb.webp)
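To send notifications through a DingTalk robot, create a new project next:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.0</version>
    </parent>
    <groupId>com.wt.monitor</groupId>
    <artifactId>skywalking-alarm</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <name>skywalking-alarm</name>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>alibaba-dingtalk-service-sdk</artifactId>
            <version>1.0.1</version>
        </dependency>
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.15</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.75</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>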
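Optional dependency (not recommended):
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>server-core</artifactId>
    <version>8.2.0</version>
</dependency>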
Define the alarm message entity class:
package com.wt.monitor.skywalking.alarm.domain;

import lombok.Data;

import java.io.Serializable;

/**
 * @author ChengJianSheng
 * @date 2020/12/1
 */
@Data
public class AlarmMessageDTO implements Serializable {

    private int scopeId;

    private String scope;

    /**
     * Target scope entity name
     */
    private String name;

    private String id0;

    private String id1;

    private String ruleName;

    /**
     * Alarm text message
     */
    private String alarmMessage;

    /**
     * Alarm time measured in milliseconds
     */
    private long startTime;
}
Send the DingTalk robot message:
package com.wt.monitor.skywalking.alarm.service;

import com.dingtalk.api.DefaultDingTalkClient;
import com.dingtalk.api.DingTalkClient;
import com.dingtalk.api.request.OapiRobotSendRequest;
import com.taobao.api.ApiException;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.codec.binary.Base64;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;

/**
 * https://ding-doc.dingtalk.com/doc#/serverapi2/qf2nxq
 * @author ChengJianSheng
 * @date 2020/12/1
 */
@Slf4j
@Service
public class DingTalkAlarmService {

    @Value("${dingtalk.webhook}")
    private String webhook;

    @Value("${dingtalk.secret}")
    private String secret;

    public void sendMessage(String content) {
        try {
            // Sign "timestamp\nsecret" with HmacSHA256, keyed by the secret.
            Long timestamp = System.currentTimeMillis();
            String stringToSign = timestamp + "\n" + secret;
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes("UTF-8"), "HmacSHA256"));
            byte[] signData = mac.doFinal(stringToSign.getBytes("UTF-8"));
            String sign = URLEncoder.encode(new String(Base64.encodeBase64(signData)), "UTF-8");

            // Assumes the configured webhook already ends with ?access_token=..., hence '&'.
            String serverUrl = webhook + "&timestamp=" + timestamp + "&sign=" + sign;
            DingTalkClient client = new DefaultDingTalkClient(serverUrl);

            // Send a plain-text robot message.
            OapiRobotSendRequest request = new OapiRobotSendRequest();
            request.setMsgtype("text");
            OapiRobotSendRequest.Text text = new OapiRobotSendRequest.Text();
            text.setContent(content);
            request.setText(text);
            client.execute(request);
        } catch (ApiException | NoSuchAlgorithmException | UnsupportedEncodingException | InvalidKeyException e) {
            log.error(e.getMessage(), e);
        }
    }
}
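The signing step is the part most likely to go wrong, so it can help to pull it out into something you can run without Spring or the DingTalk SDK. The sketch below repeats the same recipe (HmacSHA256 over "timestamp\nsecret", Base64, then URL-encode) using only JDK classes; `java.util.Base64` replaces commons-codec here, and the secret and access token in `main` are placeholders, not real credentials.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** JDK-only version of the signing step used in DingTalkAlarmService. */
final class DingTalkSignSketch {

    /** HmacSHA256("timestamp\nsecret") keyed by the secret, Base64-encoded, then URL-encoded. */
    static String sign(long timestamp, String secret) {
        try {
            String stringToSign = timestamp + "\n" + secret;
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            return URLEncoder.encode(Base64.getEncoder().encodeToString(digest), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    /** Appends timestamp and sign to a webhook that already carries ?access_token=... */
    static String signedUrl(String webhook, long timestamp, String secret) {
        return webhook + "&timestamp=" + timestamp + "&sign=" + sign(timestamp, secret);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        String webhook = "https://oapi.dingtalk.com/robot/send?access_token=PLACEHOLDER";
        System.out.println(signedUrl(webhook, System.currentTimeMillis(), "SEC-placeholder"));
    }
}
```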
AlarmController.java
package com.wt.monitor.skywalking.alarm.controller;

import com.alibaba.fastjson.JSON;
import com.wt.monitor.skywalking.alarm.domain.AlarmMessageDTO;
import com.wt.monitor.skywalking.alarm.service.DingTalkAlarmService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.text.MessageFormat;
import java.util.List;

/**
 * @author ChengJianSheng
 * @date 2020/12/1
 */
@Slf4j
@RestController
@RequestMapping("/skywalking")
public class AlarmController {

    @Autowired
    private DingTalkAlarmService dingTalkAlarmService;

    // SkyWalking posts a JSON array of alarm messages to this webhook.
    @PostMapping("/alarm")
    public void alarm(@RequestBody List<AlarmMessageDTO> alarmMessageDTOList) {
        log.info("Received alarm messages: {}", JSON.toJSONString(alarmMessageDTOList));
        if (null != alarmMessageDTOList) {
            // Forward each alarm to DingTalk as a short text message.
            alarmMessageDTOList.forEach(e -> dingTalkAlarmService.sendMessage(
                    MessageFormat.format("-----Alarm from SkyWalking-----\n[Name]: {0}\n[Message]: {1}\n", e.getName(), e.getAlarmMessage())));
        }
    }
}
![](https://filescdn.proginn.com/9f3d8fc9dadf530ce39360fd812665ff/5155647dab4ff146540195d4d6cdb7e6.webp)
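You do not have to wait for a real alarm to try the controller out. The sketch below posts a hand-written payload with the same fields as AlarmMessageDTO to the endpoint; the field values, the rule name, and the address http://127.0.0.1:8080 (Spring Boot's default port) are assumptions of mine, and the naive string concatenation does no JSON escaping, so keep the test values simple.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Posts a made-up alarm payload to the /skywalking/alarm endpoint for a smoke test. */
final class AlarmWebhookSmokeTest {

    /** Builds a one-element JSON array shaped like AlarmMessageDTO (values are illustrative). */
    static String samplePayload(String name, String alarmMessage, long startTime) {
        return "[{"
                + "\"scopeId\":1,"
                + "\"scope\":\"SERVICE\","
                + "\"name\":\"" + name + "\","
                + "\"id0\":\"demo-id\","
                + "\"id1\":\"\","
                + "\"ruleName\":\"demo_rule\","
                + "\"alarmMessage\":\"" + alarmMessage + "\","
                + "\"startTime\":" + startTime
                + "}]";
    }

    public static void main(String[] args) throws Exception {
        String url = args.length > 0 ? args[0] : "http://127.0.0.1:8080/skywalking/alarm";
        String body = samplePayload("user-center", "demo alarm, please ignore", System.currentTimeMillis());
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
    }
}
```

Keep in mind that a successful run sends a real message to the DingTalk group behind the configured webhook.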
References:
https://skywalking.apache.org/
https://skywalking.apache.org/zh/
https://github.com/apache/skywalking/tree/v8.2.0/docs
https://archive.apache.org/dist/
https://www.elastic.co/guide/en/elasticsearch/reference/master/index.html
https://www.elastic.co/guide/en/elasticsearch/reference/7.10/modules-discovery-bootstrap-cluster.html
https://www.elastic.co/guide/en/elasticsearch/reference/7.10/modules-discovery-hosts-providers.html
Finally, thanks for reading.