Monitoring Spring Boot application JVMs with ELK and Jolokia

Date: 2022-07-22

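Most applications today are deployed as Spring Boot services packaged as jar files, and the runtime state of the JVM is critical, so JVM monitoring was integrated into ELK. The steps are as follows: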

1 Spring Boot project configuration

For a Spring Boot project, the only configuration needed is to add Jolokia support to the pom file:

<!-- Enable Spring Boot health monitoring -->
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- Jolokia JMX monitoring -->
<dependency>
	<groupId>org.jolokia</groupId>
	<artifactId>jolokia-core</artifactId>
	<version>1.5.0</version>
</dependency>

Then start the Spring Boot project; the Jolokia load messages appear in the startup log.

After startup, Jolokia can be reached with an HTTP GET request (the Spring Boot application listens on port 18002 here):

# Request
curl -XGET http://127.0.0.1:18002/jolokia/version
# Response JSON
{"request":{"type":"version"},"value":{"agent":"1.5.0","protocol":"7.2","config":{"listenForHttpService":"true","authIgnoreCerts":"false","agentId":"172.16.20.237-7092-7971c2a9-servlet","debug":"false","agentType":"servlet","policyLocation":"classpath:/jolokia-access.xml","agentContext":"/jolokia","serializeException":"false","mimeType":"text/plain","dispatcherClasses":"org.jolokia.http.Jsr160ProxyNotEnabledByDefaultAnymoreDispatcher","authMode":"basic","streaming":"true","canonicalNaming":"true","historyMaxEntries":"10","allowErrorDetails":"true","allowDnsReverseLookup":"true","realm":"jolokia","includeStackTrace":"true","useRestrictorService":"false","debugMaxEntries":"100"},"info":{"product":"tomcat","vendor":"Apache","version":"8.5.15"}},"timestamp":1530178445,"status":200}

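If a JSON document like this is returned, Jolokia has started successfully. If the project uses Shiro or another framework for authorization, the Jolokia API paths must be excluded from its access checks. For the Jolokia API, see: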

https://jolokia.org/reference/html/protocol.html
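The check described above can be sketched in a few lines of Ruby. This is only an illustration: the sample body is a trimmed copy of the response shown earlier, and the helper name jolokia_ok? is hypothetical, not part of Jolokia.

```ruby
require 'json'

# Trimmed copy of the /jolokia/version response shown above.
SAMPLE_RESPONSE = '{"request":{"type":"version"},"value":{"agent":"1.5.0","protocol":"7.2"},"timestamp":1530178445,"status":200}'

# Hypothetical helper: true when the body is a JSON object whose status is 200.
def jolokia_ok?(body)
  parsed = JSON.parse(body)
  parsed.is_a?(Hash) && parsed['status'] == 200
rescue JSON::ParserError
  false
end

puts jolokia_ok?(SAMPLE_RESPONSE)
puts jolokia_ok?('not json')
```

A check like this is useful before wiring the endpoint into a collector, since an authorization filter usually answers with an HTML login page rather than JSON.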

2 Beats configuration

ELK collects data with Beats. Because Jolokia exposes an HTTP API, the data can be collected by polling that API on a schedule. That led to execbeat, a Beat published on GitHub.

# GitHub repository
https://github.com/christiangalsterer/execbeat

You need to compile it yourself. The execbeat YAML configuration is as follows:

################# Execbeat Configuration Example #######################

############################# Input ############################################
execbeat:

  execs:
    # Each - Commands to execute.
    - 
     cron: "@every 30s"
     command: "curl"
     args: "http://192.168.22.144:8187/jolokia/read/java.lang:type=Memory/"
     document_type: "Memory"
     fields:
       index_name: "springboot"
       application_name: "radar-api"
    - 
     cron: "@every 30s"
     command: "curl"
     args: "http://192.168.22.144:8187/jolokia/read/java.lang:type=GarbageCollector,name=*/"
     document_type: "GC"
     fields:
       index_name: "springboot"
       application_name: "radar-api"
    - 
     cron: "@every 30s"
     command: "curl"
     args: "http://192.168.22.144:8187/jolokia/read/java.lang:type=Threading/"
     document_type: "Threading"
     fields:
       index_name: "springboot"
       application_name: "radar-api"

############################# Output ##########################################

# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.
output:
  redis:
    host_topology: 'eth0'
    host: '192.168.21.120'
    port: 6379
    save_topology: true
    index: 'execbeat'
    db: 0
    db_topology: 1

A custom field, application_name, was added so that several JVMs on the same server can be monitored later.
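When several JVMs are monitored, the execs entries become repetitive. As a rough sketch, they could be generated with a small Ruby script. The second application (example-app on port 8188) and the helper name exec_entry are assumptions made up for illustration; the keys and the three MBeans are the ones used in the configuration above.

```ruby
# MBeans polled in the configuration above, keyed by document_type.
MBEANS = {
  'Memory'    => 'java.lang:type=Memory',
  'GC'        => 'java.lang:type=GarbageCollector,name=*',
  'Threading' => 'java.lang:type=Threading'
}.freeze

# Assumed example: two JVMs on one host, distinguished by port.
APPS = {
  'radar-api'   => 'http://192.168.22.144:8187',
  'example-app' => 'http://192.168.22.144:8188'
}.freeze

# Hypothetical helper: render one execs entry as YAML text.
def exec_entry(app_name, base_url, doc_type, mbean)
  <<~ENTRY
    - cron: "@every 30s"
      command: "curl"
      args: "#{base_url}/jolokia/read/#{mbean}/"
      document_type: "#{doc_type}"
      fields:
        index_name: "springboot"
        application_name: "#{app_name}"
  ENTRY
end

entries = APPS.flat_map do |name, url|
  MBEANS.map { |doc_type, mbean| exec_entry(name, url, doc_type, mbean) }
end

# Indent every line so the entries nest under execbeat.execs.
indented = entries.join.each_line.map { |line| "    #{line}" }.join
puts "execbeat:\n  execs:\n#{indented}"
```

Each application then contributes one entry per MBean, and application_name is what separates the JVMs in Kibana.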

3 Logstash configuration

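Data collected by Beats is stored in Redis, and several Logstash nodes are deployed to consume it with load balancing. Storing the data in Kafka is also worth considering: its advantage is that you no longer have to worry about an Elasticsearch outage causing the backlog to exhaust the memory of the Redis server.

#01-redis-input.conf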

  # Spring Boot monitoring data
  redis {
    data_type => "list"
    key => "execbeat"
    host => "192.168.21.120"
    port => 6379
    db => 0
    threads => 1
  }

#15-execbeat-filter.conf

## Note: this configuration was previously also used to monitor Spring Boot health and the ActiveMQ JVM, so those branches have been kept.
filter{
  if [beat][name] == "execbeat" {
    if [fields][index_name] in ["activemq","springboot"] {
      # Drop all events whose request output is empty
      if ![exec][stdout] or [exec][stdout] == "" {
        drop{}
      }  
      else if [type] == "Health" {
        json {
          source => ["[exec][stdout]"]
          target => "Health"
        }
      } else if [type] == "Threading" {
        json {
          source => ["[exec][stdout]"]
          target => "Threading"
        }
      } else if [type] == "Memory" {
        json {
          source => ["[exec][stdout]"]
          target => "Memory"
        }
      } else if [type] == "Queue" {
        json {
          source => ["[exec][stdout]"]
          target => "Queue"
        }
        ruby {
          code => "
            params = event.get('Queue')['value'] && event.get('Queue')['value'].to_hash
            params.keys.each { |k| params[ k.split(',')[1].split('=')[1] ] = params.delete(k) if k.include?',' } unless params.nil?
            params.keys.each { |k| params[ k.gsub('.','_') ] = params.delete(k) if k.include?'.' } unless params.nil?
            event.set('Queue',params)
          "
        }
      } else if [type] == "Topic" {
        json {
          source => ["[exec][stdout]"]
          target => "Topic"
        }
        ruby {
          code => "
            params = event.get('Topic')['value'] && event.get('Topic')['value'].to_hash
            params.keys.each { |k| params[ k.split(',')[1].split('=')[1] ] = params.delete(k) if k.include?',' } unless params.nil?
            params.keys.each { |k| params[ k.gsub('.','_') ] = params.delete(k) if k.include?'.' } unless params.nil?
            event.set('Topic',params)
          "
        }
      } else if [type] == "GC" {
        json {
          source => ["[exec][stdout]"]
          target => "GC"
        }
        ruby {
          code => "
            params = event.get('GC')['value'] && event.get('GC')['value'].to_hash
            params.keys.each { |k| params[ k.split(',')[0].split('=')[1] ] = params.delete(k) if k.include?',' } unless params.nil?
            params.keys.each { |k| params[ k.gsub('.','_') ] = params.delete(k) if k.include?'.' } unless params.nil?
            event.set('GC',params)
          "
        }
      }
      # Remove curl's raw stderr and stdout fields in one place
      mutate {
        remove_field => ["[exec][stderr]","[exec][stdout]"]
      }

      # Drop events that failed Ruby parsing
      if "_rubyexception" in [tags] {
        drop{}
      }    

    } else if  [fields][index_name] == "springcloud"  {
      if [type] == "health" {
        json {
          source => ["[exec][stdout]"]
          target => "health"
        }
        mutate {
          remove_field => ["[exec][stderr]","[exec][stdout]"]
        }
      } else if [type] == "metrics" {
        json {
          source => ["[exec][stdout]"]
          target => "metrics"
        }
        ruby {
          code => "
            params = event.get('metrics') && event.get('metrics').to_hash
            params.keys.each { |k| params[ k.gsub('.','_') ] = params.delete(k) if k.include?'.' } unless params.nil?
            event.set('metrics',params)
          "
        }
        mutate {
          remove_field => ["[exec][stderr]","[exec][stdout]"]
        }
      }
    }
  }
}

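The configuration above also supports monitoring ActiveMQ and Spring Cloud health. One pitfall: because the -s option could not be added to curl in execbeat, curl emits a lot of progress output, which ends up being handled in the Logstash filter. It would also be possible to modify the execbeat source from GitHub, but given limited Go experience, the handling was done in the Ruby scripts instead.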
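The key renaming done by the ruby filter blocks above can be tried outside Logstash. The following standalone sketch mirrors the two passes (shorten the MBean name to one property value, then replace dots) in a single non-mutating helper. The helper name normalize_keys and the sample counts are made up for illustration.

```ruby
# Shorten Jolokia MBean names to one property value and replace dots in keys,
# mirroring the two passes in the filter above.
# property_index selects which comma-separated property to keep:
# 0 for GC (name=...), 1 for Queue and Topic in the filter above.
def normalize_keys(value, property_index)
  value.each_with_object({}) do |(key, attrs), out|
    short = key.include?(',') ? key.split(',')[property_index].split('=')[1] : key
    out[short.gsub('.', '_')] = attrs
  end
end

# Made-up sample in the shape of the "value" object of a GarbageCollector read.
gc_value = {
  'java.lang:name=ParNew,type=GarbageCollector' => { 'CollectionCount' => 12 },
  'java.lang:name=ConcurrentMarkSweep,type=GarbageCollector' => { 'CollectionCount' => 3 }
}

p normalize_keys(gc_value, 0).keys
# => ["ParNew", "ConcurrentMarkSweep"]
```

Short keys such as ParNew are what make field paths like GC.ParNew.LastGcInfo.duration possible in Kibana, and replacing dots keeps a key from being interpreted as a nested field path.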

#30-elasticsearch-output.conf

output {
 if [fields][index_name] == "springboot" {
    elasticsearch {
      hosts => ["192.168.21.23:9201","192.168.21.23:9202","192.168.21.23:9203"]
      sniffing => false
      manage_template => false
      index => "springboot-%{+YYYY.MM.dd}"
    }
  }
}

After Logstash is started, the data shows up in Kibana.

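Three kinds of data are collected here: GC, Memory and Threading. If more is needed, consult the Jolokia documentation and collect the additional data you require.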
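As an illustration of collecting more data, further read URLs can be built in the same pattern as the ones used for execbeat. The helper jolokia_read_url is hypothetical. The MBeans shown are standard JVM platform MBeans, and the base URL reuses the port from the earlier curl example.

```ruby
# Hypothetical helper: build a Jolokia GET read URL of the form
# <base>/read/<mbean>[/<attribute>].
# Note: MBean names containing "/" would need Jolokia's escaping, omitted here.
def jolokia_read_url(base, mbean, attribute = nil)
  [base.chomp('/'), 'read', mbean, attribute].compact.join('/')
end

base = 'http://127.0.0.1:18002/jolokia'
puts jolokia_read_url(base, 'java.lang:type=ClassLoading')
puts jolokia_read_url(base, 'java.lang:type=OperatingSystem', 'SystemLoadAverage')
puts jolokia_read_url(base, 'java.lang:type=Runtime', 'Uptime')
```

Each resulting URL can be added as another execs entry with its own document_type, plus a matching branch in the Logstash filter.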

4 Kibana monitoring charts

Visualizations export

[
  {
    "_id": "d7619300-792f-11e8-b57c-e55ae913cda6",
    "_type": "visualization",
    "_source": {
      "title": "springboot 守护线程数",
      "visState": "{"title":"springboot 守护线程数","type":"line","params":{"grid":{"categoryLines":false,"style":{"color":"#eee"}},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"守护线程数"}}],"seriesParams":[{"show":"true","type":"line","mode":"normal","data":{"label":"守护线程数","id":"1"},"valueAxis":"ValueAxis-1","drawLinesBetweenPoints":true,"showCircles":true}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"Threading.value.DaemonThreadCount","customLabel":"守护线程数"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"3","enabled":true,"type":"terms","schema":"group","params":{"field":"fields.application_name.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"应用名称"}},{"id":"4","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  },
  {
    "_id": "77c2a510-7911-11e8-bbc2-599289cbcc66",
    "_type": "visualization",
    "_source": {
      "title": "springboot CMS回收器GC耗时",
      "visState": "{"title":"springboot CMS回收器GC耗时","type":"area","params":{"grid":{"categoryLines":true,"style":{"color":"#eee"},"valueAxis":"ValueAxis-1"},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"CMS回收器GC耗时"}}],"seriesParams":[{"show":"true","type":"area","mode":"stacked","data":{"label":"CMS回收器GC耗时","id":"1"},"drawLinesBetweenPoints":true,"showCircles":true,"interpolate":"linear","valueAxis":"ValueAxis-1"}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"GC.ConcurrentMarkSweep.LastGcInfo.duration","customLabel":"CMS回收器GC耗时"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"4","enabled":true,"type":"terms","schema":"group","params":{"field":"fields.application_name.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"应用名称"}},{"id":"5","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  },
  {
    "_id": "9ab9dcf0-791b-11e8-a6ce-c519e8849015",
    "_type": "visualization",
    "_source": {
      "title": "springboot 堆内存大小",
      "visState": "{"title":"springboot 堆内存大小","type":"line","params":{"grid":{"categoryLines":true,"style":{"color":"#eee"},"valueAxis":"ValueAxis-1"},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"堆内存"}}],"seriesParams":[{"show":"true","type":"line","mode":"normal","data":{"label":"堆内存","id":"1"},"valueAxis":"ValueAxis-1","drawLinesBetweenPoints":true,"showCircles":true}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"Memory.value.HeapMemoryUsage.used","customLabel":"堆内存"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"3","enabled":true,"type":"terms","schema":"group","params":{"field":"fields.application_name.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"应用名称"}},{"id":"4","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  },
  {
    "_id": "30d03400-7917-11e8-a6ce-c519e8849015",
    "_type": "visualization",
    "_source": {
      "title": "springboot PN回收器GC耗时",
      "visState": "{"title":"springboot PN回收器GC耗时","type":"area","params":{"grid":{"categoryLines":true,"style":{"color":"#eee"},"valueAxis":"ValueAxis-1"},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"PN回收器GC耗时"}}],"seriesParams":[{"show":"true","type":"area","mode":"stacked","data":{"label":"PN回收器GC耗时","id":"1"},"drawLinesBetweenPoints":true,"showCircles":true,"interpolate":"linear","valueAxis":"ValueAxis-1"}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"GC.ParNew.LastGcInfo.duration","customLabel":"PN回收器GC耗时"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"3","enabled":true,"type":"terms","schema":"group","params":{"field":"fields.application_name.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"应用名称"}},{"id":"4","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  },
  {
    "_id": "11b2d7f0-792e-11e8-a6ce-c519e8849015",
    "_type": "visualization",
    "_source": {
      "title": "springboot 总线程数",
      "visState": "{"title":"springboot 总线程数","type":"line","params":{"grid":{"categoryLines":true,"style":{"color":"#eee"},"valueAxis":"ValueAxis-1"},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"总线程数"}}],"seriesParams":[{"show":"true","type":"line","mode":"normal","data":{"label":"总线程数","id":"1"},"valueAxis":"ValueAxis-1","drawLinesBetweenPoints":true,"showCircles":true}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"Threading.value.ThreadCount","customLabel":"总线程数"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"3","enabled":true,"type":"terms","schema":"group","params":{"field":"fields.application_name.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"应用名称"}},{"id":"4","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  },
  {
    "_id": "01da8ef0-792d-11e8-a6ce-c519e8849015",
    "_type": "visualization",
    "_source": {
      "title": "springboot 非堆内存大小",
      "visState": "{"title":"springboot 非堆内存大小","type":"line","params":{"grid":{"categoryLines":true,"style":{"color":"#eee"},"valueAxis":"ValueAxis-1"},"categoryAxes":[{"id":"CategoryAxis-1","type":"category","position":"bottom","show":true,"style":{},"scale":{"type":"linear"},"labels":{"show":true,"truncate":100},"title":{"text":"时间"}}],"valueAxes":[{"id":"ValueAxis-1","name":"LeftAxis-1","type":"value","position":"left","show":true,"style":{},"scale":{"type":"linear","mode":"normal"},"labels":{"show":true,"rotate":0,"filter":false,"truncate":100},"title":{"text":"非堆内存"}}],"seriesParams":[{"show":"true","type":"line","mode":"normal","data":{"label":"非堆内存","id":"1"},"valueAxis":"ValueAxis-1","drawLinesBetweenPoints":true,"showCircles":true}],"addTooltip":true,"addLegend":true,"legendPosition":"right","times":[],"addTimeMarker":true},"aggs":[{"id":"1","enabled":true,"type":"max","schema":"metric","params":{"field":"Memory.value.NonHeapMemoryUsage.used","customLabel":"非堆内存"}},{"id":"2","enabled":true,"type":"date_histogram","schema":"segment","params":{"field":"@timestamp","interval":"auto","customInterval":"2h","min_doc_count":1,"extended_bounds":{},"customLabel":"时间"}},{"id":"3","enabled":true,"type":"terms","schema":"split","params":{"field":"beat.hostname.keyword","size":5,"order":"desc","orderBy":"1","customLabel":"机器名","row":true}}],"listeners":{}}",
      "uiStateJSON": "{}",
      "description": "",
      "version": 1,
      "kibanaSavedObjectMeta": {
        "searchSourceJSON": "{"index":"springboot-*","query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":[]}"
      }
    }
  }
]

The above is the exported configuration of the charts that were built; they can be combined into a dashboard.

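That completes the JVM monitoring charts; configure whatever additional charts you need. As for JVM alerting, the official built-in alerting feature requires a paid license, so you can build your own alerting system that periodically queries the data in Elasticsearch and compares it against configured thresholds.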
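A minimal sketch of such a threshold check follows, under stated assumptions. The query uses the field names from the visualizations above, while the 5-minute window, the threshold and the sample response are made up for illustration. Sending the query to Elasticsearch and delivering the alert are left out.

```ruby
require 'json'

# Query body: maximum heap used per application over the last 5 minutes.
# Field names come from the visualizations above; the window is an assumption.
HEAP_QUERY = {
  size: 0,
  query: { range: { '@timestamp' => { gte: 'now-5m' } } },
  aggs: {
    by_app: {
      terms: { field: 'fields.application_name.keyword' },
      aggs: { max_heap: { max: { field: 'Memory.value.HeapMemoryUsage.used' } } }
    }
  }
}.freeze

# Return the applications whose maximum heap usage exceeds the threshold (bytes).
def over_threshold(response, threshold_bytes)
  buckets = response.dig('aggregations', 'by_app', 'buckets') || []
  buckets.select { |b| b.dig('max_heap', 'value').to_f > threshold_bytes }
         .map { |b| b['key'] }
end

# Made-up response in the shape Elasticsearch returns for the query above.
sample = JSON.parse('{"aggregations":{"by_app":{"buckets":[
  {"key":"radar-api","doc_count":10,"max_heap":{"value":900000000.0}},
  {"key":"example-app","doc_count":10,"max_heap":{"value":200000000.0}}]}}}')

puts JSON.pretty_generate(HEAP_QUERY)
p over_threshold(sample, 800_000_000)
# => ["radar-api"]
```

Run on a schedule against the springboot-* indices, a check like this yields the list of applications to alert on. The same pattern would apply to thread counts or GC durations by swapping the aggregated field.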