Big Data Components: Log Collection with Logstash and Filebeat, and Database Interaction (1+X Certificate Track)

2022-10-02 15:57:58 Author: Internet

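I. Using, Installing, and Deploying Logstash

1. What Logstash does

Logstash is a log collection and transformation tool that forms part of an ETL pipeline. It acts as a data collector: it gathers data in many formats from many sources, parses it, and sends the formatted output to Elasticsearch. Kibana then provides a friendly web interface for aggregating, analyzing, and searching that data.

2. Installing Logstash

First prepare the Logstash archive logstash-7.7.0.tar.gz, then extract it: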

tar zxvf logstash-7.7.0.tar.gz

After extraction, rename the extracted directory:

mv logstash-7.7.0 logstash

The extraction step is now complete.
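
The extract-and-rename steps above can be wrapped in one small shell helper. This is only a sketch: the function name unpack_as is made up for illustration, and it assumes the archive unpacks into a single top-level directory in the current working directory.

```shell
# Extract a .tar.gz archive into the current directory and rename
# the directory it unpacks into.
# usage: unpack_as <archive.tar.gz> <new-name>
unpack_as() {
  archive="$1"
  target="$2"
  if [ -e "$target" ]; then
    echo "$target already exists, refusing to overwrite" >&2
    return 1
  fi
  # Assumption: the first path component of the first entry is the
  # single top-level directory of the archive.
  top=$(tar tzf "$archive" | head -n 1 | cut -d/ -f1)
  if [ -z "$top" ]; then
    echo "could not determine top-level directory of $archive" >&2
    return 1
  fi
  tar zxf "$archive" || return 1
  # Rename only when the names differ.
  [ "$top" = "$target" ] || mv "$top" "$target"
}
```

For example, unpack_as logstash-7.7.0.tar.gz logstash does the same as the tar and mv commands above, and stops early if a logstash directory already exists.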

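3. Installing the logstash-output-jdbc plugin

Method 1 (online install): go to Logstash's bin directory and run:

./logstash-plugin install logstash-output-jdbc

The download is slow, so expect to wait about 5 minutes. On success the output looks like this: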

Validating logstash-output-jdbc
Installing logstash-output-jdbc
Installation successful

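Method 2 (offline install from logstash-output-jdbc.zip):

Again switch to Logstash's bin directory and run the command below. Note: my logstash-output-jdbc.zip is on the desktop; the part after file: is the path to logstash-output-jdbc.zip.

./logstash-plugin install file:/root/Desktop/logstash-output-jdbc.zip

After either method, check that the plugin was loaded:

./logstash-plugin list | grep jdbc

The result is: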

logstash-integration-jdbc
|__ logstash-input-jdbc
|__ logstash-filter-jdbc_streaming
|__ logstash-filter-jdbc_static
logstash-output-jdbc

4. Configuring a conf file and starting Logstash

Switch to Logstash's config directory (starting from the logstash directory):

cd config/

Create and open a conf file named 1.conf:

vim 1.conf

The contents are as follows (press i to enter insert mode):

input{
  stdin{
  }
}
output{
  stdout{
  }
}

Save and exit (press ESC, then type :wq).

Start Logstash with 1.conf (from the logstash directory):

bin/logstash -f config/1.conf

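If the following message appears and Logstash does not exit on its own, the start succeeded. (If this or any later conf file makes Logstash exit immediately on startup, go back and check the conf file for mistakes in content or syntax.)

Successfully started Logstash API endpoint {:port=>9600}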
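
A malformed conf file is the usual reason Logstash exits right away, so it can help to validate the file before starting the pipeline. The sketch below wraps Logstash's --config.test_and_exit flag in a helper; the function name check_pipeline and the LOGSTASH_HOME variable are assumptions made for illustration, not something Logstash defines.

```shell
# Parse a pipeline config and exit without starting the pipeline.
# LOGSTASH_HOME is an assumed variable pointing at the renamed
# logstash directory; it defaults to the current directory.
# usage: check_pipeline <path-to-conf>
check_pipeline() {
  conf="$1"
  if [ ! -f "$conf" ]; then
    echo "no such config file: $conf" >&2
    return 2
  fi
  "${LOGSTASH_HOME:-.}/bin/logstash" -f "$conf" --config.test_and_exit
}
```

From the logstash directory, check_pipeline config/1.conf returns zero only when Logstash accepts the file, so it can be chained in front of the real start command.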

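Type Mary and press Enter; seeing the event echoed back on the console means the pipeline works.

5. Configuring MySQL to receive data from Logstash

Open a terminal and log in to MySQL (replace password with your own root password):

mysql -uroot -ppassword

Create a new database named logstash:

create database logstash character set utf8;

Create a new table named logs: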

use logstash;

create table logs(  
id bigint primary key auto_increment,  
message varchar(4000),  
host varchar(100),  
create_time timestamp default now());

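After creating the table, set up the database driver: under Logstash's vendor directory create a directory named jar, inside it create a directory named jdbc, and put the driver mysql-connector-java-5.1.47.jar into the jdbc directory: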

[root@node1 jdbc]$ pwd
/root/Desktop/logstash/vendor/jar/jdbc
[root@node1 jdbc]$ ll
total 984
-rw-r--r-- 1 root root 1007502 Jun  4 15:08 mysql-connector-java-5.1.47.jar

Switch to Logstash's config directory and create the config file 2.conf:

vim 2.conf

The contents are as follows (password is your database root password):

input{
  stdin{
  }
}
output{
  jdbc{
    driver_class=>"com.mysql.jdbc.Driver"
    connection_string=>"jdbc:mysql://localhost:3306/logstash?characterEncoding=UTF-8"
    username=>"root"
    password=>"password"
    statement=>["insert into logs(message,host) values(?,?)","message","host"]
  }
}
output{
  stdout{
  }
}

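Save and exit, switch to Logstash's home directory, and start Logstash with 2.conf:

bin/logstash -f config/2.conf

Once it has started, enter some test input (type Jack, then Mary Rose). Then check in the database that the rows arrived: log in to MySQL, switch to the logstash database, and run:

select * from logs;

A result like the following means the insert succeeded: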

+----+-----------+------------+---------------------+
| id | message   | host       | create_time         |
+----+-----------+------------+---------------------+
|  1 | Jack      | node1.host | 2020-06-04 15:24:07 |
|  2 | Mary Rose | node1.host | 2020-06-04 15:24:25 |
+----+-----------+------------+---------------------+

II. Installing and Using Filebeat

1. Extract and rename the Filebeat archive filebeat-7.7.0-linux-x86_64.tar.gz:

tar zxvf filebeat-7.7.0-linux-x86_64.tar.gz

Rename the extracted directory to filebeat:

mv filebeat-7.7.0-linux-x86_64 filebeat

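2. Receiving Filebeat events in Logstash

Create a new configuration file, 1.yml, with the settings we need:

vim 1.yml

The contents are as follows (they can be copied as is):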

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /root/Desktop/logs/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

Save and exit. Then, in the filebeat directory, remove write permission on the file for group and others:

chmod go-w 1.yml
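
Filebeat checks the permissions of its configuration file at startup and refuses to load one that is writable by group or others, which is presumably why the chmod is needed here. As a rough sketch, the helper below (config_perms_ok is a made-up name) reports whether a file would pass that part of the check. It does not look at file ownership, which Filebeat also checks.

```shell
# Return 0 if the file is NOT writable by group or others, 1 otherwise.
# usage: config_perms_ok <file>
config_perms_ok() {
  # First 10 characters of ls -l output, e.g. "-rw-r--r--".
  perms=$(ls -ld "$1" | cut -c1-10)
  case "$perms" in
    # Character 6 is the group write bit, character 9 the others write bit.
    ?????w????|????????w?) return 1 ;;
    *) return 0 ;;
  esac
}
```

For example, config_perms_ok 1.yml || chmod go-w 1.yml only touches the file when it actually needs fixing.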

Next, go to Logstash's config directory and create 3.conf:

vim 3.conf

The contents are as follows:

input{
  beats{
    port=>5044
  }
}
output{
  stdout{
  }
}

Save and exit.

Go back to the filebeat directory and create and open 2.yml:

vim 2.yml

The contents are as follows:

filebeat.inputs:
- type: stdin

output.logstash:
  hosts: ["127.0.0.1:5044"]

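Save and exit.

Next, open a terminal, switch to the logstash directory, and start Logstash with 3.conf first:

bin/logstash -f config/3.conf

Open another terminal, switch to the filebeat directory, and start Filebeat with 2.yml:

./filebeat -e -c 2.yml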

After it starts, type Jerry. Switch to the terminal running Logstash and the event appears there, which shows that Filebeat is now delivering data to Logstash.
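
If nothing shows up on the Logstash side, it may be worth checking that Filebeat can parse its configuration and reach the Logstash port. Filebeat provides the test config and test output subcommands for this. The wrapper below is a sketch; check_filebeat and the FILEBEAT_HOME variable are names assumed for illustration.

```shell
# Run Filebeat's built-in config and output checks for one config file.
# FILEBEAT_HOME is an assumed variable pointing at the renamed
# filebeat directory; it defaults to the current directory.
# usage: check_filebeat <config.yml>
check_filebeat() {
  yml="$1"
  fb="${FILEBEAT_HOME:-.}/filebeat"
  # The output test runs only if the config test succeeds.
  "$fb" test config -c "$yml" && "$fb" test output -c "$yml"
}
```

Run from the filebeat directory as check_filebeat 2.yml while Logstash is already listening on port 5044; otherwise the output test is expected to fail.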

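3. Reading application logs and saving them to MySQL

Upload the Spring Boot project jar demo-1.0.jar to the desktop and start it from there:

java -jar demo-1.0.jar

Once it has run, it leaves a logs directory on the desktop containing the log files. Next, open a terminal, go to Logstash's config directory, and create and open 4.conf:

vim 4.conf

The contents are as follows (password is your database root password):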

input{
  beats{
    port=>5044
  }
}
output{
  jdbc{
    driver_class=>"com.mysql.jdbc.Driver"
    connection_string=>"jdbc:mysql://localhost:3306/logstash?characterEncoding=UTF-8"
    username=>"root"
    password=>"password"
    statement=>["insert into logs(message,host) values(?,?)","message","host"]
  }
}
output{
  stdout{
  }
}

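Save and exit.

Open another terminal, go to the filebeat directory, and create and open 3.yml:

vim 3.yml

The contents are as follows (note: paths is the location of the Spring Boot log files, and *.log matches only files with the .log extension):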

filebeat.inputs:
- type: log
  paths: 
  - /root/Desktop/logs/*.log

output.logstash:
  hosts: ["127.0.0.1:5044"]

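Open a terminal, go to the logstash directory, and start Logstash with 4.conf:

bin/logstash -f config/4.conf

Once it has started, leave it running. Open another terminal, go to the filebeat directory, and start Filebeat with 3.yml:

./filebeat -e -c 3.yml

Filebeat's terminal shows that it is running, and Logstash's terminal shows the incoming events. Finally, open MySQL and check how the logs table has changed:

select * from logs;

The result is as follows. For events sent by Filebeat, the host field is an object, so the host column holds its JSON form: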

+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+
| id | message                                                                                                                                                                                                          | host                  | create_time         |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+
|  1 | Jack                                                                                                                                                                                                             | node1.host            | 2020-06-04 15:24:07 |
|  2 | Mary Rose                                                                                                                                                                                                        | node1.host            | 2020-06-04 15:24:25 |
|  3 | 2020-06-04 16:05:31.656  INFO 2171 --- [main] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.35]                                                                         | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  4 | 2020-06-04 16:05:30.663  INFO 2171 --- [main] cn.inspur.DemoApplication                : Starting DemoApplication v1.0 on node1.host with PID 2171 (/root/Desktop/demo-1.0.jar started by root in /root/Desktop) | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  5 | 2020-06-04 16:05:31.655  INFO 2171 --- [main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]                                                                                               | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  6 | 2020-06-04 16:05:31.644  INFO 2171 --- [main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8888 (http)                                                                            | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  7 | 2020-06-04 16:05:31.997  INFO 2171 --- [main] cn.inspur.DemoApplication                : Started DemoApplication in 1.708 seconds (JVM running for 2.107)                                                        | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  8 | 2020-06-04 16:05:31.846  INFO 2171 --- [main] o.s.s.concurrent.ThreadPoolTaskExecutor  : Initializing ExecutorService applicationTaskExecutor                                                                  | {"name":"node1.host"} | 2020-06-04 16:15:18 |
|  9 | 2020-06-04 16:05:31.987  INFO 2171 --- [main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8888 (http) with context path                                                              | {"name":"node1.host"} | 2020-06-04 16:15:18 |
| 10 | 2020-06-04 16:05:30.665  INFO 2171 --- [main] cn.inspur.DemoApplication                : No active profile set, falling back to default profiles: default                                                        | {"name":"node1.host"} | 2020-06-04 16:15:18 |
| 11 | 2020-06-04 16:05:31.711  INFO 2171 --- [main] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext                                                                      | {"name":"node1.host"} | 2020-06-04 16:15:18 |
| 12 | 2020-06-04 16:05:31.712  INFO 2171 --- [main] o.s.web.context.ContextLoader            : Root WebApplicationContext: initialization completed in 997 ms                                                          | {"name":"node1.host"} | 2020-06-04 16:15:18 |
+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+---------------------+