FileBeat

添码座 · Original · about a 2-minute read

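Installation and configuration

FileBeat is a lightweight log shipper that collects log files (such as .log files) and forwards them to an output.

Because FileBeat is already a fairly lightweight tool, it is deployed directly on the local machine here rather than in Docker.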

> cd /home/work
> curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.7.0-linux-x86_64.tar.gz
> tar xzvf filebeat-8.7.0-linux-x86_64.tar.gz
> cd filebeat-8.7.0-linux-x86_64

# List the available modules
> ./filebeat modules list
Enabled:

Disabled:
activemq
apache
......

For example, to enable the ActiveMQ module, run the following command.

> cd /home/work/filebeat-8.7.0-linux-x86_64
> ./filebeat modules enable activemq
Enabled activemq
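
The download URL, the archive name and the extracted directory name all embed the same version string, so a mismatch between them makes the `tar` or `cd` step fail. As a minimal sketch (the variable names are illustrative and not part of FileBeat), the version can be kept in a single shell variable and everything else derived from it:

```shell
# Keep the version in one place so the URL, archive and directory names cannot drift apart.
FILEBEAT_VERSION="8.7.0"
FILEBEAT_ARCHIVE="filebeat-${FILEBEAT_VERSION}-linux-x86_64.tar.gz"
# Strip the .tar.gz suffix to get the directory the archive unpacks to.
FILEBEAT_DIR="${FILEBEAT_ARCHIVE%.tar.gz}"
FILEBEAT_URL="https://artifacts.elastic.co/downloads/beats/filebeat/${FILEBEAT_ARCHIVE}"

echo "download:   ${FILEBEAT_URL}"
echo "unpacks to: ${FILEBEAT_DIR}"

# To actually install, uncomment the next line:
# cd /home/work && curl -L -O "${FILEBEAT_URL}" && tar xzvf "${FILEBEAT_ARCHIVE}" && cd "${FILEBEAT_DIR}"
```

Changing `FILEBEAT_VERSION` is then enough to switch every later command to another release.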

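Testing the output

This test prints whatever is typed on standard input (the keyboard) to standard output (the console).

Edit FileBeat's configuration file filebeat.yml, taking care not to break the YAML indentation.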

> cd /home/work/filebeat-8.7.0-linux-x86_64
> vi filebeat.yml

# Add an input of type stdin under filebeat.inputs
- type: stdin

# Comment out the default output.elasticsearch:
#output.elasticsearch:
  ## Array of hosts to connect to.
  #hosts: ["localhost:9200"]

# Specify a new output
output.console:
  pretty: true
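
Put together, the edits above amount to a very small configuration. The sketch below writes that configuration to a scratch file (the temporary path is an assumption, used so the real filebeat.yml is not touched) and counts lines that begin with a tab character, since YAML does not allow tabs for indentation. It assumes the `- type: stdin` entry sits under the `filebeat.inputs:` key, as in the stock filebeat.yml.

```shell
# Scratch copy of the stdin -> console configuration (hypothetical temp path).
CONFIG_FILE="$(mktemp)"
cat > "${CONFIG_FILE}" <<'EOF'
filebeat.inputs:
- type: stdin

output.console:
  pretty: true
EOF

# Build the pattern "^<TAB>" portably, then count lines that begin with a tab.
TAB_PATTERN="$(printf '^\t')"
TAB_LINES="$(grep -c "${TAB_PATTERN}" "${CONFIG_FILE}" || true)"
echo "lines starting with a tab: ${TAB_LINES}"

# With the real binary available, Filebeat's own check can be run as well:
# ./filebeat test config -c filebeat.yml
```

A non-zero count points at lines that should be re-indented with spaces.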

After saving the configuration, start FileBeat.

> cd /home/work/filebeat-8.7.0-linux-x86_64
> ./filebeat -c filebeat.yml

hello, lixingyun
{
  "@timestamp": "2023-04-23T12:30:53.093Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.7.0"
  },
  "input": {
    "type": "stdin"
  },
  "agent": {
    "name": "hadoop",
    "type": "filebeat",
    "version": "8.7.0",
    "ephemeral_id": "3ffaef25-5f6b-4151-b72f-2d737485497c",
    "id": "efae7ec0-4620-48e8-8af3-d1be30af29b8"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "host": {
    "name": "hadoop",
    "mac": [
      "00-50-56-3E-7A-48",
      "02-42-D3-05-7F-85"
    ],
    "hostname": "hadoop",
    "architecture": "x86_64",
    "os": {
      "kernel": "5.14.0-480.el9.x86_64",
      "type": "linux",
      "platform": "centos",
      "version": "9",
      "family": "redhat",
      "name": "CentOS Stream"
    },
    "id": "d25efc7f5ab94588a49d6dfabc6c18d1",
    "containerized": false,
    "ip": [
      "172.16.185.176",
      "fe80::250:56ff:fe3e:7a48"
    ]
  },
  "log": {
    "offset": 0,
    "file": {
      "path": ""
    }
  },
  "message": "hello, lixingyun"
}

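Sending data

Restore the configuration to its initial state, then modify it again so that the contents of the log files are sent to Kafka.

The simplest approach is to copy the official Kafka configuration example and adapt it.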

> cd /home/work/filebeat-8.7.0-linux-x86_64
> vi filebeat.yml

# Configure the input
# ============================== Filebeat inputs ===============================
filebeat.inputs:

- type: filestream
  id: my-filestream-id
  # Defaults to false; change it to true
  enabled: true
  # The stock configuration reads logs from /var/log; several log file paths can be listed here
  paths:
    - /var/log/*.log

# Configure the output
# ================================== Outputs ===================================

output.kafka:
  # initial brokers for reading cluster metadata
  hosts: ["server01:9092"]
  # message topic selection + partitioning
  topic: 'test'
  partition.round_robin:
    reachable_only: false

  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
  # Format the output: send only the contents of the message field
  codec.format:
    string: '%{[message]}'
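
The Kafka output encodes events as JSON when no codec is configured, so without `codec.format` each Kafka message would be a full event document like the one printed in the console test. With `string: '%{[message]}'`, only the original log line is sent. The following sketch merely imitates that effect locally with `sed` on a trimmed sample event; it is an illustration, not how FileBeat implements the codec.

```shell
# A trimmed sample event, shaped like the pretty-printed console output.
SAMPLE_EVENT="$(cat <<'EOF'
{
  "@timestamp": "2023-04-23T12:30:53.093Z",
  "input": {
    "type": "stdin"
  },
  "message": "hello, lixingyun"
}
EOF
)"

# Keep only the value of the top-level "message" field (two-space indent).
MESSAGE_ONLY="$(printf '%s\n' "${SAMPLE_EVENT}" | sed -n 's/^  "message": "\(.*\)",\{0,1\}$/\1/p')"
echo "${MESSAGE_ONLY}"   # prints: hello, lixingyun
```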

After saving the configuration, start FileBeat.

> cd /home/work/filebeat-8.7.0-linux-x86_64
> ./filebeat -c filebeat.yml

At this point, the data collected by FileBeat can be consumed from Kafka.

# Run from the Kafka installation directory, not the FileBeat directory (placeholder path)
> cd /path/to/kafka
> ./bin/kafka-console-consumer.sh --bootstrap-server 172.16.185.176:9092 --from-beginning --topic test
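
One way to confirm the pipeline end to end is to append a uniquely marked line to a watched log file and then look for that marker in the consumer output. A minimal sketch follows; `LOG_FILE` and `MARKER` are illustrative names, and the default target is a temporary file so the snippet is harmless to run. For FileBeat to pick the line up, `LOG_FILE` would have to point at a file matching the configured `/var/log/*.log` pattern, which usually requires root.

```shell
# Target file; override with e.g. LOG_FILE=/var/log/smoke.log (hypothetical name).
LOG_FILE="${LOG_FILE:-$(mktemp)}"
# Marker that is easy to spot in the consumer output; $$ is the shell's PID.
MARKER="filebeat-smoke-$$"

# Append one marked line, as an application writing to its log would.
echo "${MARKER} hello from filebeat" >> "${LOG_FILE}"
echo "wrote marker ${MARKER} to ${LOG_FILE}"
```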
