Setting up a log monitoring platform with Grafana on CentOS
Overview
Loki is the main server; it stores the logs and processes queries.
Promtail is the agent; it collects logs and ships them to Loki.
Grafana provides the UI.
loki
Download
curl -O -L "https://github.com/grafana/loki/releases/download/v2.3.0/loki-linux-amd64.zip"
Install
mkdir -p /home/gather/data/loki/{chunks,index}
unzip loki-linux-amd64.zip
mv loki-linux-amd64 /home/gather/data/loki/
chmod a+x /home/gather/data/loki/loki-linux-amd64
Download the configuration file
wget https://raw.githubusercontent.com/grafana/loki/master/cmd/loki/loki-local-config.yaml
Edit the Loki configuration file
auth_enabled: false
server:
  http_listen_port: 3100 # listen port
ingester:
  lifecycler:
    address: 127.0.0.1 # instance address
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h       # Any chunk not receiving new logs in this time will be flushed
  max_chunk_age: 1h           # All chunks will be flushed when they hit this age, default is 1h
  chunk_target_size: 1048576  # Loki will attempt to build chunks up to 1MB (1048576 bytes), flushing first if chunk_idle_period or max_chunk_age is reached first
  chunk_retain_period: 30s    # Must be greater than index read cache TTL if using an index cache (Default index read cache TTL is 5m)
  max_transfer_retries: 0     # Chunk transfers disabled
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /home/gather/data/loki/boltdb-shipper-active
    cache_location: /home/gather/data/loki/boltdb-shipper-cache
    cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: filesystem
  filesystem:
    directory: /home/gather/data/loki/chunks
compactor:
  working_directory: /home/gather/data/loki/boltdb-shipper-compactor
  shared_store: filesystem
limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
chunk_store_config:
  max_look_back_period: 0s
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
ruler:
  storage:
    type: local
    local:
      directory: /home/gather/data/loki/rules
  rule_path: /home/gather/data/loki/rules-temp
  alertmanager_url: http://localhost:9093 # Alertmanager address for alerts
  ring:
    kvstore:
      store: inmemory
  enable_api: true
Start Loki
cd /home/gather/data/loki
# Start Loki
nohup ./loki-linux-amd64 -config.file=loki-local-config.yaml > loki.log 2>&1 &
# Start with debug-level logging
nohup ./loki-linux-amd64 --log.level=debug -config.file=./loki-local-config.yaml > /opt/logs/loki-3100.log 2>&1 &
# Check that startup succeeded (a process should be listening on port 3100)
netstat -tunlp | grep 3100
# Find the process by name (output like the following means Loki started successfully)
ps -ef | grep loki-linux-amd64
root 11037 22022 0 15:44 pts/0 00:00:55 ./loki-linux-amd64 -config.file=loki-local-config.yaml
Create a systemd service for Loki
Create the service file
vim /usr/lib/systemd/system/loki.service
Add the following configuration
[Unit]
Description=loki
Documentation=https://github.com/grafana/loki/tree/master
After=network.target
[Service]
Type=simple
User=root
# Adjust the paths below as needed; shell redirection is not supported in ExecStart,
# so stdout/stderr go to the systemd journal instead
ExecStart=/usr/local/src/loki-linux-amd64 -config.file=/usr/local/src/loki-local-config.yaml
Restart=on-failure
[Install]
WantedBy=multi-user.target
Register the system service
# Reload systemd
systemctl daemon-reload
# Start the service
systemctl start loki
# Check the service status
systemctl status loki
# Enable the service at boot
systemctl enable loki
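If the service does not come up, its output can be inspected through the systemd journal (the unit name matches the service file created above):
# Follow Loki's log output
journalctl -u loki -f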
promtail
Download
curl -O -L "https://github.com/grafana/loki/releases/download/v2.3.0/promtail-linux-amd64.zip"
Install
mkdir -p /home/gather/data/promtail/logs
unzip promtail-linux-amd64.zip
mv promtail-linux-amd64 /home/gather/data/promtail/
chmod a+x /home/gather/data/promtail/promtail-linux-amd64
Download the Promtail configuration file
wget https://raw.githubusercontent.com/grafana/loki/master/cmd/promtail/promtail-local-config.yaml
Edit the Promtail configuration file; the following is an example configuration
# promtail configuration file
server:
  http_listen_port: 9080
  grpc_listen_port: 0
# Positions file (tracks how far each log file has been read)
positions:
  filename: /tmp/positions.yaml
# Address of the Loki server
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: jm-admin
    pipeline_stages:
      - match:
          selector: '{job="jm-admin"}'
          stages:
            # a sample line matching this expression is shown after the config
            - regex:
                expression: '^(?P<time>[\d\s-:,]*)(?P<level>[a-zA-Z]+)\s(?P<pid>[\d]+)\s(?P<content>.*)$'
            - labels:
                level:
                content:
                pid:
                time:
    static_configs:
      - targets:
          - localhost
        labels:
          job: jm-admin
          host: localhost
          __path__: /mnt/data/jm/jm-admin/logs/project.artifactId_IS_UNDEFINED/debug.log
  - job_name: tcc-common
    static_configs:
      - targets:
          - localhost
        labels:
          job: tcc-common
          host: localhost
          __path__: /mnt/data/tcc/common/log/*.log
  - job_name: tcc-admin
    pipeline_stages:
      - match:
          selector: '{job="tcc-admin"}'
          stages:
            - json:
                expressions:
                  timej: time
                  pidj: pid
                  levelj: level
            - labels:
                levelj:
                pidj:
                timej:
    static_configs:
      - targets:
          - localhost
        labels:
          job: tcc-admin
          host: localhost
          __path__: /mnt/data/tcc/admin/log*.log
  - job_name: sqhgy-admin
    static_configs:
      - targets:
          - localhost
        labels:
          job: sqhgy-admin
          host: localhost
          __path__: /mnt/data/sq/sqServer/admin/log/*.log
  - job_name: sqhgy-api
    static_configs:
      - targets:
          - localhost
        labels:
          job: sqhgy-api
          host: localhost
          __path__: /mnt/data/sq/sqServer/api/log/*.log
  - job_name: sqhgy-common
    static_configs:
      - targets:
          - <ip-address>
        labels:
          job: sqhgy-common
          host: <ip-address>
          __path__: /mnt/data/sq/sqServer/common/log/*.log
  - job_name: createDataServer
    static_configs:
      - targets:
          - 127.0.0.1
        labels:
          job: createDataServer
          host: 127.0.0.1
          __path__: /home/gather/data/createDataServer/log/*/*.log
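For reference, the regex stage in the jm-admin job above expects log lines of roughly the following shape (a made-up sample line; the named capture groups become the time, level, pid and content labels):
2021-09-01 15:44:00,123 INFO 11037 Started JmAdminApplication in 6.3 seconds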
Start Promtail (adjust the paths as needed)
nohup ./promtail-linux-amd64 -config.file=promtail-local-config.yaml > /home/gather/data/promtail/logs/promtail-9080.log 2>&1 &
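A quick check that Promtail came up, mirroring the Loki check above (9080 is the http_listen_port from the configuration):
netstat -tunlp | grep 9080
ps -ef | grep promtail-linux-amd64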
Additional ways to send logs
Add the dependency
<dependency>
    <groupId>cn.allbs</groupId>
    <artifactId>allbs-logback</artifactId>
    <version>1.1.5</version>
</dependency>
Add the following configuration to the application yml
allbs:
  logging:
    console:
      close-after-start: true
    files:
      enabled: true
    loki:
      enabled: true
      http-url: http://${LOKI_HOST}:3100/loki/api/v1/push
      metrics-enabled: true
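The ${LOKI_HOST} placeholder above is resolved from the environment (for example, export LOKI_HOST=192.168.1.10 before starting the application; the address is only illustrative). Logs can also be pushed straight to Loki's HTTP API, the same endpoint Promtail and the appender use; a minimal curl sketch, assuming Loki on localhost:3100 and an arbitrary manual-test job label:
# Push one log line with the current timestamp in nanoseconds
curl -s -X POST "http://localhost:3100/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  -d "{\"streams\": [{\"stream\": {\"job\": \"manual-test\"}, \"values\": [[\"$(date +%s%N)\", \"hello from curl\"]]}]}"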
Create a systemd service for Promtail
Create the service file
vim /usr/lib/systemd/system/promtail.service
Add the following configuration
[Unit]
Description=promtail
Documentation=https://github.com/grafana/loki/tree/master
After=network.target
[Service]
Type=simple
User=root
# Adjust the paths below as needed; shell redirection is not supported in ExecStart,
# so stdout/stderr go to the systemd journal instead
ExecStart=/usr/local/src/promtail-linux-amd64 -config.file=/usr/local/src/promtail-local-config.yaml
Restart=on-failure
[Install]
WantedBy=multi-user.target
Register the system service
# Reload systemd
systemctl daemon-reload
# Start the service
systemctl start promtail
# Check the service status
systemctl status promtail
# Enable the service at boot
systemctl enable promtail
grafana
Download
wget https://dl.grafana.com/oss/release/grafana-8.1.2-1.x86_64.rpm
Install
# Install with yum (resolves dependencies automatically)
sudo yum install grafana-8.1.2-1.x86_64.rpm
# Or install the downloaded rpm directly
rpm -ivh grafana-8.1.2-1.x86_64.rpm
Start and manage the service
# Reload systemd
systemctl daemon-reload
# Start Grafana
systemctl start grafana-server
# Enable at boot
systemctl enable grafana-server
# Check the status
systemctl status grafana-server
# Or start with the service command
service grafana-server start
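Instead of adding the Loki data source by hand in the UI later on, it can also be provisioned from a file. A minimal sketch, assuming the default provisioning directory created by the rpm and Loki running on localhost:3100:
# Provision a Loki data source (adjust the URL to your Loki server)
cat > /etc/grafana/provisioning/datasources/loki.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
EOF
# Restart Grafana so the provisioned data source is picked up
systemctl restart grafana-server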
Verify
API verification
curl "http://127.0.0.1:3100/api/prom/label"
curl localhost:3100/loki/api/v1/labels
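A readiness check and a small range query confirm that logs are actually queryable; the jm-admin job name comes from the Promtail configuration above, and Loki is assumed to be on localhost:3100:
# Loki readiness endpoint
curl http://localhost:3100/ready
# Fetch the latest log lines for one job (curl URL-encodes the LogQL selector)
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={job="jm-admin"}' \
  --data-urlencode 'limit=5'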
Usage
Notes
Open http://<server-ip>:3000 in a browser (Grafana listens on port 3000 by default).
The default username and password are both admin; you will be asked to change the password on first login.
Screenshots
Filtering for specific content
Syntax
- |=: the log line contains the string.
- !=: the log line does not contain the string.
- |~: the log line matches the regular expression.
- !~: the log line does not match the regular expression.
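A plain filter expression combining these operators, using the jm-admin job defined in the Promtail configuration above (the exact pattern is only an illustration):
{job="jm-admin"} |= "error" !~ "timeout|connection reset"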
Count all log lines over the last five minutes
count_over_time({job="jm-admin"}[5m])
Get the per-second rate of all non-timeout errors over the last ten seconds
rate({job="jm-admin"} |= "error" != "timeout" [10s])
Aggregation operators
Like PromQL, LogQL supports a subset of the built-in aggregation operators, which can be used to aggregate the elements of a single vector, producing a new vector with fewer elements but aggregated values.
Operator | Description |
---|---|
sum | calculate the sum over the labels |
min | select the minimum over the labels |
max | select the maximum over the labels |
avg | calculate the average over the labels |
stddev | calculate the population standard deviation over the labels |
stdvar | calculate the population standard variance over the labels |
count | count the number of elements in the vector |
bottomk | select the smallest k elements by sample value |
topk | select the largest k elements by sample value |
Get the top ten applications by log throughput, grouped by container
topk(10,sum(rate({job="fluent-bit"}[5m])) by(container))
Get the log count over the last five minutes, grouped by level
sum(count_over_time({job="fluent-bit"}[5m])) by (level)
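Filters and aggregations compose; for example, to count only error lines over the last five minutes and group them by the level label extracted by the jm-admin pipeline above (a sketch, not tied to a specific dashboard):
sum(count_over_time({job="jm-admin"} |= "error" [5m])) by (level)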