Redis and Zabbix

Author: Internet

I. Redis features and application scenarios

Redis features

Typical Redis use cases

II. Comparing the pros and cons of Redis RDB and AOF persistence

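1. RDB (Redis DataBase) mode

How RDB works

RDB takes time-based snapshots of the dataset. By default only the most recent snapshot is kept. Snapshots are fast to take, but any data written between the last snapshot and the current moment is lost if the instance fails.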
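To make the data-loss window concrete, the sketch below evaluates `save <seconds> <changes>` rules: a snapshot becomes due once enough writes have accumulated and enough time has passed since the last save. The three rules are the defaults in the stock Redis 5 redis.conf; the function name `snapshot_due` and the `seconds:changes` encoding are made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: decide whether any "save <seconds> <changes>" rule fires.
# Rules are the stock Redis 5 defaults, encoded as seconds:changes.
SAVE_RULES="900:1 300:10 60:10000"

# usage: snapshot_due <seconds_since_last_save> <writes_since_last_save>
snapshot_due() {
    elapsed=$1
    changes=$2
    for rule in $SAVE_RULES; do
        seconds=${rule%%:*}
        min_changes=${rule##*:}
        # A rule fires when both its time and its change threshold are met.
        if [ "$elapsed" -gt "$seconds" ] && [ "$changes" -ge "$min_changes" ]; then
            echo yes
            return 0
        fi
    done
    echo no
}

snapshot_due 120 50     # prints "no": 50 writes may sit unsaved for up to 300 seconds
snapshot_due 301 50     # prints "yes": the 300/10 rule fires
```

With these rules, a handful of writes can stay unsaved for several minutes, which is the window of data that a crash would lose.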

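How RDB bgsave (asynchronous) takes a snapshot

RDB pros and cons

Pros

Cons

2. AOF (AppendOnlyFile) mode

How AOF works

AOF appends every write operation, in execution order, to the end of the designated log file.

Notes:

When both RDB and AOF are enabled, the AOF file takes priority over the RDB file at recovery time, so Redis restores from the AOF file.

AOF is disabled by default. If you enable it for the first time in redis.conf and restart the service, Redis loads the AOF file because it outranks RDB. No AOF file exists yet, so the instance starts empty and all existing data is lost.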
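One commonly used way to avoid this pitfall is to switch AOF on at runtime, so that Redis builds the first AOF file from the data already in memory, and only then persist the setting. The sketch below prints the commands by default (dry run). The `DRY_RUN` variable and the `run` helper are illustrative, and the password is the one used throughout this article.

```shell
#!/bin/bash
# Dry-run sketch: enable AOF on a running instance without restarting it.
# DRY_RUN and run() are illustrative helpers, not part of Redis.
DRY_RUN=${DRY_RUN:-1}
REDIS_PASSWORD=${REDIS_PASSWORD:-123456}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_aof_online() {
    # Redis starts a background rewrite that dumps the in-memory data to a new AOF.
    run redis-cli -a "$REDIS_PASSWORD" config set appendonly yes
    # Write the running configuration back to redis.conf so it survives a restart.
    run redis-cli -a "$REDIS_PASSWORD" config rewrite
}

enable_aof_online
```

Set `DRY_RUN=0` to execute the commands. Before restarting, it is worth checking that `aof_rewrite_in_progress` in the output of `info persistence` has returned to 0.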

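AOF rewrite

A rewrite writes a new, compact AOF file that leaves out redundant, mergeable, and expired entries. This saves the disk space taken by the AOF and speeds up recovery. You can trigger a rewrite manually with bgrewriteaof or define an automatic rewrite policy.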
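The automatic policy is driven by two redis.conf directives, `auto-aof-rewrite-percentage` and `auto-aof-rewrite-min-size` (100 and 64mb in the stock file). A minimal sketch of the decision follows, with sizes in bytes; the function name `aof_rewrite_due` is invented for illustration and the logic is a simplified model, not Redis source code.

```shell
#!/bin/bash
# usage: aof_rewrite_due <current_size> <size_after_last_rewrite> [percentage] [min_size]
aof_rewrite_due() {
    current=$1
    base=$2
    percentage=${3:-100}
    min_size=${4:-67108864}     # 64mb
    # A percentage of 0 disables automatic rewrites.
    if [ "$percentage" -eq 0 ]; then echo no; return 0; fi
    # Files at or below the minimum size are never rewritten automatically.
    if [ "$current" -le "$min_size" ]; then echo no; return 0; fi
    # Avoid dividing by zero when no rewrite has happened yet.
    [ "$base" -gt 0 ] || base=1
    growth=$(( current * 100 / base - 100 ))
    if [ "$growth" -ge "$percentage" ]; then echo yes; else echo no; fi
}

aof_rewrite_due 100000000 80000000     # grew 25%: prints "no"
aof_rewrite_due 200000000 80000000     # grew 150%: prints "yes"
```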

AOF rewrite process

AOF pros and cons

Pros

Cons

When to use RDB and AOF

III. Setting up Redis Sentinel and simulating a master failure

How it works

Implementing Sentinel mode

graph LR
M[Sentinel</br>10.0.0.7</br>master]
S1[Sentinel</br>10.0.0.17</br>slave1]
S2[Sentinel</br>10.0.0.27</br>slave2]
M---->S1
M---->S2

Configuring one master and two slaves

One-step Redis compile-and-install script

#!/bin/bash
# Compile and install Redis

source /etc/init.d/functions
# Redis version
Redis_version=redis-5.0.9
suffix=tar.gz
Redis=${Redis_version}.${suffix}
Password=123456

# Redis source download URL
redis_url=http://download.redis.io/releases/${Redis}
# Redis install path
redis_install_DIR=/apps/redis

# Number of CPUs
CPUS=`lscpu|grep "^CPU(s)"|awk '{print $2}'`
# OS type
os_type=`grep "^NAME" /etc/os-release |awk -F'"| ' '{print $2}'`
# OS version
os_version=`awk -F'"' '/^VERSION_ID/{print $2}' /etc/os-release`

color () {
if [[ $2 -eq 0 ]];then
    echo -e "\e[1;32m$1\t\t\t\t\t\t[  OK  ]\e[0;m"
else
    echo $2
    echo -e "\e[1;31m$1\t\t\t\t\t\t[ FAILED ]\e[0;m"
fi
}


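download_redis (){
# Install build dependencies
yum -y install gcc jemalloc-devel || { color "Failed to install dependencies, please check the network" 1 ;exit 1;}

cd /opt
if [ -e ${Redis} ];then
	color "Redis source tarball already exists" 0
else
	color "Downloading the Redis source tarball" 0
	wget ${redis_url}
	if [ $? -ne 0 ];then
		color "Failed to download the Redis source tarball, exiting!" 1
		exit 1
	fi
fi
}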


install_redis (){
# Unpack the source tarball
tar xvf /opt/${Redis} -C /usr/local/src
# -sfn replaces an existing link so the script can be re-run
ln -sfn /usr/local/src/${Redis_version} /usr/local/src/redis

# Compile and install
cd /usr/local/src/redis
make -j ${CPUS} install PREFIX=${redis_install_DIR}
if [ $? -ne 0 ];then
	color "Redis compile/install failed!" 1
	exit 1
else
	color "Redis compiled and installed successfully" 0
fi

ln -sf ${redis_install_DIR}/bin/redis-* /usr/sbin/

# Add the redis user (an existing user is not an error)
if id redis &> /dev/null;then
	color "redis user already exists" 0
else
	useradd -r -s /sbin/nologin redis
	color "redis user created" 0
fi
mkdir -p ${redis_install_DIR}/{etc,log,data,run}

# Prepare the Redis configuration file
cp redis.conf ${redis_install_DIR}/etc/
sed -i "s/^bind 127.0.0.1/bind 0.0.0.0/" ${redis_install_DIR}/etc/redis.conf
sed -i "/# requirepass/a requirepass ${Password}" ${redis_install_DIR}/etc/redis.conf
sed -i "s@^dir .*\$@dir ${redis_install_DIR}\/data@" ${redis_install_DIR}/etc/redis.conf
sed -i "s@^logfile .*\$@logfile ${redis_install_DIR}\/log\/redis-6379.log@" ${redis_install_DIR}/etc/redis.conf
sed -i "s@^pidfile .*\$@pidfile ${redis_install_DIR}\/run\/redis-6379.pid@" ${redis_install_DIR}/etc/redis.conf

chown -R redis:redis ${redis_install_DIR}

cat >> /etc/sysctl.conf <<EOF
net.core.somaxconn = 1024
vm.overcommit_memory = 1
EOF
sysctl -p

echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
source /etc/rc.d/rc.local


# Prepare the systemd service unit
cat > /usr/lib/systemd/system/redis.service <<EOF
[Unit]
Description=redis persistent key-value database
After=network.target

[Service]
ExecStart=${redis_install_DIR}/bin/redis-server ${redis_install_DIR}/etc/redis.conf --supervised systemd
ExecStop=/bin/kill -s QUIT \$MAINPID
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target
EOF

chown -R redis:redis ${redis_install_DIR}
systemctl daemon-reload
systemctl enable --now redis
systemctl is-active redis

if [ $? -ne 0 ];then
	color "Redis service failed to start!" 1
	exit 1
else
	color "Redis service started" 0
	color "Redis installation complete" 0
fi
}


download_redis

install_redis

exit 0

  1. Master node configuration

    # Edit redis.conf
    vim /apps/redis/etc/redis.conf
    bind 0.0.0.0
    masterauth "123456"
    requirepass "123456"
    
    # Restart Redis
    systemctl restart redis
    
  2. Slave node configuration

    # Edit redis.conf
    vim /apps/redis/etc/redis.conf
    bind 0.0.0.0
    masterauth "123456"
    requirepass "123456"
    replicaof 10.0.0.7 6379
    
    # Restart Redis
    systemctl restart redis
    
  3. Checking status

    master

    [root@master ~]# redis-cli -a 123456
    Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
    127.0.0.1:6379> info replication
    # Replication
    role:master
    connected_slaves:2
    slave0:ip=10.0.0.27,port=6379,state=online,offset=28,lag=1
    slave1:ip=10.0.0.17,port=6379,state=online,offset=28,lag=1
    master_replid:14883e4254918d97c50ec0f05c6b7b741e09cc59
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:28
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:28
    127.0.0.1:6379> 
    
    

    slave1

    [root@slave1 ~]# redis-cli -a 123456
    Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
    127.0.0.1:6379> info replication
    # Replication
    role:slave
    master_host:10.0.0.7
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:9
    master_sync_in_progress:0
    slave_repl_offset:154
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:14883e4254918d97c50ec0f05c6b7b741e09cc59
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:154
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:154
    127.0.0.1:6379> 
    

    slave2

    [root@slave2 ~]# redis-cli -a 123456
    Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
    127.0.0.1:6379> info replication
    # Replication
    role:slave
    master_host:10.0.0.7
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:5
    master_sync_in_progress:0
    slave_repl_offset:210
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:14883e4254918d97c50ec0f05c6b7b741e09cc59
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:210
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:210
    127.0.0.1:6379> 
    

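Editing the Sentinel configuration file

Sentinel is essentially a special-purpose Redis server: it supports some Redis commands, but many are not available. By default it listens on port 26379/tcp.

Sentinel does not have to run on the same hosts as the Redis servers, but it is commonly deployed alongside them.

cp /usr/local/src/redis/sentinel.conf /apps/redis/etc/redis-sentinel.conf
cd /apps/redis/etc/
# Configure Sentinel
[root@master etc]# grep "^[a-zA-Z]" redis-sentinel.conf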
bind 0.0.0.0
port 26379
daemonize yes
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel_26379.log
dir /apps/redis/data
sentinel monitor mymaster 10.0.0.7 6379 2
sentinel auth-pass mymaster 123456
sentinel down-after-milliseconds mymaster 3000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
sentinel deny-scripts-reconfig yes

# Start Sentinel
[root@master etc]# redis-sentinel /apps/redis/etc/redis-sentinel.conf 
# Inspect the Sentinel configuration after startup
[root@master etc]# grep "^[a-zA-Z]" redis-sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel_26379.log
dir /apps/redis/data

sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.0.0.7 6379 2
sentinel parallel-syncs mymaster 1
sentinel down-after-milliseconds mymaster 3000
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 0
# The following lines are generated automatically
sentinel myid c663d4b9db845d721cd6dccf608c7904d896b745      # myid must be unique
protected-mode no
sentinel leader-epoch mymaster 0
sentinel known-replica mymaster 10.0.0.27 6379
sentinel known-replica mymaster 10.0.0.17 6379
sentinel known-sentinel mymaster 10.0.0.27 26379 66f276f274802c6f0243007a2be4b04001b9867e
sentinel known-sentinel mymaster 10.0.0.17 26379 5d3a6880bd134e211c77bef6bc408ab63a1fd3ac
sentinel current-epoch 0

Configuring the Sentinel service

[root@shichu ~]# cat /lib/systemd/system/redis-sentinel.service
[Unit]
Description=Redis Sentinel
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/apps/redis/bin/redis-sentinel /apps/redis/etc/redis-sentinel.conf --supervised systemd
ExecStop=/bin/kill -s QUIT $MAINPID
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target

Starting the Sentinel service

chown -R redis:redis /apps/redis
systemctl daemon-reload
systemctl enable --now redis-sentinel

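Sentinel configuration parameters

sentinel monitor mymaster 10.0.0.7 6379 2 # address and port of the master in the mymaster group

The trailing 2 is the quorum: the number of Sentinels that must agree the master is down before it is marked objectively down (ODOWN) and a failover can start. It is normally set to a majority of all Sentinel nodes, whose total is usually an odd number >= 3 (3, 5, 7, ...). With 3 Sentinels, for example, 3/2 = 1.5, rounded up to 2.

sentinel auth-pass mymaster 123456 # password of the master in the mymaster group; this line must come after the sentinel monitor line

sentinel down-after-milliseconds mymaster 30000 # time in milliseconds without a valid reply after which a node in the mymaster group is considered subjectively down (SDOWN); 3000 is suggested

sentinel parallel-syncs mymaster 1 # number of slaves that resynchronize with the new master at the same time after a failover; a smaller number lengthens the total sync time but lightens the load on the new master

sentinel failover-timeout mymaster 180000 # timeout in milliseconds for repointing all slaves to the new master

sentinel deny-scripts-reconfig yes # forbid changing the script settings at runtime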
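The majority rule for the quorum reduces to integer arithmetic. The helper name `majority_quorum` below is hypothetical; it only restates the "more than half" rule described above.

```shell
#!/bin/bash
# usage: majority_quorum <number_of_sentinels>
# Integer division rounds down, so n/2 + 1 is the smallest strict majority.
majority_quorum() {
    echo $(( $1 / 2 + 1 ))
}

for n in 3 5 7; do
    echo "sentinels=$n quorum=$(majority_quorum "$n")"
done
```

For the three Sentinels in this setup the result is 2, which matches the last argument of the `sentinel monitor` line.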

[root@master etc]# ss -ntl
State       Recv-Q Send-Q                                  Local Address:Port                                                 Peer Address:Port            
LISTEN      0      100                                         127.0.0.1:25                                                              *:*                
LISTEN      0      511                                                 *:26379                                                           *:*                
LISTEN      0      511                                                 *:6379                                                            *:*                
LISTEN      0      128                                                 *:111                                                             *:*                
LISTEN      0      128                                                 *:22                                                              *:*                
LISTEN      0      100                                             [::1]:25                                                           [::]:*                
LISTEN      0      128                                              [::]:111                                                          [::]:*                
LISTEN      0      128                                              [::]:22  

Simulating a failover

[root@master etc]# systemctl stop redis
[root@master etc]# ss -ntl
State       Recv-Q Send-Q                                  Local Address:Port                                                 Peer Address:Port      
LISTEN      0      100                                         127.0.0.1:25                                                              *:*          
LISTEN      0      511                                                 *:26379                                                           *:*          
LISTEN      0      128                                                 *:111                                                             *:*          
LISTEN      0      128                                                 *:22                                                              *:*          
LISTEN      0      100                                             [::1]:25                                                           [::]:*          
LISTEN      0      128                                              [::]:111                                                          [::]:*          
LISTEN      0      128                                              [::]:22 
[root@master redis]# tail -f /apps/redis/log/sentinel_26379.log 
1491:X 11 Jul 2022 17:07:16.959 # +sdown master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.044 # +odown master mymaster 10.0.0.7 6379 #quorum 2/2
1491:X 11 Jul 2022 17:07:17.044 # +new-epoch 4
1491:X 11 Jul 2022 17:07:17.044 # +try-failover master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.045 # +vote-for-leader c663d4b9db845d721cd6dccf608c7904d896b745 4
1491:X 11 Jul 2022 17:07:17.048 # 5d3a6880bd134e211c77bef6bc408ab63a1fd3ac voted for c663d4b9db845d721cd6dccf608c7904d896b745 4
1491:X 11 Jul 2022 17:07:17.050 # 66f276f274802c6f0243007a2be4b04001b9867e voted for c663d4b9db845d721cd6dccf608c7904d896b745 4
1491:X 11 Jul 2022 17:07:17.102 # +elected-leader master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.102 # +failover-state-select-slave master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.205 # +selected-slave slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.205 * +failover-state-send-slaveof-noone slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:17.269 * +failover-state-wait-promotion slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:18.078 # +promoted-slave slave 10.0.0.27:6379 10.0.0.27 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:18.078 # +failover-state-reconf-slaves master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:18.145 * +slave-reconf-sent slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:19.144 * +slave-reconf-inprog slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:19.144 * +slave-reconf-done slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:19.228 # -odown master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:19.228 # +failover-end master mymaster 10.0.0.7 6379
1491:X 11 Jul 2022 17:07:19.228 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379        # the master role has moved to 10.0.0.27
1491:X 11 Jul 2022 17:07:19.229 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.27 6379
1491:X 11 Jul 2022 17:07:19.229 * +slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
1491:X 11 Jul 2022 17:07:22.276 # +sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379

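Sentinel log event reference

+reset-master: the master was reset.
+slave: a new slave was detected by Sentinel and attached.
+failover-state-reconf-slaves: the failover state changed to reconf-slaves.
+failover-detected: another Sentinel started a failover, or a slave turned into a master.
+slave-reconf-sent: the leader Sentinel sent the SLAVEOF command to the instance to point it at the new master.
+slave-reconf-inprog: the instance is being reconfigured as a slave of the given master, but synchronization has not finished yet.
+slave-reconf-done: the slave has finished synchronizing with the new master.
-dup-sentinel: one or more Sentinels monitoring the given master were removed as duplicates. This happens, for example, when a Sentinel instance restarts.
+sentinel: a new Sentinel monitoring the given master was detected and attached.
+sdown: the given instance is now subjectively down.
-sdown: the given instance is no longer subjectively down.
+odown: the given instance is now objectively down.
-odown: the given instance is no longer objectively down.
+new-epoch: the current epoch was updated.
+try-failover: a new failover is in progress, waiting to be elected by the majority of Sentinels.
+elected-leader: this Sentinel won the election for the given epoch and can carry out the failover.
+failover-state-select-slave: the failover is in the select-slave state; Sentinel is looking for a slave suitable for promotion.
no-good-slave: Sentinel found no slave suitable for promotion. It will retry after a while, or give up the failover altogether.
+selected-slave: Sentinel found a slave suitable for promotion.
+failover-state-send-slaveof-noone: Sentinel is promoting the selected slave to master and waiting for the promotion to complete.
+failover-end-for-timeout: the failover was terminated by a timeout, but the slaves will eventually be configured to replicate from the new master anyway.
+failover-end: the failover completed successfully. All slaves now replicate from the new master.
+switch-master: the configuration changed and the master's IP address and port are now different. This is the event most external users care about.
+tilt: entered tilt mode.
-tilt: exited tilt mode.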
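Since +switch-master is the event most consumers care about, here is a small sketch that extracts the new master address from a Sentinel log. The function name `new_master_from_log` is hypothetical, and the field positions assume the log line layout shown in the failover output earlier in this article.

```shell
#!/bin/bash
# usage: new_master_from_log < /apps/redis/log/sentinel_26379.log
# Prints "<ip> <port>" of the master named in the last +switch-master event.
# Assumed field layout:
#   pid:role day month year time # +switch-master name old-ip old-port new-ip new-port
new_master_from_log() {
    awk '$7 == "+switch-master" { ip = $11; port = $12 }
         END { if (ip != "") print ip, port }'
}

# Sample line copied from the failover log in this article.
sample='1491:X 11 Jul 2022 17:07:19.228 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379'
printf '%s\n' "$sample" | new_master_from_log    # prints "10.0.0.27 6379"
```

Parsing the log is mainly useful for audits. A live client would normally ask a Sentinel directly, for example with `redis-cli -p 26379 sentinel get-master-addr-by-name mymaster`.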

IV. How Redis Cluster works

Redis Cluster features

Redis Cluster architecture

V. Deploying Redis Cluster on Redis 5

Official documentation: https://redis.io/topics/cluster-tutorial

Prerequisites for creating a Redis Cluster

Deploying Redis Cluster

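1. Install Redis

Edit the Redis configuration

[root@node1 etc]# cat redis.conf 
...
bind 0.0.0.0
masterauth 123456   # recommended; without it master/slave replication fails later and has to be configured again
requirepass 123456
cluster-enabled yes # uncomment this line; required to enable cluster mode, after which the redis process shows [cluster]
cluster-config-file nodes-6379.conf # uncomment this line; cluster state file that records master/slave relationships and slot ranges, created and maintained by Redis Cluster itself
cluster-require-full-coverage no   # default is yes; set to no so that one unavailable node does not make the whole cluster unavailable
...

[root@node1 etc]# systemctl enable --now redis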
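The same edits can be scripted across all six nodes. The sketch below applies them with sed to a throwaway copy of a few redis.conf lines so it can be run anywhere. The sample input is an assumption about how the relevant lines look in the stock file (cluster directives commented out); to use it for real, point `conf` at /apps/redis/etc/redis.conf instead.

```shell
#!/bin/bash
# Work on a temporary sample file instead of the live configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
bind 127.0.0.1
# cluster-enabled yes
# cluster-config-file nodes-6379.conf
# cluster-require-full-coverage yes
EOF

# Uncomment the cluster directives and relax the full-coverage requirement.
sed -i.bak \
    -e 's/^bind 127\.0\.0\.1.*/bind 0.0.0.0/' \
    -e 's/^# *cluster-enabled yes/cluster-enabled yes/' \
    -e 's/^# *cluster-config-file nodes-6379\.conf/cluster-config-file nodes-6379.conf/' \
    -e 's/^# *cluster-require-full-coverage yes/cluster-require-full-coverage no/' \
    "$conf"

# Password used throughout this article.
printf '%s\n' 'masterauth 123456' 'requirepass 123456' >> "$conf"

cat "$conf"
```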

2. Check the current Redis state

# Check listening ports
[root@node1 ~]# ss -ntl
State      Recv-Q Send-Q                Local Address:Port                               Peer Address:Port        
LISTEN     0      511                               *:6379                                          *:*            
LISTEN     0      128                               *:111                                           *:*            
LISTEN     0      128                               *:22                                            *:*            
LISTEN     0      100                       127.0.0.1:25                                            *:*            
LISTEN     0      511                               *:16379                                         *:*            
LISTEN     0      128                            [::]:111                                        [::]:*            
LISTEN     0      128                            [::]:22                                         [::]:*            
LISTEN     0      100                           [::1]:25                                         [::]:*

# The process is shown with the [cluster] flag
[root@node1 ~]# ps aux|grep redis
redis     24754  0.2  0.3 153996  3172 ?        Ssl  21:28   0:02 /apps/redis/bin/redis-server 0.0.0.0:6379 [cluster]
root      24822  0.0  0.0 112812   980 pts/0    R+   21:44   0:00 grep --color=auto redis

3. Create the cluster

[root@node1 ~]# redis-cli -a 123456 --cluster create 10.0.0.7:6379 10.0.0.17:6379 10.0.0.27:6379 10.0.0.37:6379 \
10.0.0.47:6379 10.0.0.57:6379 --cluster-replicas 1
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.0.0.47:6379 to 10.0.0.7:6379
Adding replica 10.0.0.57:6379 to 10.0.0.17:6379
Adding replica 10.0.0.37:6379 to 10.0.0.27:6379
M: 4ccee0bb38763061cf567995bcdd9289cea9cfec 10.0.0.7:6379	# M marks a master
   slots:[0-5460] (5461 slots) master				# first and last slot of this master
M: 12fdc235442ed40a838e77b246025799b4b3357b 10.0.0.17:6379
   slots:[5461-10922] (5462 slots) master
M: 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 10.0.0.27:6379
   slots:[10923-16383] (5461 slots) master
S: 59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379	# S marks a slave
   replicates 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7
S: 15e2e2eccefd453f1a154fc42c6a9b030acacfb2 10.0.0.47:6379
   replicates 4ccee0bb38763061cf567995bcdd9289cea9cfec
S: 8c3b8146ce75ab277958937d4e79e893a15c50e2 10.0.0.57:6379
   replicates 12fdc235442ed40a838e77b246025799b4b3357b
Can I set the above configuration? (type 'yes' to accept): yes	# type yes to create the cluster
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 10.0.0.7:6379)
M: 4ccee0bb38763061cf567995bcdd9289cea9cfec 10.0.0.7:6379
   slots:[0-5460] (5461 slots) master				# slots assigned to this master
   1 additional replica(s)					# one slave attached
S: 59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379
   slots: (0 slots) slave					# slaves hold no slots
   replicates 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7		# ID of its master, 10.0.0.27
M: 12fdc235442ed40a838e77b246025799b4b3357b 10.0.0.17:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 8c3b8146ce75ab277958937d4e79e893a15c50e2 10.0.0.57:6379
   slots: (0 slots) slave
   replicates 12fdc235442ed40a838e77b246025799b4b3357b		# ID of its master, 10.0.0.17
M: 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 10.0.0.27:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 15e2e2eccefd453f1a154fc42c6a9b030acacfb2 10.0.0.47:6379
   slots: (0 slots) slave
   replicates 4ccee0bb38763061cf567995bcdd9289cea9cfec		# ID of its master, 10.0.0.7
[OK] All nodes agree about slots configuration.		# all nodes agree on the slot assignment
>>> Check for open slots...				# check for open slots
>>> Check slots coverage...				# check slot coverage
[OK] All 16384 slots covered.				 # all 16384 slots are assigned
[root@node1 ~]# 


The output above shows three master/slave pairs:

master:10.0.0.7-->slave:10.0.0.47
master:10.0.0.17-->slave:10.0.0.57
master:10.0.0.27-->slave:10.0.0.37
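The slot ranges in the output follow from splitting 16384 slots as evenly as possible across the masters. The sketch below reproduces the three ranges with simple rounding. It is modeled on the printed result rather than taken from the redis-cli source, and the function name `slot_ranges` is made up.

```shell
#!/bin/bash
# usage: slot_ranges <number_of_masters>
slot_ranges() {
    awk -v n="$1" 'BEGIN {
        total = 16384
        per_node = total / n
        cursor = 0
        for (i = 0; i < n; i++) {
            # Round the fractional boundaries to the nearest slot.
            first = int(cursor + 0.5)
            last = int(cursor + per_node - 1 + 0.5)
            # The last master always ends at the final slot.
            if (i == n - 1) last = total - 1
            printf "Master[%d] -> Slots %d - %d\n", i, first, last
            cursor += per_node
        }
    }'
}

slot_ranges 3
```

For three masters this prints the same `0 - 5460`, `5461 - 10922`, and `10923 - 16383` ranges as the cluster creation output, which also explains why the middle master holds 5462 slots instead of 5461.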

4. Check replication status

node1(10.0.0.7)

[root@node1 ~]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.47,port=6379,state=online,offset=1008,lag=1
master_replid:3493f56b2f698cea41c90cb0a41e1562b5821636
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

node2(10.0.0.17)

[root@node2 etc]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.57,port=6379,state=online,offset=1008,lag=0
master_replid:269568d06cb92748f583d6ea900e7563b1739f54
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

node3(10.0.0.27)

[root@node3 ~]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.37,port=6379,state=online,offset=1008,lag=0
master_replid:826e716b92aa4e287013a33f9786e529be2fff71
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

node4(10.0.0.37)

[root@node4 ~]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:10.0.0.27
master_port:6379
master_link_status:up
master_last_io_seconds_ago:6
master_sync_in_progress:0
slave_repl_offset:1008
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:826e716b92aa4e287013a33f9786e529be2fff71
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

node5(10.0.0.47)

[root@node5 ~]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:10.0.0.7
master_port:6379
master_link_status:up
master_last_io_seconds_ago:4
master_sync_in_progress:0
slave_repl_offset:1008
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:3493f56b2f698cea41c90cb0a41e1562b5821636
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

node6(10.0.0.57)

[root@node6 ~]# redis-cli -a 123456 -c info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:10.0.0.17
master_port:6379
master_link_status:up
master_last_io_seconds_ago:10
master_sync_in_progress:0
slave_repl_offset:1008
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:269568d06cb92748f583d6ea900e7563b1739f54
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1008
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:1008

Listing the slaves of a given master

# List all nodes
[root@node1 ~]# redis-cli -a 123456 cluster nodes 2>/dev/null
59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379@16379 slave 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 0 1657554345797 4 connected
4ccee0bb38763061cf567995bcdd9289cea9cfec 10.0.0.7:6379@16379 myself,master - 0 1657554345000 1 connected 0-5460
12fdc235442ed40a838e77b246025799b4b3357b 10.0.0.17:6379@16379 master - 0 1657554343746 2 connected 5461-10922
8c3b8146ce75ab277958937d4e79e893a15c50e2 10.0.0.57:6379@16379 slave 12fdc235442ed40a838e77b246025799b4b3357b 0 1657554344770 6 connected
16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 10.0.0.27:6379@16379 master - 0 1657554344000 3 connected 10923-16383
15e2e2eccefd453f1a154fc42c6a9b030acacfb2 10.0.0.47:6379@16379 slave 4ccee0bb38763061cf567995bcdd9289cea9cfec 0 1657554344000 5 connected

# List the slaves of a master by its node ID; 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 is the ID of master 10.0.0.27
[root@node1 ~]# redis-cli -a 123456 cluster slaves 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 2>/dev/null
1) "59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379@16379 slave 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 0 1657554778157 4 connected"

5. Verify the cluster state

[root@node1 ~]# redis-cli -a 123456 cluster info
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6		# 6 nodes
cluster_size:3			# 3 master groups
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:3639
cluster_stats_messages_pong_sent:3625
cluster_stats_messages_sent:7264
cluster_stats_messages_ping_received:3620
cluster_stats_messages_pong_received:3639
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:7264

# Query the cluster state through any node
[root@node1 ~]# redis-cli -a 123456 --cluster info 10.0.0.27:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.27:6379 (16bb6630...) -> 0 keys | 5461 slots | 1 slaves.
10.0.0.17:6379 (12fdc235...) -> 0 keys | 5462 slots | 1 slaves.
10.0.0.7:6379 (4ccee0bb...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.

Viewing the node relationships in the cluster

# List all nodes in the cluster
[root@node1 ~]# redis-cli -a 123456 cluster nodes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379@16379 slave 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 0 1657556036000 4 connected
4ccee0bb38763061cf567995bcdd9289cea9cfec 10.0.0.7:6379@16379 myself,master - 0 1657556036000 1 connected 0-5460
12fdc235442ed40a838e77b246025799b4b3357b 10.0.0.17:6379@16379 master - 0 1657556036033 2 connected 5461-10922
8c3b8146ce75ab277958937d4e79e893a15c50e2 10.0.0.57:6379@16379 slave 12fdc235442ed40a838e77b246025799b4b3357b 0 1657556038079 6 connected
16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 10.0.0.27:6379@16379 master - 0 1657556037057 3 connected 10923-16383
15e2e2eccefd453f1a154fc42c6a9b030acacfb2 10.0.0.47:6379@16379 slave 4ccee0bb38763061cf567995bcdd9289cea9cfec 0 1657556036000 5 connected


[root@node1 ~]# redis-cli -a 123456 --cluster check 10.0.0.27:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.27:6379 (16bb6630...) -> 0 keys | 5461 slots | 1 slaves.
10.0.0.17:6379 (12fdc235...) -> 0 keys | 5462 slots | 1 slaves.
10.0.0.7:6379 (4ccee0bb...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.27:6379)
M: 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7 10.0.0.27:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 59eac16e6e2992cdfffe97934d7409afe21d2a9a 10.0.0.37:6379
   slots: (0 slots) slave
   replicates 16bb6630a6a09bd4b24d7a203ecaa38b9a4360a7
S: 8c3b8146ce75ab277958937d4e79e893a15c50e2 10.0.0.57:6379
   slots: (0 slots) slave
   replicates 12fdc235442ed40a838e77b246025799b4b3357b
M: 12fdc235442ed40a838e77b246025799b4b3357b 10.0.0.17:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 4ccee0bb38763061cf567995bcdd9289cea9cfec 10.0.0.7:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 15e2e2eccefd453f1a154fc42c6a9b030acacfb2 10.0.0.47:6379
   slots: (0 slots) slave
   replicates 4ccee0bb38763061cf567995bcdd9289cea9cfec
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Verifying writes to the cluster

# When connected to a single node, a write fails if the key's slot is not served by that node
[root@shichu ~]# redis-cli -a 123456 -h 10.0.0.7
10.0.0.7:6379> set key1 v1
(error) MOVED 9189 10.0.0.17:6379
# The write succeeds only when connected to the node that owns the slot
[root@shichu ~]# redis-cli -a 123456 -h 10.0.0.17
10.0.0.17:6379> set key1 values1
OK
10.0.0.17:6379> get key1
"values1"


# With the -c option redis-cli connects in cluster mode and follows redirections, so any node in the cluster will do
[root@shichu ~]# redis-cli -a 123456 -h 10.0.0.7 -c
10.0.0.7:6379> set key1 v1
-> Redirected to slot [9189] located at 10.0.0.17:6379
OK
10.0.0.17:6379> get key1
"v1"

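The slot number in the MOVED reply comes from hashing the key: Redis Cluster maps a key to CRC16(key) mod 16384, using the XMODEM variant of CRC16. The sketch below recomputes slot 9189 for key1. The function names are hypothetical, and hash tags (the `{...}` part of a key) are deliberately ignored to keep the example short.

```shell
#!/bin/bash
# CRC16/XMODEM: polynomial 0x1021, initial value 0, processed bit by bit.
crc16_xmodem() {
    crc=0
    # od turns the key into a list of unsigned byte values.
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$(( crc ^ (byte << 8) ))
        i=0
        while [ "$i" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            i=$(( i + 1 ))
        done
    done
    echo "$crc"
}

# usage: keyslot <key>   (hash tags are not handled in this sketch)
keyslot() {
    echo $(( $(crc16_xmodem "$1") % 16384 ))
}

keyslot key1    # prints 9189, the slot reported by MOVED above
```

Slot 9189 falls in the 5461-10922 range owned by 10.0.0.17, which is why the write was redirected there. On a live cluster the same value is returned by `cluster keyslot key1`.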
VI. Deploying Zabbix monitoring

Official download page: https://www.zabbix.com/cn/download

Official documentation: https://www.zabbix.com/manuals

https://cdn.zabbix.com/zabbix/sources/stable/5.0/zabbix-5.0.25.tar.gz

Compiling and installing Zabbix 5 on an LNMP stack

L: Linux (CentOS 7) https://mirrors.aliyun.com/centos/7/isos/x86_64/
N: Nginx (1.18.0) https://nginx.org/en/download.html
M: MySQL (8.0.19) https://dev.mysql.com/downloads/mysql/
P: PHP (7.4.11) http://php.net/downloads.php
Zabbix (5.0.25) https://cdn.zabbix.com/zabbix/sources/

graph LR
A[Client]
B[Linux</br>Nginx</br>PHP</br>Zabbix</br>10.0.0.100]
C[Linux</br>MySQL</br>10.0.0.200]
A--->B--->C

1. Install MySQL

See: Binary installation of MySQL 8.0 on CentOS 7

After installation, create the zabbix database and user

mysql -uroot -p123456 -e "create database zabbix character set utf8 collate utf8_bin;"
mysql -uroot -p123456 -e "create user zabbix@'10.0.0.%' identified by '123456'"
mysql -uroot -p123456 -e "grant all privileges on zabbix.* to zabbix@'10.0.0.%'"
mysql -uroot -p123456 -e "use mysql;\
alter user zabbix@'10.0.0.%' identified with mysql_native_password by '123456';\
flush privileges;"

2. Install Nginx

See: Compiling and installing Nginx 1.18 on CentOS 7[^1]

3. Install PHP

See: Compiling and installing PHP 7.4 on CentOS 7[^2]

4. Install Zabbix

Install zabbix_server

#!/bin/bash
# Compile and install Zabbix

source /etc/init.d/functions
# Zabbix version
Zabbix_Version=zabbix-5.0.25
Suffix=tar.gz
Zabbix=${Zabbix_Version}.${Suffix}

Password=123456

# Zabbix source download URL
Zabbix_url=https://cdn.zabbix.com/zabbix/sources/stable/5.0/zabbix-5.0.25.tar.gz

# Zabbix install path
Zabbix_install_DIR=/apps/zabbix

# Number of CPUs
CPUS=`lscpu|grep "^CPU(s)"|awk '{print $2}'`
# OS type
os_type=`grep "^NAME" /etc/os-release |awk -F'"| ' '{print $2}'`
# OS version
os_version=`awk -F'"' '/^VERSION_ID/{print $2}' /etc/os-release`

color () {
if [[ $2 -eq 0 ]];then
    echo -e "\e[1;32m$1\t\t\t\t\t\t[  OK  ]\e[0;m"
else
    echo $2
    echo -e "\e[1;31m$1\t\t\t\t\t\t[ FAILED ]\e[0;m"
fi
}


install_Zabbix (){
#----------------------------Download the source tarball-----------------------------
cd /opt
if [ -e ${Zabbix} ];then
	color "Zabbix source tarball already exists" 0
else
	color "Downloading the Zabbix source tarball" 0
	wget ${Zabbix_url}
	if [ $? -ne 0 ];then
		color "Failed to download the Zabbix source tarball, exiting!" 1
		exit 1
	fi
fi


#----------------------------Unpack the source tarball-----------------------------
color "Unpacking the source tarball" 0
tar -zxvf /opt/${Zabbix} -C /usr/local/src
ln -s /usr/local/src/${Zabbix_Version} /usr/local/src/zabbix


#----------------------------Install dependencies-------------------------------- 
color "Installing dependencies" 0

#wget https://dev.mysql.com/get/mysql80-community-release-el7-6.noarch.rpm

yum install -y gcc libxml2-devel net-snmp net-snmp-devel curl curl-devel php-gd php-bcmath php-xml \
php-mbstring mariadb mariadb-devel OpenIPMI-devel libevent-devel java-1.8.0-openjdk-devel \
|| { color "Failed to install dependencies, please check the network" 1 ;exit 1;}


#---------------------------Create the zabbix user---------------------------
if id zabbix &> /dev/null ;then
	color "zabbix user already exists" 0
else
	groupadd --system zabbix
	useradd --system -g zabbix -d /usr/lib/zabbix -s /sbin/nologin -c "Zabbix Monitoring System" zabbix
	color "zabbix user created" 0
fi

#---------------------------Compile---------------------------
color "Compiling Zabbix" 0
cd /usr/local/src/zabbix
./configure --prefix=${Zabbix_install_DIR} \
--enable-server \
--enable-agent \
--with-mysql \
--with-net-snmp \
--with-libcurl \
--with-libxml2 \
--with-openipmi \
--enable-proxy \
--enable-java

make -j ${CPUS} install
if [ $? -ne 0 ];then
	color "Zabbix 编译安装失败!" 1
	exit 1
else
	color "Zabbix编译安装成功" 0
fi

#复制web界面相关文件
mkdir -pv /home/nginx/zabbix
cp -rf /usr/local/src/zabbix/ui/* /home/nginx/zabbix/
chown nginx:nginx -R /home/nginx/zabbix

/apps/zabbix/sbin/zabbix_server -c /apps/zabbix/etc/zabbix_server.conf
if [ $? -eq 0 ];then
	color "zabbix_server测试能正常启动" 0
	pkill zabbix
fi

color "zabbix安装完成" 0
}

install_Zabbix

exit 0

Edit the configuration files

  1. Edit /apps/nginx/conf/nginx.conf

    worker_processes   1;
    pid 		logs/nginx.pid;
    events {
    		worker_connections	1024;
    }
    http {
    	include			mime.types;
    	default_type	application/octet-stream;
    	sendfile		on;
    	keepalive_timeout	65;
    	server {
    		listen		80;
    		server_name	10.0.0.100;				# server name
    		server_tokens off;						# hide the nginx version
    
    		location / {
    			root	/home/nginx/zabbix;				# document root
    			index	index.php index.html index.htm;			# default index pages
    		}
    
    		error_page	500 502 503 504 /50x.html;
    
    		location = /50x.html {
    			root	html;
    		}
    
    		location ~ \.php$ {						# pass PHP requests to php-fpm
    			root		/home/nginx/zabbix;
    			fastcgi_pass	127.0.0.1:9000;
    			fastcgi_index	index.php;
    			fastcgi_param	SCRIPT_FILENAME	$document_root$fastcgi_script_name;
    			include		fastcgi_params;
    			fastcgi_hide_header X-Powered-By;			# hide the PHP version
    		}
    
    		location ~ ^/(ping|pm_status)$ {				# php-fpm status page
    			include		fastcgi_params;
    			fastcgi_pass	127.0.0.1:9000;
    			fastcgi_param	PATH_TRANSLATED	$document_root$fastcgi_script_name;
    		}
    	}
    }
    
  2. Edit the PHP configuration files

    # Edit /etc/php.ini
    sed -i -e "/memory_limit/c memory_limit = 256M" \
    -e "/post_max_size/c post_max_size = 30M" \
    -e "/upload_max_filesize/c upload_max_filesize = 20M" \
    -e "/max_execution_time/c max_execution_time = 300" \
    -e "/max_input_time/c max_input_time = 300" \
    -e "/;date.timezone/c date.timezone = Asia/Shanghai" \
    /etc/php.ini
    
    # Edit /apps/php/etc/php-fpm.d/www.conf
    sed -i -e "/user = www/c user = nginx" \
    -e "/group = www/c group = nginx" /apps/php/etc/php-fpm.d/www.conf
    

    Restart the services

    systemctl restart nginx php-fpm
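
    To confirm that the sed edits above actually landed in php.ini, a small check along these lines may help. This is only a sketch: `ini_value` is a hypothetical helper name, and it simply reports the last uncommented assignment of a directive in the file.

```shell
#!/bin/bash
# Sketch: print the value of a php.ini directive as it appears in the file.
# ini_value is a hypothetical helper, not part of PHP or Zabbix.
ini_value() {
    local key=$1 file=$2
    awk -F'=' -v k="$key" '
        /^[ \t]*;/ { next }                  # skip commented-out lines
        {
            name = $1
            gsub(/[ \t]/, "", name)          # directive name without padding
            if (name == k) {
                val = $0
                sub(/^[^=]*=[ \t]*/, "", val)
                sub(/[ \t]+$/, "", val)
                found = val                  # the last assignment wins
            }
        }
        END { if (found != "") print found }
    ' "$file"
}
```

    For example, `ini_value memory_limit /etc/php.ini` should print `256M` after the edits, and `ini_value date.timezone /etc/php.ini` should print `Asia/Shanghai`.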
    
  3. Import the MySQL data

    mysql -uzabbix -p123456 -h10.0.0.200 zabbix < /usr/local/src/zabbix/database/mysql/schema.sql
    mysql -uzabbix -p123456 -h10.0.0.200 zabbix < /usr/local/src/zabbix/database/mysql/images.sql
    mysql -uzabbix -p123456 -h10.0.0.200 zabbix < /usr/local/src/zabbix/database/mysql/data.sql
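
    The three files have to be loaded in the order shown (schema, then images, then data). A minimal sketch that makes the order explicit and refuses to start if a file is missing; `zabbix_sql_files` is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: list the Zabbix SQL files in the order they must be imported.
# Fails before printing anything if one of the files is missing.
zabbix_sql_files() {
    local dir=$1 f
    for f in schema.sql images.sql data.sql; do
        [ -r "$dir/$f" ] || { echo "missing: $dir/$f" >&2; return 1; }
    done
    for f in schema.sql images.sql data.sql; do
        printf '%s\n' "$dir/$f"
    done
}

# Usage, with the connection settings from the commands above:
# zabbix_sql_files /usr/local/src/zabbix/database/mysql | while read -r f; do
#     mysql -uzabbix -p123456 -h10.0.0.200 zabbix < "$f" || break
# done
```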
    
  4. Edit the Zabbix configuration file

    sed -i "/# DBHost=localhost/aDBHost=10.0.0.200" /apps/zabbix/etc/zabbix_server.conf
    sed -i "/# DBPassword=/aDBPassword=123456" /apps/zabbix/etc/zabbix_server.conf
    sed -i "/# DBPort=/aDBPort=3306" /apps/zabbix/etc/zabbix_server.conf
    sed -i "/StatsAllowedIP=127.0.0.1/c #StatsAllowedIP=127.0.0.1" /apps/zabbix/etc/zabbix_server.conf
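
    The `sed ... a` commands above append a new line every time they are run. If the step may be repeated, an idempotent variant could look like the sketch below. `set_conf` is a hypothetical helper name, and it assumes the value contains neither `|` nor `&`.

```shell
#!/bin/bash
# Sketch: set key=value in a Zabbix-style config file without duplicating
# the line on repeated runs. Assumes GNU sed (as on CentOS 7).
set_conf() {
    local key=$1 value=$2 file=$3
    if grep -q "^${key}=" "$file"; then
        # An active assignment exists: rewrite it in place.
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # No active assignment yet: append one at the end of the file.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage:
# set_conf DBHost 10.0.0.200 /apps/zabbix/etc/zabbix_server.conf
# set_conf DBPort 3306       /apps/zabbix/etc/zabbix_server.conf
```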
    
  5. Create the systemd unit for zabbix_server

    cat /lib/systemd/system/zabbix-server.service

    [Unit]
    Description=Zabbix Server
    After=syslog.target
    After=network.target
    
    [Service]
    Environment="CONFFILE=/apps/zabbix/etc/zabbix_server.conf"
    EnvironmentFile=-/etc/default/zabbix-server
    Type=forking
    Restart=on-failure
    PIDFile=/tmp/zabbix_server.pid
    KillMode=control-group
    ExecStart=/apps/zabbix/sbin/zabbix_server -c $CONFFILE
    ExecStop=/bin/kill -SIGTERM $MAINPID
    RestartSec=10s
    TimeoutStopSec=5
    
    [Install]
    WantedBy=multi-user.target
    

    Start the service

    systemctl daemon-reload
    systemctl enable --now zabbix-server
    
  6. Create the systemd unit for zabbix_agent

    cat /lib/systemd/system/zabbix-agent.service

    [Unit]
    Description=Zabbix Agent
    After=syslog.target
    After=network.target
    
    [Service]
    Environment="CONFFILE=/apps/zabbix/etc/zabbix_agentd.conf"
    EnvironmentFile=-/etc/default/zabbix-agent
    Type=forking
    Restart=on-failure
    PIDFile=/tmp/zabbix_agentd.pid
    KillMode=control-group
    ExecStart=/apps/zabbix/sbin/zabbix_agentd -c $CONFFILE
    ExecStop=/bin/kill -SIGTERM $MAINPID
    RestartSec=10s
    User=zabbix
    Group=zabbix
    
    [Install]
    WantedBy=multi-user.target
    

    Start the service

    systemctl daemon-reload
    systemctl enable --now zabbix-agent
    
  7. Check the status

    • Ports 10050 and 10051 are listening
    # Ports 10050 (agent) and 10051 (server) should be visible
    [root@shichu apps]# ss -ntl
    State      Recv-Q Send-Q               Local Address:Port                              Peer Address:Port          
    LISTEN     0      128                              *:22                                           *:*              
    LISTEN     0      100                      127.0.0.1:25                                           *:*              
    LISTEN     0      128                              *:10050                                        *:*              
    LISTEN     0      128                              *:10051                                        *:*              
    LISTEN     0      128                      127.0.0.1:9000                                         *:*              
    LISTEN     0      128                              *:111                                          *:*              
    LISTEN     0      128                              *:80                                           *:*              
    LISTEN     0      128                           [::]:22                                        [::]:*              
    LISTEN     0      100                          [::1]:25                                        [::]:*              
    LISTEN     0      128                           [::]:111                                       [::]:*
    
    • zabbix-server service status
    [root@shichu apps]# systemctl status zabbix-server
    ● zabbix-server.service - Zabbix Server
       Loaded: loaded (/usr/lib/systemd/system/zabbix-server.service; disabled; vendor preset: disabled)
       Active: active (running) since Thu 2022-07-14 00:47:09 CST; 52s ago
      Process: 8346 ExecStop=/bin/kill -SIGTERM $MAINPID (code=exited, status=0/SUCCESS)
      Process: 8352 ExecStart=/apps/zabbix/sbin/zabbix_server -c $CONFFILE (code=exited, status=0/SUCCESS)
     Main PID: 8360 (zabbix_server)
       CGroup: /system.slice/zabbix-server.service
               ├─8360 /apps/zabbix/sbin/zabbix_server -c /apps/zabbix/etc/zabbix_server.conf
               ├─8362 /apps/zabbix/sbin/zabbix_server: configuration syncer [synced configuration in 0.059399 sec, idle 6...
               ├─8363 /apps/zabbix/sbin/zabbix_server: alert manager #1 [sent 0, failed 0 alerts, idle 5.027609 sec durin...
               ├─8364 /apps/zabbix/sbin/zabbix_server: alerter #1 started
               ├─8365 /apps/zabbix/sbin/zabbix_server: alerter #2 started
               ├─8366 /apps/zabbix/sbin/zabbix_server: alerter #3 started
               ├─8367 /apps/zabbix/sbin/zabbix_server: preprocessing manager #1 [queued 0, processed 11 values, idle 5.00...
               ├─8368 /apps/zabbix/sbin/zabbix_server: preprocessing worker #1 started
               ├─8369 /apps/zabbix/sbin/zabbix_server: preprocessing worker #2 started
               ├─8370 /apps/zabbix/sbin/zabbix_server: preprocessing worker #3 started
               ├─8371 /apps/zabbix/sbin/zabbix_server: lld manager #1 [processed 0 LLD rules, idle 5.008702sec during 5.0...
               ├─8372 /apps/zabbix/sbin/zabbix_server: lld worker #1 started
               ├─8373 /apps/zabbix/sbin/zabbix_server: lld worker #2 started
               ├─8374 /apps/zabbix/sbin/zabbix_server: housekeeper [startup idle for 30 minutes]
               ├─8375 /apps/zabbix/sbin/zabbix_server: timer #1 [updated 0 hosts, suppressed 0 events in 0.001868 sec, id...
               ├─8376 /apps/zabbix/sbin/zabbix_server: http poller #1 [got 0 values in 0.001502 sec, idle 5 sec]
               ├─8377 /apps/zabbix/sbin/zabbix_server: discoverer #1 [processed 0 rules in 0.004759 sec, idle 60 sec]
               ├─8378 /apps/zabbix/sbin/zabbix_server: history syncer #1 [processed 0 values, 0 triggers in 0.000050 sec,...
               ├─8379 /apps/zabbix/sbin/zabbix_server: history syncer #2 [processed 0 values, 0 triggers in 0.000175 sec,...
               ├─8380 /apps/zabbix/sbin/zabbix_server: history syncer #3 [processed 0 values, 0 triggers in 0.000029 sec,...
               ├─8381 /apps/zabbix/sbin/zabbix_server: history syncer #4 [processed 0 values, 0 triggers in 0.000019 sec,...
               ├─8382 /apps/zabbix/sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.004440 sec, idle 3 sec]...
               ├─8383 /apps/zabbix/sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000028 sec, id...
               ├─8384 /apps/zabbix/sbin/zabbix_server: self-monitoring [processed data in 0.000016 sec, idle 1 sec]
               ├─8385 /apps/zabbix/sbin/zabbix_server: task manager [processed 0 task(s) in 0.000836 sec, idle 5 sec]
               ├─8386 /apps/zabbix/sbin/zabbix_server: poller #1 [got 0 values in 0.000050 sec, idle 1 sec]
               ├─8387 /apps/zabbix/sbin/zabbix_server: poller #2 [got 0 values in 0.000048 sec, idle 1 sec]
               ├─8388 /apps/zabbix/sbin/zabbix_server: poller #3 [got 1 values in 0.001602 sec, idle 1 sec]
               ├─8389 /apps/zabbix/sbin/zabbix_server: poller #4 [got 0 values in 0.000019 sec, idle 1 sec]
               ├─8390 /apps/zabbix/sbin/zabbix_server: poller #5 [got 0 values in 0.001402 sec, idle 1 sec]
               ├─8391 /apps/zabbix/sbin/zabbix_server: unreachable poller #1 [got 0 values in 0.000039 sec, idle 5 sec]
               ├─8392 /apps/zabbix/sbin/zabbix_server: trapper #1 [processed data in 0.000000 sec, waiting for connection...
               ├─8393 /apps/zabbix/sbin/zabbix_server: trapper #2 [processed data in 0.000000 sec, waiting for connection...
               ├─8394 /apps/zabbix/sbin/zabbix_server: trapper #3 [processed data in 0.000000 sec, waiting for connection...
               ├─8395 /apps/zabbix/sbin/zabbix_server: trapper #4 [processed data in 0.000000 sec, waiting for connection...
               ├─8396 /apps/zabbix/sbin/zabbix_server: trapper #5 [processed data in 0.000000 sec, waiting for connection...
               ├─8397 /apps/zabbix/sbin/zabbix_server: icmp pinger #1 [got 0 values in 0.000020 sec, idle 5 sec]
               └─8398 /apps/zabbix/sbin/zabbix_server: alert syncer [queued 0 alerts(s), flushed 0 result(s) in 0.001557 ...
    
    Jul 14 00:47:08 shichu systemd[1]: Starting Zabbix Server...
    Jul 14 00:47:09 shichu systemd[1]: Started Zabbix Server.
    
    • zabbix-agent service status

      [root@shichu apps]# systemctl status zabbix-agent
      ● zabbix-agent.service - Zabbix Agent
         Loaded: loaded (/usr/lib/systemd/system/zabbix-agent.service; enabled; vendor preset: disabled)
         Active: active (running) since Thu 2022-07-14 00:47:09 CST; 58s ago
        Process: 8349 ExecStart=/apps/zabbix/sbin/zabbix_agentd -c $CONFFILE (code=exited, status=0/SUCCESS)
       Main PID: 8353 (zabbix_agentd)
         CGroup: /system.slice/zabbix-agent.service
                 ├─8353 /apps/zabbix/sbin/zabbix_agentd -c /apps/zabbix/etc/zabbix_agentd.conf
                 ├─8354 /apps/zabbix/sbin/zabbix_agentd: collector [idle 1 sec]
                 ├─8355 /apps/zabbix/sbin/zabbix_agentd: listener #1 [waiting for connection]
                 ├─8356 /apps/zabbix/sbin/zabbix_agentd: listener #2 [waiting for connection]
                 ├─8357 /apps/zabbix/sbin/zabbix_agentd: listener #3 [waiting for connection]
                 └─8358 /apps/zabbix/sbin/zabbix_agentd: active checks #1 [idle 1 sec]
      
      Jul 14 00:47:08 shichu systemd[1]: Starting Zabbix Agent...
      Jul 14 00:47:09 shichu systemd[1]: Started Zabbix Agent.
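
    The port check above can also be scripted rather than read by eye. A minimal sketch, assuming the `ss -ntl` column layout shown above (local address in the fourth column); `port_listening` is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: succeed if `ss -ntl` output on stdin shows a listener on the port.
port_listening() {
    local port=$1
    awk -v p="$port" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")    # Local Address:Port column
            if (parts[n] == p) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage:
# ss -ntl | port_listening 10051 && echo "zabbix_server is listening"
# ss -ntl | port_listening 10050 && echo "zabbix_agentd is listening"
```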
      

Start

5. Configure the web interface

Initial setup

Open the server IP (10.0.0.100) in a browser

image.png

image.png

image.png

image.png

The configuration file has to be downloaded manually and uploaded to /home/nginx/zabbix/conf/ on the Zabbix server

image.png

image.png

Default username: Admin	# note the capital A
Password: zabbix

image.png

image.png

Tuning

Switch the menus to Chinese

image.png

Chinese display

image.png

Fix garbled characters in monitoring item graphs

image.png

image.png

The font directory is /home/nginx/zabbix/assets/fonts

image.png

vim /home/nginx/zabbix/include/defines.inc.php
# Change the following two definitions
//define('ZBX_GRAPH_FONT_NAME',     'DejaVuSans'); // font file name
define('ZBX_GRAPH_FONT_NAME',       'simkai'); // font file name 


#define('ZBX_FONT_NAME', 'DejaVuSans');
define('ZBX_FONT_NAME', 'simkai');

The font takes effect immediately; neither Zabbix nor Nginx needs to be restarted
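
The same edit to defines.inc.php can be scripted instead of done in vim. The sketch below rewrites both definitions in place rather than commenting out the old ones. `set_graph_font` is a hypothetical helper name, and it assumes the font has already been uploaded to the fonts directory under a matching file name (for example simkai.ttf).

```shell
#!/bin/bash
# Sketch: point ZBX_GRAPH_FONT_NAME and ZBX_FONT_NAME at another font.
# Only active define(...) lines at the start of a line are rewritten.
# Assumes GNU sed (as on CentOS 7).
set_graph_font() {
    local font=$1 file=$2
    sed -i -E "s/^(define\('ZBX_(GRAPH_)?FONT_NAME',[[:space:]]*)'[^']*'/\1'${font}'/" "$file"
}

# Usage:
# set_graph_font simkai /home/nginx/zabbix/include/defines.inc.php
```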

image.png

VII. Monitoring Nginx and MySQL

flowchart TB
    zabbix[Zabbix Server<br/>10.0.0.100]
    mysql-m[Master<br/>10.0.0.17]
    mysql-s[Slave<br/>10.0.0.27]
    nginx[Nginx<br/>10.0.0.7]
    subgraph Mysql
        mysql-m<-->mysql-s
    end
    zabbix--->nginx
    zabbix--->Mysql

1. Install the zabbix agent

2. Monitor Nginx

  1. Prepare the nginx status page
# Add the nginx status configuration
[root@nginx ~]# cat /etc/nginx/nginx.conf
# Add the status page inside the server block
...
        location /nginx_status {
            stub_status;
            allow 10.0.0.0/24;
            allow 127.0.0.1;
            deny all;		# without this, the allow rules restrict nothing
        }
  2. Prepare the nginx monitoring script
[root@nginx etc]# cat /etc/zabbix_agentd.d/nginx_status.sh
#!/bin/bash 

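nginx_status_fun(){			# worker function
	NGINX_PORT=$1			# port: the function's first argument, i.e. the script's second argument
	NGINX_COMMAND=$2 		# metric name: the function's second argument, i.e. the script's third argument
	nginx_active(){			# number of active connections; the status page is fetched from the local host
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| grep 'Active' | awk '{print $NF}'
		}
	nginx_reading(){		# number of connections in the Reading state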
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| grep 'Reading' | awk '{print $2}'
		}
	nginx_writing(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| grep 'Writing' | awk '{print $4}'
		}
	nginx_waiting(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| grep 'Waiting' | awk '{print $6}'
		}
	nginx_accepts(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| awk NR==3 | awk '{print $1}'
		}
	nginx_handled(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| awk NR==3 | awk '{print $2}'
		}
	nginx_requests(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status" 2>/dev/null| awk NR==3 | awk '{print $3}'
		}
  	case $NGINX_COMMAND in
		active)
			nginx_active;
			;;
		reading)
			nginx_reading;
			;;
		writing)
			nginx_writing;
			;;
		waiting)
			nginx_waiting;
			;;
		accepts)
			nginx_accepts;
			;;
		handled)
			nginx_handled;
			;;
		requests)
			nginx_requests;
		esac 
}

main(){							# entry point
	case $1 in
		nginx_status)				# dispatch on the first argument
			nginx_status_fun $2 $3;		# nginx_status: call nginx_status_fun with the second and third arguments
			;;
		status)					# HTTP status code of the status page
			curl -I -s http://127.0.0.1/nginx_status 2>/dev/null | awk 'NR==1{print $2}';
	            	;;				# -I fetches only the response headers, -s suppresses progress output
		*)					# anything else prints usage
			echo "Usage: $0 {nginx_status PORT METRIC|status}"
	esac
}

main $1 $2 $3
  3. Add a custom zabbix agent item (via a drop-in configuration file)

    • Create the drop-in configuration file
    [root@nginx etc]# cat /etc/zabbix_agentd.conf.d/nginx_monitor.conf 
    UserParameter=nginx_status[*],/etc/zabbix_agentd.d/nginx_status.sh "$1" "$2" "$3"
    
  4. Verify

# Restart the services
systemctl restart nginx zabbix-agent

# Fetch the full nginx status locally
[root@nginx zabbix_agentd.d]# curl 127.0.0.1/nginx_status
Active connections: 1 
server accepts handled requests
 21 21 21 
Reading: 0 Writing: 1 Waiting: 0 

# Read the active connection count on the nginx host
[root@nginx zabbix_agentd.d]# /etc/zabbix_agentd.d/nginx_status.sh nginx_status 80 active
1

# Read the active connection count from the Zabbix server
[root@zabbix ~]# /apps/zabbix/bin/zabbix_get -s 10.0.0.7 -p 10050 -k 'nginx_status[nginx_status,80,active]'
1
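
The monitoring script fetches the status page and runs a separate grep/awk pipeline for each metric. As an alternative sketch (not the author's script), a single awk pass over the status output shown above can return any of the seven metrics; `stub_status_metric` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: read stub_status output on stdin and print one metric.
# Metric names match the case branches of the script above.
stub_status_metric() {
    local metric=$1
    awk -v m="$metric" '
        /^Active connections:/ { v["active"] = $3 }
        NR == 3                { v["accepts"] = $1; v["handled"] = $2; v["requests"] = $3 }
        /^Reading:/            { v["reading"] = $2; v["writing"] = $4; v["waiting"] = $6 }
        END {
            if (m in v) print v[m]
            else exit 1              # unknown metric name
        }
    '
}

# Usage:
# curl -s http://127.0.0.1:80/nginx_status | stub_status_metric active
```

Because the page is fetched once, all values come from the same snapshot.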
  5. Import the monitoring template

    Template reference: nginx-template.xml

    image.png

    Link the template

    image.png

    View the items of the imported nginx template

    image.png

  6. Verify monitoring

    image.png

3. Monitor MySQL

1) Set up MySQL master-slave replication

master (10.0.0.17)

# Edit the configuration
vim /etc/my.cnf.d/server.cnf
[mysqld]
bind-address=0.0.0.0
server-id=17
log-bin

# Restart the database
systemctl restart mariadb


# Create the replication user
MariaDB [(none)]> create user 'repluser'@'10.0.0.%';
Query OK, 0 rows affected (0.00 sec)
# Grant replication privileges to the user
MariaDB [(none)]> grant replication slave on *.* to 'repluser'@'10.0.0.%';
Query OK, 0 rows affected (0.00 sec)

# Back up the data (--single-transaction and --lock-tables are mutually exclusive, so --lock-tables is dropped)
[root@mysql-master ~]# mysqldump --all-databases --single-transaction --flush-logs --master-data=2 \
> /opt/backup.sql

# Copy the backup to the slave node
[root@mysql-master ~]# scp /opt/backup.sql 10.0.0.27:/opt/

# Show the binary log files and their sizes
[root@mysql-master ~]# mysql
MariaDB [(none)]> show master logs;
+--------------------+-----------+
| Log_name           | File_size |
+--------------------+-----------+
| mariadb-bin.000001 |     29733 |
| mariadb-bin.000002 |       245 |
+--------------------+-----------+

2 rows in set (0.00 sec)
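
Because the dump was taken with --master-data=2, mysqldump also writes the binlog coordinates at dump time into the dump itself, as a commented CHANGE MASTER TO line. A minimal sketch for reading them back out; `dump_binlog_coords` is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: print "<binlog file> <position>" from the commented
# CHANGE MASTER TO line that --master-data=2 writes into a dump.
dump_binlog_coords() {
    local dump=$1
    grep -m1 'CHANGE MASTER TO' "$dump" |
        sed -E "s/.*MASTER_LOG_FILE='([^']+)', *MASTER_LOG_POS=([0-9]+).*/\1 \2/"
}

# Usage:
# dump_binlog_coords /opt/backup.sql
```

These are the coordinates at which the dump is consistent, so they are a safe starting point for the slave.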

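slave (10.0.0.27)

# Edit the configuration
vim /etc/my.cnf.d/server.cnf
[mysqld]
bind-address=0.0.0.0
server-id=27
read-only

# Restart the database
systemctl restart mariadb

# Import the backup from the master node
[root@mysql-slave ~]# mysql < /opt/backup.sql

# Start replication from the master's binlog coordinates.
# MASTER_LOG_FILE and MASTER_LOG_POS correspond to Log_name and File_size on the master (see show master logs).
# --master-data=2 also records the coordinates at dump time as a commented CHANGE MASTER TO line near the top of backup.sql.
[root@mysql-slave ~]# mysql
MariaDB [(none)]> CHANGE MASTER TO
  MASTER_HOST='10.0.0.17',
  MASTER_USER='repluser',
  MASTER_PASSWORD='',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mariadb-bin.000001',
  MASTER_LOG_POS=29733,
  MASTER_CONNECT_RETRY=10;

# Start the slave
MariaDB [(none)]> start slave;

# Show the replication status (\G already terminates the statement, so no trailing ; is needed)
MariaDB [(none)]> show slave status\G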
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.17
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mariadb-bin.000002
          Read_Master_Log_Pos: 245
               Relay_Log_File: mariadb-relay-bin.000003
                Relay_Log_Pos: 531
        Relay_Master_Log_File: mariadb-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
......
             Master_Server_Id: 17
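
Replication is healthy only when both Slave_IO_Running and Slave_SQL_Running report Yes, as in the output above. A minimal sketch that checks this from a script; `replication_ok` is a hypothetical helper name and it reads the `\G`-formatted status on stdin:

```shell
#!/bin/bash
# Sketch: succeed only if both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        {
            gsub(/^[ \t]+/, "", $1)      # field names are right-aligned
            sub(/[ \t\r]+$/, "", $2)
        }
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit ((io == "Yes" && sql == "Yes") ? 0 : 1) }
    '
}

# Usage on the slave:
# mysql -e 'show slave status\G' | replication_ok && echo "replication running"
```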

2) Monitoring with the Percona plugins

Official download page: https://www.percona.com/downloads/

Package: https://www.percona.com/downloads/percona-monitoring-plugins/LATEST/

  1. Install the Percona plugin
# Download
wget https://downloads.percona.com/downloads/percona-monitoring-plugins/percona-monitoring-plugins-1.1.8/binary/redhat/7/x86_64/percona-zabbix-templates-1.1.8-1.noarch.rpm
# Install
yum install -y percona-zabbix-templates-1.1.8-1.noarch.rpm
# Install PHP
yum install -y php php-mysql

# Copy the agent configuration template
cp /var/lib/zabbix/percona/templates/userparameter_percona_mysql.conf /etc/zabbix_agentd.conf.d/

# Create the PHP configuration file used to connect to MySQL
vim /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php.cnf
<?php
$mysql_user = 'root';
$mysql_pass = ''; 

# Restart the agent
systemctl restart zabbix-agent
  2. Test from zabbix-server
[root@zabbix ~]# /apps/zabbix/bin/zabbix_get -s 10.0.0.17 -p 10050 -k MySQL.Key-reads
19
[root@zabbix ~]# /apps/zabbix/bin/zabbix_get -s 10.0.0.27 -p 10050 -k MySQL.Key-reads
0
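  3. Link the host template

    Note: the default template /var/lib/zabbix/percona/templates/zabbix_agent_template_percona_mysql_server_ht_2.0.9-sver1.1.8.xml cannot be used as-is and must be modified first.

    Template reference: siyuan://blocks/20220715151809-f0mrj0m

image.png

  4. Check the monitoring status

image.png

  5. Switch the item type to active

image.png

  6. Verify monitoring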

    image.png

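4. Problems

1. In active mode the monitoring data is fine, but the ZBX icon stays gray instead of turning green

Fix: in the template Template OS Linux by Zabbix agent active, unlink and clear the linked template Template Module Zabbix agent active, then link the template Template Module Zabbix agent.

image.png

The ZBX icon turns green

image.png

VIII. Zabbix email notifications for failures and recoveries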

1. Automatic fault recovery

1) Enable remote commands on the agent

[root@nginx tmp]# grep '^[a-zA-Z]' /etc/zabbix_agentd.conf
PidFile=/run/zabbix/zabbix_agentd.pid
LogFile=/var/log/zabbix/zabbix_agentd.log
LogFileSize=0
EnableRemoteCommands=1		# enable remote command execution
Server=10.0.0.100
ListenPort=10050
StartAgents=3
ServerActive=10.0.0.100
Hostname=10.0.0.7
User=zabbix
UnsafeUserParameters=1		# allow special characters in the arguments passed to user parameters
Include=/etc/zabbix_agentd.conf.d/*.conf

2) Grant sudo rights to the zabbix user on the agent

[root@nginx ~]# vim /etc/sudoers
......
root    ALL=(ALL)   ALL
zabbix ALL=NOPASSWD:ALL		# let the zabbix user run privileged commands through sudo without a password

Restart the service

systemctl restart zabbix-agent
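
The sudoers rule above lets the zabbix user run any command as root. If the recovery action only needs one command, a narrower rule may be preferable. The sketch below is an assumption, not the author's setup: the remote command configured in the action is only visible in the screenshots, so `/usr/bin/systemctl restart nginx` is a placeholder, and `zabbix_sudo_rule` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: emit a sudoers rule limited to a single command.
# The default command is an assumption; pass the real remote command instead.
zabbix_sudo_rule() {
    local cmd=${1:-/usr/bin/systemctl restart nginx}
    printf 'zabbix ALL=(root) NOPASSWD: %s\n' "$cmd"
}

# Usage (as root); visudo -cf checks the syntax of the drop-in file:
# zabbix_sudo_rule > /etc/sudoers.d/zabbix
# chmod 440 /etc/sudoers.d/zabbix
# visudo -cf /etc/sudoers.d/zabbix
```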

3) Create the action

image.png

image.png

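2. Email notifications

1) Enable SMTP on the mailbox

Log in to the personal mailbox and enable SMTP

image.png

Send an SMS to obtain the authorization code

image.png

2) Create the media type

Mailbox setup reference: https://service.mail.qq.com/cgi-bin/help?subtype=1&&id=28&&no=371

The password is the authorization code obtained above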

image.png

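3) Add the media to a user

Select the Admin user

image.png

Open Media and click Add

image.png

For Type, choose the media type created above; for Send to, enter the recipient address

image.png

Update the media

image.png

4) Create the action

image.png

image.png

Content of the notification sent on failure

image.png

Content of the notification sent on recovery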

image.png

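3. Verify the failure and recovery notifications

1) Stop the nginx service

Check port 80

image.png

nginx recovers automatically

image.png

2) Zabbix runs the recovery command and sends the notification emails automatically

image.png

3) Log in to the mailbox and check the alert emails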

image.png

Source: https://www.cnblogs.com/areke/p/16482870.html