
Advanced Docker 02: Getting Started with Swarm Clusters

Author: Internet

Docker Cluster Overview

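Docker has two clustering approaches:
1. Before Docker Engine 1.12, clustering was provided by what is now called classic Swarm, which was implemented as an API proxy system. It is no longer maintained.
2. Since Docker Engine 1.12, the engine embeds SwarmKit to provide Docker's cluster mode (swarm mode). This mode is a typical manager/worker (master-slave) architecture: each host in the cluster is either a manager node or a worker node.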

The examples below use the current cluster mode (swarm mode).

Cluster hosts:

Hostname       Host IP          Cluster role
ubuntu1804     192.168.20.131   Manager node
ubuntu180402   192.168.20.132   Worker node
ubuntu180403   192.168.20.133   Worker node

Docker Cluster in Practice

Creating the Cluster

Run the following commands on the manager node.

# Initialize a Docker swarm
$ docker swarm init --advertise-addr 192.168.20.131
Swarm initialized: current node (n4kf30mgtukzq2dw0hltgk8t7) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-238k85hnwkj5ywgaliinszqxsird3bsuchtxwj03mzn99jkswk-5sxijlmlo9oab54q5x8b0ow0f 192.168.20.131:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

# Check whether swarm mode is enabled
$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 1
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active # swarm mode is active
  NodeID: n4kf30mgtukzq2dw0hltgk8t7
  Is Manager: true
  ClusterID: neyx9lrs6wy134yhrakyb4p45
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.0.0.0/8  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 192.168.20.131
  Manager Addresses:
   192.168.20.131:2377
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-189-generic
 Operating System: Ubuntu 18.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 1.922GiB
 Name: ubuntu1804
 ID: 37C5:6IDP:3N2E:5WWX:QZRH:NKWQ:N5DO:TQLP:3PIU:5ABU:TH6Y:AWEA
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  http://hub-mirror.c.163.com/
 Live Restore Enabled: false

WARNING: No swap limit support

# List the nodes in the swarm
$ docker node ls
ID                            HOSTNAME     STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804   Ready     Active         Leader           20.10.17
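
Reading the full `docker info` output just to find the `Swarm:` line is tedious in scripts. The sketch below queries that single field through the `--format` option and initializes the swarm only when it is not already active. The function names `swarm_state` and `init_swarm_once` are made up for this illustration, and the block only defines functions, so nothing runs until you call them.

```shell
# Print the local swarm state (for example "active" or "inactive").
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# Initialize a swarm only if this node is not already part of one.
# $1 = address to advertise to the other nodes
init_swarm_once() {
    if [ "$(swarm_state)" = "active" ]; then
        echo "swarm already active"
    else
        docker swarm init --advertise-addr "$1"
    fi
}
```

On the manager node from this article it would be called as `init_swarm_once 192.168.20.131`.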

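Joining the Cluster

The join command itself is run on each worker node.
The exact command for joining as a worker can be obtained on the manager node by running: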

$ docker swarm join-token worker
To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-238k85hnwkj5ywgaliinszqxsird3bsuchtxwj03mzn99jkswk-5sxijlmlo9oab54q5x8b0ow0f 192.168.20.131:2377

Then run the following command on each worker node:

$ docker swarm join --token SWMTKN-1-238k85hnwkj5ywgaliinszqxsird3bsuchtxwj03mzn99jkswk-5sxijlmlo9oab54q5x8b0ow0f 192.168.20.131:2377
This node joined a swarm as a worker.

Back on the manager node, list the cluster nodes again:

$ docker node ls
ID                            HOSTNAME       STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804     Ready     Active         Leader           20.10.17
r1p3cziqc10rsva03b6unq754     ubuntu180402   Ready     Active                          20.10.17
x39xivinz9fwvifprbqrnarf8     ubuntu180403   Ready     Active                          20.10.17
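
With more than a couple of workers, copying the join command by hand gets error-prone. The following is a minimal sketch that fetches only the token with `docker swarm join-token -q worker` and runs the join command on each worker over SSH. It assumes passwordless SSH from the manager to the workers, which is not part of the setup described in this article, and the function names are hypothetical.

```shell
# Build the join command for a given token and manager address.
# $1 = join token, $2 = manager address as host:port
join_command() {
    printf 'docker swarm join --token %s %s\n' "$1" "$2"
}

# Join every listed host to the swarm as a worker.
# $1 = manager address (host:port), remaining args = worker hosts
join_workers() {
    manager_addr=$1
    shift
    # -q prints only the token instead of the full instructions.
    token=$(docker swarm join-token -q worker) || return 1
    for host in "$@"; do
        # Assumption: passwordless SSH access to each worker host.
        ssh "$host" "$(join_command "$token" "$manager_addr")"
    done
}
```

For the hosts in this article the call would be `join_workers 192.168.20.131:2377 192.168.20.132 192.168.20.133`, run on the manager node.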

Deploying a Service to the Cluster

Run the deployment command on the manager node:

$ docker service create --replicas 1 --name helloworld alpine ping docker.com
thngg6ia686cfpaigibns64pm
overall progress: 1 out of 1 tasks 
1/1: running   [==================================================>] 
verify: Service converged

List the services:

$ docker service ls
ID             NAME         MODE         REPLICAS   IMAGE           PORTS
thngg6ia686c   helloworld   replicated   1/1        alpine:latest   

Inspecting a Deployed Service

# Run these commands on the manager node
# First list the services to get the service ID and name
$ docker service ls
ID             NAME         MODE         REPLICAS   IMAGE           PORTS
thngg6ia686c   helloworld   replicated   1/1        alpine:latest

# Inspect the service
# --pretty prints the details in a human-readable format
$ docker service inspect --pretty thngg6ia686c

ID:             thngg6ia686cfpaigibns64pm
Name:           helloworld
Service Mode:   Replicated
 Replicas:      1
Placement:
UpdateConfig:
 Parallelism:   1
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Update order:      stop-first
RollbackConfig:
 Parallelism:   1
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Rollback order:    stop-first
ContainerSpec:
 Image:         alpine:latest@sha256:7580ece7963bfa863801466c0a488f11c86f85d9988051a9f9c68cb27f6b7872
 Args:          ping docker.com 
 Init:          false
Resources:
Endpoint Mode:  vip

# Or, without --pretty, print the full JSON
$ docker service inspect thngg6ia686c
[
    {
        "ID": "thngg6ia686cfpaigibns64pm",
        "Version": {
            "Index": 21
        },
        "CreatedAt": "2022-07-31T07:35:45.769412012Z",
        "UpdatedAt": "2022-07-31T07:35:45.769412012Z",
        "Spec": {
            "Name": "helloworld",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "alpine:latest@sha256:7580ece7963bfa863801466c0a488f11c86f85d9988051a9f9c68cb27f6b7872",
                    "Args": [
                        "ping",
                        "docker.com"
                    ],
                    "Init": false,
                    "StopGracePeriod": 10000000000,
                    "DNSConfig": {},
                    "Isolation": "default"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "Delay": 5000000000,
                    "MaxAttempts": 0
                },
                "Placement": {
                    "Platforms": [
                        {
                            "Architecture": "amd64",
                            "OS": "linux"
                        },
                        {
                            "OS": "linux"
                        },
                        {
                            "OS": "linux"
                        },
                        {
                            "Architecture": "arm64",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "386",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "ppc64le",
                            "OS": "linux"
                        },
                        {
                            "Architecture": "s390x",
                            "OS": "linux"
                        }
                    ]
                },
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 1
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "RollbackConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "EndpointSpec": {
                "Mode": "vip"
            }
        },
        "Endpoint": {
            "Spec": {}
        }
    }
]

# See which cluster node the service is running on
# In this example the task runs on the manager node and is in the Running state
$ docker service ps thngg6ia686c
ID             NAME           IMAGE           NODE         DESIRED STATE   CURRENT STATE           ERROR     PORTS
wv9l1f8orjpi   helloworld.1   alpine:latest   ubuntu1804   Running         Running 8 minutes ago    

# On the manager node, look at the container that backs the service task
$ docker ps
CONTAINER ID   IMAGE           COMMAND             CREATED          STATUS          PORTS     NAMES
10464956ead3   alpine:latest   "ping docker.com"   10 minutes ago   Up 10 minutes             helloworld.1.wv9l1f8orjpiawmfq7r8nl0tm
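
When only one or two values are needed, the full JSON from `docker service inspect` is more than necessary. As a sketch, the `--format` option accepts a Go template that selects a single field. The field paths below follow the JSON structure shown above, and the helper names are invented for illustration.

```shell
# Print one field of a service.
# $1 = service name or ID, $2 = Go template such as '{{.Spec.Name}}'
service_field() {
    docker service inspect --format "$2" "$1"
}

# Desired replica count of a replicated service.
service_replicas() {
    service_field "$1" '{{.Spec.Mode.Replicated.Replicas}}'
}

# Image reference the service is configured to run.
service_image() {
    service_field "$1" '{{.Spec.TaskTemplate.ContainerSpec.Image}}'
}
```

For example, `service_replicas helloworld` and `service_image helloworld` query the service deployed earlier.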

Scaling a Service

Scaling a service means changing the number of containers it runs.
Command format:

$ docker service scale <SERVICE-ID>=<NUMBER-OF-TASKS>

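Note: a container running as part of a service is called a "task", so <NUMBER-OF-TASKS> in the command above is the number of containers the service should run.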

$ docker service scale thngg6ia686c=5
thngg6ia686c scaled to 5
overall progress: 5 out of 5 tasks 
1/5: running   [==================================================>] 
2/5: running   [==================================================>] 
3/5: running   [==================================================>] 
4/5: running   [==================================================>] 
5/5: running   [==================================================>] 
verify: Service converged 

After scaling, check which nodes the service's tasks run on:

$ docker service ps thngg6ia686c
ID             NAME           IMAGE           NODE           DESIRED STATE   CURRENT STATE                ERROR     PORTS
wv9l1f8orjpi   helloworld.1   alpine:latest   ubuntu1804     Running         Running 18 minutes ago                 
3qjaozoybk4n   helloworld.2   alpine:latest   ubuntu180403   Running         Running about a minute ago             
c5tpqa3cceit   helloworld.3   alpine:latest   ubuntu180403   Running         Running about a minute ago             
qzfhsy8hlhq8   helloworld.4   alpine:latest   ubuntu1804     Running         Running about a minute ago             
zb6sy91hqlqt   helloworld.5   alpine:latest   ubuntu180402   Running         Running about a minute ago

As the output shows, the helloworld service now runs 5 containers: 2 on the manager node ubuntu1804, 2 on the worker node ubuntu180403, and 1 on the worker node ubuntu180402.
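
Counting tasks per node by eye stops working once a service has many replicas. The sketch below tallies node names read from standard input, so it can be fed from `docker service ps` with a `--format '{{.Node}}'` template. The function name `tasks_per_node` is hypothetical.

```shell
# Read one node name per line on stdin and print "<node> <count>",
# sorted by node name. Blank lines are ignored.
tasks_per_node() {
    awk 'NF { count[$1]++ } END { for (n in count) print n, count[n] }' \
        | LC_ALL=C sort
}
```

A possible invocation on the manager node is `docker service ps --filter desired-state=running --format '{{.Node}}' helloworld | tasks_per_node`.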

Check the containers on each node:

# Containers on the manager node ubuntu1804
$ docker ps
CONTAINER ID   IMAGE           COMMAND             CREATED          STATUS          PORTS     NAMES
664b4330b0e7   alpine:latest   "ping docker.com"   3 minutes ago    Up 3 minutes              helloworld.4.qzfhsy8hlhq851exsha9siwd7
10464956ead3   alpine:latest   "ping docker.com"   20 minutes ago   Up 20 minutes             helloworld.1.wv9l1f8orjpiawmfq7r8nl0tm

# Containers on the worker node ubuntu180402
$ docker ps
CONTAINER ID   IMAGE           COMMAND             CREATED         STATUS         PORTS     NAMES
fb6fb61d5534   alpine:latest   "ping docker.com"   3 minutes ago   Up 3 minutes             helloworld.5.zb6sy91hqlqtcc4e03c50rtk1

# Containers on the worker node ubuntu180403
$ docker ps
CONTAINER ID   IMAGE           COMMAND             CREATED         STATUS         PORTS     NAMES
90b36686eb9e   alpine:latest   "ping docker.com"   3 minutes ago   Up 3 minutes             helloworld.3.c5tpqa3cceitgvi00idxzmnnm
26771a491b7a   alpine:latest   "ping docker.com"   3 minutes ago   Up 3 minutes             helloworld.2.3qjaozoybk4nod2m165xlism6

Removing a Service

Command format for removing a service from the cluster:

$ docker service rm <SERVICE-ID>

Remove the helloworld service:

$ docker service rm thngg6ia686c
thngg6ia686c

After removal, inspecting the service reports that it no longer exists:

$ docker service inspect thngg6ia686c
[]
Status: Error: no such service: thngg6ia686c, Code: 1

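Once a service is removed, its containers on every node in the cluster are removed as well (the cleanup may take a few seconds).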

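Rolling Service Updates

To control the pace of a rolling update, pass the --update-delay option when creating the service to set the delay between updating one task and the next. The value can use the units h (hours), m (minutes), and s (seconds).
By default the scheduler updates one task at a time; the --update-parallelism option sets the maximum number of tasks updated simultaneously.
By default, once the update to a single task returns RUNNING, the scheduler moves on to the next task until all tasks are updated; if a task returns FAILED during the update, the scheduler pauses the update. This behavior can be changed with the --update-failure-action option of docker service create or docker service update.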
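
To make the three options concrete, here is a sketch of a service definition that sets all of them at creation time. The service name `redis-demo`, the parallelism of 2, and the `rollback` failure action are illustrative choices, not values used elsewhere in this article. `--update-failure-action` accepts `pause`, `continue`, or `rollback`.

```shell
# Create a 3-replica service with an explicit rolling-update policy.
# Assumed values: name "redis-demo", parallelism 2, rollback on failure.
create_service_with_update_policy() {
    docker service create \
        --replicas 3 \
        --name redis-demo \
        --update-delay 10s \
        --update-parallelism 2 \
        --update-failure-action rollback \
        redis:6.0.16
}
```

The function must be called on a manager node. The same three options can also be passed to `docker service update` to change the policy of an existing service.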

The following demonstrates a rolling update of a redis service from 6.0.16 to 6.2:

$ docker service create --replicas 3 --name redis --update-delay 10s redis:6.0.16
3lxjlfktrwykf9kkwtd2pyfwy
overall progress: 3 out of 3 tasks 
1/3: running   [==================================================>] 
2/3: running   [==================================================>] 
3/3: running   [==================================================>] 
verify: Service converged

Check which nodes the service's tasks run on:

$ docker service ps 3lxjlfktrwyk
ID             NAME      IMAGE          NODE           DESIRED STATE   CURRENT STATE                ERROR     PORTS
rwhmyx1g3x9r   redis.1   redis:6.0.16   ubuntu180402   Running         Running about a minute ago             
y00tkcs6gpu9   redis.2   redis:6.0.16   ubuntu180403   Running         Running 9 minutes ago                  
n57q8pa98oak   redis.3   redis:6.0.16   ubuntu1804     Running         Running 7 minutes ago

Inspect the service:

$ docker service inspect --pretty 3lxjlfktrwyk

ID:             3lxjlfktrwykf9kkwtd2pyfwy
Name:           redis
Service Mode:   Replicated
 Replicas:      3
Placement:
UpdateConfig:
 Parallelism:   1
 Delay:         10s
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Update order:      stop-first
RollbackConfig:
 Parallelism:   1
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Rollback order:    stop-first
ContainerSpec:
 Image:         redis:6.0.16@sha256:8e67c8caf4537cd85a2284347c4f52c723b636769a06891e73703563de16469f # the running redis version is 6.0.16
 Init:          false
Resources:
Endpoint Mode:  vip

Run the following command to upgrade redis from 6.0.16 to 6.2:

$ docker service update --image redis:6.2 3lxjlfktrwyk
3lxjlfktrwyk
overall progress: 3 out of 3 tasks 
1/3: running   [==================================================>] 
2/3: running   [==================================================>] 
3/3: running   [==================================================>] 
verify: Service converged

After the update completes, check the service's tasks again:

$ docker service ps 3lxjlfktrwyk
ID             NAME          IMAGE          NODE           DESIRED STATE   CURRENT STATE                 ERROR     PORTS
1ycp82mqe3w3   redis.1       redis:6.2      ubuntu180402   Running         Running 50 seconds ago                  
rwhmyx1g3x9r    \_ redis.1   redis:6.0.16   ubuntu180402   Shutdown        Shutdown 59 seconds ago                 
zyw7m6k162hb   redis.2       redis:6.2      ubuntu180403   Running         Running 31 seconds ago                  
y00tkcs6gpu9    \_ redis.2   redis:6.0.16   ubuntu180403   Shutdown        Shutdown 38 seconds ago                 
6lfoqssiopka   redis.3       redis:6.2      ubuntu1804     Running         Running about a minute ago              
n57q8pa98oak    \_ redis.3   redis:6.0.16   ubuntu1804     Shutdown        Shutdown about a minute ago  

The output shows that the Redis 6.0.16 tasks have been shut down and the running tasks use Redis 6.2, so the rolling update was executed and completed successfully.

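By default, the scheduler applies a rolling update as follows:
1. Stop the first task.
2. Schedule the update for the stopped task.
3. Start the container for the updated task.
4. If the update to the task returns RUNNING, wait for the configured delay (set with the --update-delay option), then start updating the next task.
5. If any task returns FAILED during the update, pause the update.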
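
The steps above can be sketched as a small simulation. It makes no Docker calls and is not how the scheduler is implemented: it only prints the sequence of actions for a list of task names, with an optional task that is made to fail, so the control flow is easy to follow. All names in it are hypothetical.

```shell
# Simulate the default rolling-update loop (one task at a time).
# $1 = delay in seconds between tasks
# $2 = name of a task whose update should fail ("" for none)
# remaining args = task names, in update order
simulate_rolling_update() {
    delay=$1
    failing=$2
    shift 2
    while [ "$#" -gt 0 ]; do
        task=$1
        shift
        echo "stop $task"
        echo "schedule update for $task"
        echo "start updated container for $task"
        if [ "$task" = "$failing" ]; then
            # A FAILED task pauses the whole update; later tasks stay untouched.
            echo "$task FAILED: update paused"
            return 1
        fi
        echo "$task RUNNING"
        # Wait only when another task is still pending.
        if [ "$#" -gt 0 ]; then sleep "$delay"; fi
    done
    echo "update completed"
}
```

Calling `simulate_rolling_update 10 "" redis.1 redis.2 redis.3` walks through a successful update, while passing `redis.2` as the second argument stops after the second task and never reaches `redis.3`.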

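A consequence of this update policy is that a failed update can leave the service partially updated: some containers run the new version while others still run the old one.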
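
One way to detect such a partial update is to list the image of every running task and check that only one distinct value appears. The sketch below does that on text read from standard input. The function names are hypothetical, and what to do on a mismatch (for example `docker service rollback`) is left to the operator.

```shell
# Read one image reference per line on stdin; print each distinct value once.
distinct_images() {
    LC_ALL=C sort -u
}

# Succeed only if every line on stdin names the same image.
# Empty input counts as "not complete".
update_is_complete() {
    [ "$(distinct_images | wc -l | tr -d ' ')" -eq 1 ]
}
```

On the manager node the input could come from `docker service ps --filter desired-state=running --format '{{.Image}}' redis | update_is_complete`.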

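Draining a Node

For maintenance or other reasons, you may need to take a node in the cluster out of service.
Note: taking a node out of service here means it no longer carries out its duties as a cluster node, for example it no longer receives tasks for services deployed to the cluster. It does not prevent you from running standalone containers on that node.

Check the current status of the cluster nodes: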

$ docker node ls
ID                            HOSTNAME       STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804     Ready     Active         Leader           20.10.17
r1p3cziqc10rsva03b6unq754     ubuntu180402   Ready     Active                          20.10.17
x39xivinz9fwvifprbqrnarf8     ubuntu180403   Ready     Active                          20.10.17

All nodes in the cluster are currently in a normal state.

Suppose the node named ubuntu180403 now needs to be drained.
Command format:

$ docker node update --availability drain <NODE-ID>
# Drain the cluster node ubuntu180403
$ docker node update  --availability drain x39xivinz9fwvifprbqrnarf8
x39xivinz9fwvifprbqrnarf8

Now check the node status again:

$ docker node ls
ID                            HOSTNAME       STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804     Ready     Active         Leader           20.10.17
r1p3cziqc10rsva03b6unq754     ubuntu180402   Ready     Active                          20.10.17
x39xivinz9fwvifprbqrnarf8     ubuntu180403   Ready     Drain                           20.10.17

The availability of node ubuntu180403 is now Drain.
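
Before starting maintenance it can be useful to wait until nothing is scheduled on the drained node any more. The following is a sketch under assumptions: it polls `docker node ps` with the `desired-state=running` filter and treats empty `-q` output as "drained". The function name, the 2-second poll interval, and the default of 30 attempts are arbitrary illustrative choices, and containers may still be stopping briefly after the check passes.

```shell
# Drain a node and wait until no task is scheduled to run on it.
# $1 = node ID or hostname, $2 = max polls (assumed default: 30)
drain_and_wait() {
    node=$1
    attempts=${2:-30}
    docker node update --availability drain "$node" >/dev/null || return 1
    while [ "$attempts" -gt 0 ]; do
        # -q prints only task IDs; no output means no task should run here.
        if [ -z "$(docker node ps --filter desired-state=running -q "$node")" ]; then
            echo "$node drained"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 2
    done
    echo "timed out waiting for $node" >&2
    return 1
}
```

It is meant to be run on a manager node, for example `drain_and_wait x39xivinz9fwvifprbqrnarf8`.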

You can also inspect the node:

$ docker node inspect --pretty x39xivinz9fwvifprbqrnarf8
ID:                     x39xivinz9fwvifprbqrnarf8
Hostname:               ubuntu180403
Joined at:              2022-07-31 07:31:14.720949412 +0000 utc
Status:
 State:                 Ready
 Availability:          Drain # the node is in the Drain state
 Address:               192.168.20.133
Platform:
 Operating System:      linux
 Architecture:          x86_64
Resources:
 CPUs:                  2
 Memory:                1.922GiB
Plugins:
 Log:           awslogs, fluentd, gcplogs, gelf, journald, json-file, local, logentries, splunk, syslog
 Network:               bridge, host, ipvlan, macvlan, null, overlay
 Volume:                local
Engine Version:         20.10.17
TLS Info:
 TrustRoot:
-----BEGIN CERTIFICATE-----
MIIBajCCARCgAwIBAgIUCJDuGh7C7z0MnoExf6/61PYFJ0gwCgYIKoZIzj0EAwIw
EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMjIwNzMxMDcwODAwWhcNNDIwNzI2MDcw
ODAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH
A0IABOVrDuLnZhlJJFsgWkZIulSRnAFWJNxNjzhBdiNGzMkFwyOv3yQkcTYfGpb9
SBxtXqtbe7VIY/wN3P1zgsBwT0GjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB
Af8EBTADAQH/MB0GA1UdDgQWBBQGto4fl4Ui2t+i8MDvPpJR5o+5BDAKBggqhkjO
PQQDAgNIADBFAiBq+jgAEQGw8B5BaQNAynZs4fvpdTDQZmKF0JMyl55n7AIhANea
t3A86SNOA56whYLkMm84teALAkjI3AR0cTwCzQXx
-----END CERTIFICATE-----

 Issuer Subject:        MBMxETAPBgNVBAMTCHN3YXJtLWNh
 Issuer Public Key:     MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE5WsO4udmGUkkWyBaRki6VJGcAVYk3E2POEF2I0bMyQXDI6/fJCRxNh8alv1IHG1eq1t7tUhj/A3c/XOCwHBPQQ==

Now look at the state of the service's tasks in the cluster:

$ docker service  ps redis 
ID             NAME          IMAGE          NODE           DESIRED STATE   CURRENT STATE             ERROR                              PORTS
6il5oakb3g6n   redis.1       redis:6.2      ubuntu1804     Running         Running 16 minutes ago                                       
l9p05lgbim1b    \_ redis.1   redis:6.2      ubuntu1804     Shutdown        Failed 16 minutes ago     "No such container: redis.1.l9…"   
bi9anac8pjn8    \_ redis.1   redis:6.2      ubuntu1804     Shutdown        Failed 2 hours ago        "No such container: redis.1.bi…"   
95wxwgquz3bj    \_ redis.1   redis:6.2      ubuntu1804     Shutdown        Shutdown 2 hours ago                                         
rwhmyx1g3x9r    \_ redis.1   redis:6.0.16   ubuntu180402   Shutdown        Shutdown 15 minutes ago                                      
q99yn35pg712   redis.2       redis:6.2      ubuntu1804     Running         Running 16 minutes ago     # a new task for the service is started on another available node
kh5clss94v0w    \_ redis.2   redis:6.2      ubuntu1804     Shutdown        Failed 16 minutes ago     "No such container: redis.2.kh…"   
rsvijco1pbsw    \_ redis.2   redis:6.2      ubuntu1804     Shutdown        Failed 2 hours ago        "No such container: redis.2.rs…"   
btgkuo7lnxom    \_ redis.2   redis:6.2      ubuntu1804     Shutdown        Shutdown 2 hours ago                                         
zyw7m6k162hb    \_ redis.2   redis:6.2      ubuntu180403   Shutdown        Shutdown 15 minutes ago    # the task on the drained node was also shut down
y4q1yj1bjjof   redis.3       redis:6.2      ubuntu1804     Running         Running 16 minutes ago                                       
hd3vimy76hur    \_ redis.3   redis:6.2      ubuntu1804     Shutdown        Failed 16 minutes ago     "No such container: redis.3.hd…"   
rysst79qpko3    \_ redis.3   redis:6.2      ubuntu1804     Shutdown        Failed 2 hours ago        "No such container: redis.3.ry…"   
6lfoqssiopka    \_ redis.3   redis:6.2      ubuntu1804     Shutdown        Failed 2 hours ago        "task: non-zero exit (255)"        
n57q8pa98oak    \_ redis.3   redis:6.0.16   ubuntu1804     Shutdown        Shutdown 2 hours ago     

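Reactivating a Node

Reactivation applies to a node that was drained earlier. A node that has never joined the cluster cannot be operated on this way.

Command format:

$ docker node update --availability active <NODE-ID>

Example:

# Cluster status before reactivation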
$ docker node ls
ID                            HOSTNAME       STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804     Ready     Active         Leader           20.10.17
r1p3cziqc10rsva03b6unq754     ubuntu180402   Ready     Active                          20.10.17
x39xivinz9fwvifprbqrnarf8     ubuntu180403   Ready     Drain                           20.10.17  # this node has been drained

# Reactivate the node
$ docker node update --availability active x39xivinz9fwvifprbqrnarf8
x39xivinz9fwvifprbqrnarf8

# Check the cluster status again after reactivation
# All nodes are Active
$ docker node ls
ID                            HOSTNAME       STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
n4kf30mgtukzq2dw0hltgk8t7 *   ubuntu1804     Ready     Active         Leader           20.10.17
r1p3cziqc10rsva03b6unq754     ubuntu180402   Ready     Active                          20.10.17
x39xivinz9fwvifprbqrnarf8     ubuntu180403   Ready     Active                          20.10.17

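Summary

In practice, even when a cluster node goes down unexpectedly, it automatically rejoins the Docker cluster once it has restarted successfully and runs the service tasks previously assigned to it.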


Source: https://www.cnblogs.com/nuccch/p/16538590.html