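Zero-Downtime Deployment of Nginx + Spring Boot Master/Slave Nodes

Overall architecture

Architecture overview

A master/slave pair of application nodes runs on a single server. Nginx acts as the reverse proxy and fails over automatically from the master node to the backup (slave) node.

[Figure: architecture diagram]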

Server information

  • Host: Alibaba Cloud ECS
  • IP: 1.2.3.4
  • Domains: epay.twenhub.com / epaydoc.twenhub.com
  • Web server: OpenResty 1.27.1.2

Service port allocation

  • HTTP: 80 (always redirects to HTTPS)
  • HTTPS: 443 (main site and documentation site)
  • Master node: 9988
  • Slave node: 9989
  • NotifyPro service: 41005
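
To confirm that every port in the list above has a listener, the output of `ss -ltn` can be filtered. The following is a minimal sketch: the function name `missing_ports` is hypothetical, and the port list is copied from the allocation above.

```shell
#!/bin/bash
# Ports taken from the allocation list above.
EXPECTED_PORTS="80 443 9988 9989 41005"

# Hypothetical helper: reads `ss -ltn` output on stdin and prints every
# expected port that has no listening socket.
missing_ports() {
    local listing port
    listing=$(cat)
    for port in $EXPECTED_PORTS; do
        # ":<port>" followed by whitespace matches the local address column.
        if ! printf '%s\n' "$listing" | grep -Eq ":${port}[[:space:]]"; then
            echo "$port"
        fi
    done
}
```

On the server this would be run as `ss -ltn | missing_ports`; empty output means all five ports are listening.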

Directory structure

/home/server/
├── backend/digital_card_backend/
│   ├── master/
│   │   ├── deploy_master.sh
│   │   ├── jeecg-system-start-3.4.3.jar
│   │   └── master.log
│   └── slave/
│       ├── deploy_slave.sh
│       ├── jeecg-system-start-3.4.3.jar
│       └── slave.log
└── fontend/
    ├── digital_card_frontend/    # main site frontend
    └── doc/                      # documentation site frontend

/home/flowapp/
└── jeecg-system-start-3.4.3.tgz    # deployment artifact

/home/nginxcert/
├── _.twenhub.com.crt
└── _.twenhub.com.key

/usr/local/openresty/nginx/conf/
└── nginx.conf

Nginx configuration

Full configuration file

Path: /usr/local/openresty/nginx/conf/nginx.conf

# user  nginx;
worker_processes  auto;
worker_rlimit_nofile 65535;
worker_shutdown_timeout 10s;

error_log  logs/error.log  warn;
pid        logs/nginx.pid;

events {
    worker_connections 65535;
    multi_accept on;
    use epoll;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    # Gzip compression
    gzip  on;
    gzip_min_length 1k;
    gzip_comp_level 6;
    gzip_vary  on;
    gzip_disable "MSIE [1-6]\\.";
    gzip_types
        text/plain text/css
        application/javascript application/json
        application/xml text/javascript;

    sendfile        on;
    keepalive_timeout 65;
    server_tokens off;
    client_max_body_size 30m;
    client_body_timeout 300s;

    # WebSocket support
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      "";
    }

    # Backend master/backup service
    upstream backend_service {
        zone backend_zone 64k;
        server 127.0.0.1:9988 max_fails=2 fail_timeout=5s;
        server 127.0.0.1:9989 backup max_fails=1 fail_timeout=5s;
        keepalive 64;
        keepalive_timeout 60s;
        keepalive_requests 1000;
    }

    # Notification service
    upstream notify_service {
        zone notify_zone 64k;
        server 127.0.0.1:41005 max_fails=2 fail_timeout=5s;
        keepalive 32;
        keepalive_timeout 60s;
        keepalive_requests 1000;
    }

    # Redirect HTTP to HTTPS
    server {
        listen       80;
        listen       [::]:80;
        server_name  epay.twenhub.com;
        return 301 https://$host$request_uri;
    }

    # Documentation site
    server {
        listen       443 ssl;
        server_name  epaydoc.twenhub.com;
    
        ssl_certificate      /home/nginxcert/_.twenhub.com.crt;
        ssl_certificate_key  /home/nginxcert/_.twenhub.com.key;
        ssl_session_cache    shared:SSL:10m;
        ssl_session_timeout  10m;
        ssl_protocols        TLSv1.2 TLSv1.3;
        ssl_ciphers          HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;
    
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
        add_header Cache-Control "no-store, no-cache, must-revalidate, max-age=0" always;
        add_header Pragma "no-cache" always;
        add_header Expires "0" always;
    
        root  /home/server/fontend/doc;
        index index.html index.htm;
    
        location / {
            try_files $uri $uri/ =404;
        }
    }

    # Main site
    server {
        listen       443 ssl;
        listen       [::]:443 ssl;
        http2        on;
        server_name  epay.twenhub.com;

        ssl_certificate      /home/nginxcert/_.twenhub.com.crt;
        ssl_certificate_key  /home/nginxcert/_.twenhub.com.key;
        ssl_session_cache    shared:SSL:10m;
        ssl_session_timeout  10m;
        ssl_protocols        TLSv1.2 TLSv1.3;
        ssl_ciphers          HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;

        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

        # Frontend SPA
        root  /home/server/fontend/digital_card_frontend;
        index index.html;
        location / {
            try_files $uri $uri/ /index.html;
        }

        # Backend API
        location /api/ {
            proxy_pass          http://backend_service;
            proxy_http_version  1.1;

            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
            proxy_next_upstream_tries 2;

            proxy_connect_timeout  1s;
            proxy_send_timeout     30s;
            proxy_read_timeout     120s;

            proxy_set_header    Upgrade           $http_upgrade;
            proxy_set_header    Connection        $connection_upgrade;
            proxy_set_header    Host              $host;
            proxy_set_header    X-Real-IP         $remote_addr;
            proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header    X-Forwarded-Proto $scheme;
        }

        # Notification service
        location /notify/ {
            proxy_pass          http://notify_service/notify_pro/;
            proxy_http_version  1.1;

            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
            proxy_next_upstream_tries 2;

            proxy_connect_timeout  3s;
            proxy_send_timeout     300s;
            proxy_read_timeout     300s;

            proxy_set_header    Upgrade           $http_upgrade;
            proxy_set_header    Connection        $connection_upgrade;
            proxy_set_header    Host              $host;
            proxy_set_header    X-Real-IP         $remote_addr;
            proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header    X-Forwarded-Proto $scheme;
        }
    }
}

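Configuration notes

Master/backup strategy

  • Master node (9988): the primary server; handles all requests while it is available
  • Slave node (9989): marked backup; receives requests only when the master is unavailable
  • Failure detection: 2 failed attempts within 5 seconds (max_fails=2, fail_timeout=5s) mark the master as unavailable for the next 5 seconds, after which Nginx tries it again
  • Fast failure: proxy_connect_timeout=1s; a connection that cannot be established within 1 second counts as a failed attempt and the request is passed to the next server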

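Connection reuse

  • keepalive 64: each worker process keeps up to 64 idle connections to the upstream servers
  • keepalive_timeout 60s: an idle upstream connection is closed after 60 seconds
  • keepalive_requests 1000: a connection is closed after serving 1000 requests and replaced by a new one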

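Retry mechanism

  • proxy_next_upstream: lists the conditions (connection error, timeout, invalid header, HTTP 500/502/503/504) under which a request is passed to the next server. Because non_idempotent is not set, a request with a non-idempotent method such as POST is not retried once it has been sent to a server
  • proxy_next_upstream_tries 2: at most 2 attempts per request, counting the first, so a failed request is retried once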
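
Changes to any of the directives above only take effect after Nginx reloads its configuration. The following is a minimal sketch of a validate-then-reload step. The binary path is an assumption inferred from the configuration path (/usr/local/openresty/nginx/conf/nginx.conf), and the function name `reload_nginx` is hypothetical.

```shell
#!/bin/bash
# Assumed location of the OpenResty nginx binary; override NGINX_BIN if the
# installation differs.
NGINX_BIN=${NGINX_BIN:-/usr/local/openresty/nginx/sbin/nginx}

reload_nginx() {
    # -t parses the configuration and reports errors without applying it.
    if ! "$NGINX_BIN" -t; then
        echo "nginx configuration test failed, not reloading" >&2
        return 1
    fi
    # -s reload signals the master process, which starts workers with the
    # new configuration and shuts the old workers down gracefully.
    "$NGINX_BIN" -s reload
}
```

Calling `reload_nginx` after editing nginx.conf leaves the running configuration untouched if the syntax check fails. With worker_shutdown_timeout 10s, old workers are given at most 10 seconds to finish their connections.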

Deployment scripts

Master node deployment script

Path: /home/server/backend/digital_card_backend/master/deploy_master.sh

#!/bin/bash

#---------------------------Configuration start----------------------------------
APP_NAME=jeecg-system-start-3.4.3
PROG_NAME=$0
ACTION=$1
APP_START_TIMEOUT_SECONDS=120
APP_PORT=9988
BASE_URL=http://127.0.0.1:${APP_PORT}/api
HEALTH_CHECK_URL=${BASE_URL}/noauth/heart
APP_HOME=/home/server/backend/digital_card_backend/master
JAR_NAME=${APP_HOME}/${APP_NAME}.jar
JAVA_OUT=${APP_HOME}/master.log
PACKAGE_PATH=/home/flowapp/jeecg-system-start-3.4.3.tgz
SHUTDOWN_TIMEOUT_SECONDS=40
SPRING_PROFILE="master"
#---------------------------Configuration end----------------------------------

mkdir -p ${APP_HOME}

usage() {
    echo "Usage: $PROG_NAME {start|stop|restart|deploy}"
    exit 2
}

health_check() {
    exptime=0
    echo "checking ${HEALTH_CHECK_URL}"
    while true
        do
            status_code=$(/usr/bin/curl -L -o /dev/null --connect-timeout 5 -s -w '%{http_code}' "${HEALTH_CHECK_URL}")
            if [ "$?" != "0" ]; then
               echo -n -e "\rapplication not started"
            else
                echo "code is $status_code"
                if [ "$status_code" == "200" ];then
                    break
                fi
            fi
            sleep 1
            ((exptime++))

            echo -e "\rWait app to pass health check: $exptime..."

            if [ $exptime -gt ${APP_START_TIMEOUT_SECONDS} ]; then
                echo 'app start failed'
               exit 1
            fi
        done
    echo "check ${HEALTH_CHECK_URL} success"
}

start_application() {
    echo "starting java process"
    nohup java -jar -Xmx512m -Xms256m -Dspring.profiles.active=${SPRING_PROFILE} -Dserver.port=${APP_PORT} ${JAR_NAME} >${JAVA_OUT} 2>&1 &
    echo "started java process"
}

stop_application() {
    checkjavapid=$(ps -ef | grep java | grep ${APP_NAME} | grep ${APP_PORT} | grep -v grep | awk '{print $2}')
    
    if [[ ! $checkjavapid ]]; then
        echo "No java process to stop (process not found)"
        return
    fi

    echo "Sending SIGTERM to Java process with PID ${checkjavapid}."
    kill -15 ${checkjavapid}

    for ((i=0; i<$SHUTDOWN_TIMEOUT_SECONDS; i++)); do
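        # Poll the health endpoint; curl prints 000 when nothing answers.
        http_status=$(curl -o /dev/null -s -w "%{http_code}\n" ${HEALTH_CHECK_URL})

        # Any status other than 200 is treated as "stopped". The kill -9 below
        # then removes the process in case it is still shutting down.
        if [ "$http_status" != "200" ]; then
            echo "Java process stopped gracefully."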
            if [ ! -z "$checkjavapid" ]; then
                kill -9 $checkjavapid
                echo "Killed process on pid $checkjavapid."
            fi
            return
        fi
        
        sleep 1
        echo "Waiting for Java process to stop..."
    done

    echo "Java process did not stop after $SHUTDOWN_TIMEOUT_SECONDS seconds, sending SIGKILL"
    if [ ! -z "$checkjavapid" ]; then
        kill -9 $checkjavapid
        echo "Killed process on port $APP_PORT."
    fi
}

start() {
    start_application
    health_check
}

stop() {
    stop_application
}

deploy() {
    stop
    
    echo "Unpacking $PACKAGE_PATH to $APP_HOME"
    if tar zxvf $PACKAGE_PATH -C $APP_HOME; then
        echo "Unpack finished successfully."
        
        echo "Removing the package $PACKAGE_PATH"
        rm -f $PACKAGE_PATH
        if [ $? -eq 0 ]; then
            echo "Package removed successfully."
        else
            echo "Failed to remove the package."
        fi
    else
        echo "Error occurred during unpacking. Exiting."
        exit 1
    fi
    
    start
}

case "$ACTION" in
    start)
        start
    ;;
    stop)
        stop
    ;;
    restart)
        stop
        start
    ;;
    deploy)
        deploy
    ;;
    *)
        usage
    ;;
esac

Slave node deployment script

Path: /home/server/backend/digital_card_backend/slave/deploy_slave.sh

#!/bin/bash

#---------------------------Configuration start----------------------------------
APP_NAME=jeecg-system-start-3.4.3
PROG_NAME=$0
ACTION=$1
APP_START_TIMEOUT_SECONDS=120
APP_PORT=9989
BASE_URL=http://127.0.0.1:${APP_PORT}/api
HEALTH_CHECK_URL=${BASE_URL}/noauth/heart
APP_HOME=/home/server/backend/digital_card_backend/slave
JAR_NAME=${APP_HOME}/${APP_NAME}.jar
JAVA_OUT=${APP_HOME}/slave.log
PACKAGE_PATH=/home/flowapp/jeecg-system-start-3.4.3.tgz
SHUTDOWN_TIMEOUT_SECONDS=40
SPRING_PROFILE="slave"
#---------------------------Configuration end----------------------------------

mkdir -p ${APP_HOME}

usage() {
    echo "Usage: $PROG_NAME {start|stop|restart|deploy}"
    exit 2
}

health_check() {
    exptime=0
    echo "checking ${HEALTH_CHECK_URL}"
    while true
        do
            status_code=$(/usr/bin/curl -L -o /dev/null --connect-timeout 5 -s -w '%{http_code}' "${HEALTH_CHECK_URL}")
            if [ "$?" != "0" ]; then
               echo -n -e "\rapplication not started"
            else
                echo "code is $status_code"
                if [ "$status_code" == "200" ];then
                    break
                fi
            fi
            sleep 1
            ((exptime++))

            echo -e "\rWait app to pass health check: $exptime..."

            if [ $exptime -gt ${APP_START_TIMEOUT_SECONDS} ]; then
                echo 'app start failed'
               exit 1
            fi
        done
    echo "check ${HEALTH_CHECK_URL} success"
}

start_application() {
    echo "starting java process"
    nohup java -jar -Xmx512m -Xms256m -Dspring.profiles.active=${SPRING_PROFILE} -Dserver.port=${APP_PORT} ${JAR_NAME} >${JAVA_OUT} 2>&1 &
    echo "started java process"
}

stop_application() {
    checkjavapid=$(ps -ef | grep java | grep ${APP_NAME} | grep ${APP_PORT} | grep -v grep | awk '{print $2}')
    
    if [[ ! $checkjavapid ]]; then
        echo "No java process to stop (process not found)"
        return
    fi

    echo "Sending SIGTERM to Java process with PID ${checkjavapid}."
    kill -15 ${checkjavapid}

    for ((i=0; i<$SHUTDOWN_TIMEOUT_SECONDS; i++)); do
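        # Poll the health endpoint; curl prints 000 when nothing answers.
        http_status=$(curl -o /dev/null -s -w "%{http_code}\n" ${HEALTH_CHECK_URL})

        # Any status other than 200 is treated as "stopped". The kill -9 below
        # then removes the process in case it is still shutting down.
        if [ "$http_status" != "200" ]; then
            echo "Java process stopped gracefully."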
            if [ ! -z "$checkjavapid" ]; then
                kill -9 $checkjavapid
                echo "Killed process on pid $checkjavapid."
            fi
            return
        fi
        
        sleep 1
        echo "Waiting for Java process to stop..."
    done

    echo "Java process did not stop after $SHUTDOWN_TIMEOUT_SECONDS seconds, sending SIGKILL"
    if [ ! -z "$checkjavapid" ]; then
        kill -9 $checkjavapid
        echo "Killed process on port $APP_PORT."
    fi
}

start() {
    start_application
    health_check
}

stop() {
    stop_application
}

deploy() {
    stop
    
    echo "Unpacking $PACKAGE_PATH to $APP_HOME"
    if tar zxvf $PACKAGE_PATH -C $APP_HOME; then
        echo "Unpack finished successfully."
        
        echo "Removing the package $PACKAGE_PATH"
        rm -f $PACKAGE_PATH
        if [ $? -eq 0 ]; then
            echo "Package removed successfully."
        else
            echo "Failed to remove the package."
        fi
    else
        echo "Error occurred during unpacking. Exiting."
        exit 1
    fi
    
    start
}

case "$ACTION" in
    start)
        start
    ;;
    stop)
        stop
    ;;
    restart)
        stop
        start
    ;;
    deploy)
        deploy
    ;;
    *)
        usage
    ;;
esac

Script notes

Differences between the master and slave nodes

Setting          Master        Slave
APP_PORT         9988          9989
APP_HOME         .../master    .../slave
JAVA_OUT         master.log    slave.log
SPRING_PROFILE   master        slave
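
Since the two scripts differ only in these four values, they could in principle be derived from a single role argument. The following is a minimal sketch of that idea, not the scripts' actual structure; the function name `node_config` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: set the four node-specific variables from a role name
# ("master" or "slave") so that one script could serve both nodes.
node_config() {
    local role=$1
    case "$role" in
        master) APP_PORT=9988 ;;
        slave)  APP_PORT=9989 ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
    APP_HOME=/home/server/backend/digital_card_backend/${role}
    JAVA_OUT=${APP_HOME}/${role}.log
    SPRING_PROFILE=${role}
}
```

A shared script would call `node_config "$2"` before the rest of its configuration block and be invoked as, for example, `deploy.sh deploy slave`. Keeping one copy removes the risk of the two scripts drifting apart.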

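Script functions

  • start: start the service, then run the health check
  • stop: graceful shutdown (SIGTERM → wait up to 40 s → SIGKILL)
  • restart: stop, then start
  • deploy: stop → unpack the artifact → start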

Health check

  • URL: http://127.0.0.1:{PORT}/api/noauth/heart
  • Timeout: 120 seconds
  • Interval: 1 second
  • Success condition: HTTP 200

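Graceful shutdown flow

  1. Send SIGTERM (kill -15)
  2. Poll the health endpoint once per second
  3. As soon as the endpoint returns anything other than 200, run kill -9 to clean up the process
  4. If the endpoint still returns 200 after 40 seconds, force the stop with kill -9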
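
In the flow above, kill -9 follows as soon as the health endpoint stops answering, which can cut off requests that are still in flight. An alternative is to wait for the process itself to exit. The following is a minimal sketch of that variant, not the behavior of the scripts in this document; the function name `graceful_stop` is hypothetical, and it assumes the application exits on its own after SIGTERM (for example with Spring Boot's `server.shutdown=graceful`).

```shell
#!/bin/bash
# Hypothetical alternative to stop_application: wait for the PID to
# disappear instead of watching the health endpoint.
graceful_stop() {
    local pid=$1 timeout=${2:-40} waited=0
    # If SIGTERM cannot be delivered, the process is already gone.
    kill -15 "$pid" 2>/dev/null || return 0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid still alive after ${timeout}s, sending SIGKILL"
            kill -9 "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "PID $pid exited after ${waited}s"
}
```

Inside stop_application this would be called as `graceful_stop "$checkjavapid" "$SHUTDOWN_TIMEOUT_SECONDS"`. A non-zero return value indicates that the process had to be killed.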

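Yunxiao (云效) pipeline deployment

Yunxiao pipeline configuration

Pipeline configuration

Pipeline name: digital_card_backend

Code source

  • Branch: master
  • Trigger: webhook or manual

Stages

  1. Java build and upload: build the JAR package and upload it to the artifact repository
  2. Slave node deployment: deploy the slave node
  3. Master node deployment: deploy the master node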

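Deployment task configuration

Slave node deployment task

Host group: 部署 (Deployment)
Run as: root (or another user with permission to run the script)
Deployment script

sleep 3
cd /home/server/backend/digital_card_backend/slave
bash deploy_slave.sh deploy

Notes

  • sleep 3: waits for Yunxiao to finish uploading the artifact (jeecg-system-start-3.4.3.tgz) to /home/flowapp/
  • The script is run with bash because it relies on bash-only syntax such as [[ ]] and (( ))
  • The script stops the old service, unpacks the new artifact, starts the new service, and runs the health check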
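
A fixed `sleep 3` only approximates "upload finished". A more direct check is to verify the package before the old service is stopped. The following is a minimal sketch under that assumption; the function name `verify_package` is hypothetical and is not part of the deployment scripts.

```shell
#!/bin/bash
# Hypothetical pre-deploy check: succeeds only if the package exists, is a
# readable gzip tar archive, and lists the expected JAR.
verify_package() {
    local package=$1 jar=$2 listing
    if [ ! -f "$package" ]; then
        echo "package not found: $package" >&2
        return 1
    fi
    # tar -tzf lists the archive contents; a truncated upload makes it fail.
    if ! listing=$(tar -tzf "$package" 2>/dev/null); then
        echo "package is not a complete archive: $package" >&2
        return 1
    fi
    # -F treats the JAR name as a fixed string, so its dots are literal.
    if ! printf '%s\n' "$listing" | grep -Fq "$jar"; then
        echo "$jar not found in $package" >&2
        return 1
    fi
}
```

In the task script this could replace the sleep, for example `verify_package /home/flowapp/jeecg-system-start-3.4.3.tgz jeecg-system-start-3.4.3.jar && bash deploy_slave.sh deploy`, so that a missing or partial package never stops a running node.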

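Master node deployment task

Host group: 部署 (Deployment)
Run as: root (or another user with permission to run the script)
Deployment script

sleep 3
cd /home/server/backend/digital_card_backend/master
bash deploy_master.sh deploy

Notes

  • sleep 3: gives the artifact upload time to finish. The Master task reads the artifact from the same path as the Slave task, and because deploy_slave.sh deletes the package after unpacking it, the package must be placed at that path again before this task runs
  • Because the Slave is deployed before the Master, the Slave is already running the new version and has passed its health check by the time this task starts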

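How zero-downtime deployment works

Zero-downtime deployment flow

Deployment sequence:

1. Yunxiao finishes the build and uploads the artifact to /home/flowapp/jeecg-system-start-3.4.3.tgz
   ↓
2. Wait 3 seconds (to let the upload complete)
   ↓
3. Deploy the Slave node
   - Stop the old Slave (port 9989)
   - Unpack the artifact into /home/server/backend/digital_card_backend/slave/
   - Start the new Slave
   - Pass the health check (waits up to 120 seconds)
   [Traffic stays on Master 9988 throughout; service is not interrupted]
   ↓
4. Wait 3 seconds
   ↓
5. Deploy the Master node
   - Stop the old Master (port 9988)
   [Requests to the Master fail; Nginx retries them on Slave 9989 and, once the Master is marked unavailable, sends traffic there directly]
   - Unpack the artifact into /home/server/backend/digital_card_backend/master/
   - Start the new Master
   - Pass the health check (waits up to 120 seconds)
   [Once the Master accepts requests again, Nginx routes traffic back to Master 9988]
   ↓
6. Deployment complete; the Slave keeps running as the backup node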
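
One way to check that the sequence above is invisible to clients is to poll the health endpoint through Nginx while the pipeline runs and count the failures. The following is a minimal sketch: the function names `probe_loop` and `summarize_codes` are hypothetical, and the public URL used in the usage note is an assumption derived from the /api/ location and the scripts' health check path.

```shell
#!/bin/bash
# Hypothetical probe: request the URL once per second, <count> times, and
# print one HTTP status code per line (curl prints 000 on no response).
probe_loop() {
    local url=$1 count=$2 i
    for ((i = 0; i < count; i++)); do
        curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$url"
        sleep 1
    done
}

# Reads status codes on stdin and prints "total=<n> failed=<m>", where a
# failure is any code other than 200.
summarize_codes() {
    local total=0 failed=0 code
    while read -r code; do
        total=$((total + 1))
        [ "$code" = "200" ] || failed=$((failed + 1))
    done
    echo "total=${total} failed=${failed}"
}
```

Started just before the pipeline, `probe_loop https://epay.twenhub.com/api/noauth/heart 300 | summarize_codes` covers roughly five minutes of deployment. A result with failed=0 indicates that the switchover was not visible on this endpoint.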

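Nginx automatic switchover logic

  • Master healthy: 100% of traffic goes to the Master (9988)
  • Master down: 100% of traffic goes to the Slave (9989, the backup server)
  • Master recovered: traffic returns to the Master automatically
  • Detection: passive. Nginx runs no separate probe in this setup and learns of a failure from real requests: a connection attempt that fails or exceeds the 1-second connect timeout counts as a failure, and the request is retried on the Slave

[Figure: Nginx seamless switchover]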

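Deployment triggers

Automatic trigger

  • A push to the master branch triggers the pipeline through the webhook
  • Yunxiao then runs: build → deploy Slave → deploy Master

Manual trigger

  • Click the "Run" button in the Yunxiao console
  • Choose the branch or commit to deploy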

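Deployment verification

Viewing deployment logs

  • Yunxiao console → Run history → click the relevant run record
  • Open the deployment details of the "Slave node" and "Master node" stages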
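
Besides the pipeline logs, each node can be checked directly on the server, bypassing Nginx. The following is a minimal sketch that reuses the health check URL from the deployment scripts; the function name `node_status` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: query one node's health endpoint on localhost and
# print "<port> up" for HTTP 200, "<port> down" for anything else.
node_status() {
    local port=$1 code
    code=$(curl -s -o /dev/null --connect-timeout 2 -w '%{http_code}' \
        "http://127.0.0.1:${port}/api/noauth/heart")
    if [ "$code" = "200" ]; then
        echo "$port up"
    else
        echo "$port down"
    fi
}
```

After a deployment, `node_status 9988; node_status 9989` should report both nodes as up. If one is down, the startup output is in master.log or slave.log under the node's directory.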