理解TCP之Keepalive

理解Keepalive(1)

大家都听过keepalive,但是其实对于keepalive这个词还是很晦涩的,至少我一直都只知道一个大概,直到之前排查线上一些问题,发现keepalive还是有很多玄机的。其实keepalive有两种,一种是TCP层的keepalive,另一种是HTTP层的Keep-Alive。这篇文章先说说tcp层的keepalive

tcp keepalive

设想有一种场景:A和B两边通过三次握手建立好TCP连接,然后突然间B就宕机了,之后时间内B再也没有起来。如果B宕机后A和B一直没有数据通信的需求,A就永远都发现不了B已经挂了,那么A的内核里还维护着一份关于A&B之间TCP连接的信息,浪费系统资源。于是在TCP层面引入了keepalive的机制,A会定期给B发空的数据包,通俗讲就是心跳包,一旦发现到B的网络不通就关闭连接。这一点在LVS内尤为明显,因为LVS维护着两边大量的连接状态信息,一旦超时就需要释放连接。

Linux内核对于tcp keepalive的调整主要有以下三个参数

1. tcp_keepalive_time

 the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

2. tcp_keepalive_intvl

 the interval between subsequential keepalive probes, regardless of what the connection has exchanged in the meantime

3. tcp_keepalive_probes

 the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

Example

$ cat /proc/sys/net/ipv4/tcp_keepalive_time
  7200
$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
  75
$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
  9

当tcp发现有tcp_keepalive_time(7200)秒未收到对端数据后,开始以间隔tcp_keepalive_intvl(75)秒的频率发送的空心跳包,如果连续tcp_keepalive_probes(9)次以上未响应代码对端已经down了,close连接

在socket编程时候,可以调用setsockopt指定不同的宏来更改上面几个参数

TCP_KEEPCNT: tcp_keepalive_probes

TCP_KEEPIDLE: tcp_keepalive_time

TCP_KEEPINTVL: tcp_keepalive_intvl

Nginx配置tcp keepalive

Nginx对于keepalive的配置有一大堆,大伙每次看都迷茫了,其实Nginx涉及到tcp层面的keepalive只有一个:so_keepalive。它属于listen指令的配置参数,具体配置

so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]

this parameter (1.1.11) configures the “TCP keepalive” behavior for the listening socket. If this parameter is omitted then the operating system’s settings will be in effect for the socket. If it is set to the value “on”, the SO_KEEPALIVE option is turned on for the socket. If it is set to the value “off”, the SO_KEEPALIVE option is turned off for the socket. Some operating systems support setting of TCP keepalive parameters on a per-socket basis using the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT socket options. On such systems (currently, Linux 2.4+, NetBSD 5+, and FreeBSD 9.0-STABLE), they can be configured using the keepidle, keepintvl, and keepcnt parameters. One or two parameters may be omitted, in which case the system default setting for the corresponding socket option will be in effect.

  • Example
so_keepalive=30m::10
    will set the idle timeout (TCP_KEEPIDLE) to 30 minutes, leave the probe interval (TCP_KEEPINTVL) at its system default, and set the probes count (TCP_KEEPCNT) to 10 probes.

在Nginx的代码里可以看到

./src/http/ngx_http_core_module.c

static ngx_command_t  ngx_http_core_commands[] = {
...
    // listen 指令解析 -->> call ngx_http_core_listen()
    { ngx_string("listen"),
      NGX_HTTP_SRV_CONF|NGX_CONF_1MORE,
      ngx_http_core_listen,
      NGX_HTTP_SRV_CONF_OFFSET,
      0,
      NULL },
...
}

static char *
ngx_http_core_listen(ngx_conf_t *cf, ngx_command_t *cmd, void *conf){

...
        // 下面就是 so_keepalive 后面的参数解析
        if (ngx_strncmp(value[n].data, "so_keepalive=", 13) == 0) {

            if (ngx_strcmp(&value[n].data[13], "on") == 0) {
                lsopt.so_keepalive = 1;

            } else if (ngx_strcmp(&value[n].data[13], "off") == 0) {
                lsopt.so_keepalive = 2;

            } else {
            // 自定义系统keepalive的相关设置
            ...

        }

    if (ngx_http_add_listen(cf, cscf, &lsopt) == NGX_OK) {
        return NGX_CONF_OK;
    }

}

./src/core/ngx_connection.c


        if (ls[i].keepidle) {
            value = ls[i].keepidle;
            // 设置 tcp_keepalive_time 
            if (setsockopt(ls[i].fd, IPPROTO_TCP, TCP_KEEPIDLE,
                           (const void *) &value, sizeof(int))
                == -1)
            {
                ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_socket_errno,
                              "setsockopt(TCP_KEEPIDLE, %d) %V failed, ignored",
                              value, &ls[i].addr_text);
            }
        }

        if (ls[i].keepintvl) {
            value = ls[i].keepintvl;
            // 设置 tcp_keepalive_intvl
            if (setsockopt(ls[i].fd, IPPROTO_TCP, TCP_KEEPINTVL,
                           (const void *) &value, sizeof(int))
                == -1)
            {
                ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_socket_errno,
                             "setsockopt(TCP_KEEPINTVL, %d) %V failed, ignored",
                             value, &ls[i].addr_text);
            }
        }

        if (ls[i].keepcnt) {
            // 设置 tcp_keepalive_intvl
            if (setsockopt(ls[i].fd, IPPROTO_TCP, TCP_KEEPCNT,
                           (const void *) &ls[i].keepcnt, sizeof(int))
                == -1)
            {
                ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_socket_errno,
                              "setsockopt(TCP_KEEPCNT, %d) %V failed, ignored",
                              ls[i].keepcnt, &ls[i].addr_text);
            }
        }

总结

这篇文章说了TCP层面的keepalive相关知识以及Nginx的支持tcp keepalive的配置。tcp层面的keepalive存在更多意义上是为了检测两端连接是否正常,注重点是在于连接的本身!要和HTTP层面的keepaplive区分开来,明白这点很重要。

标签:none