
Nginx load balancing for high-volume traffic

For the past three weeks we have been testing Nginx as a load balancer. Currently we cannot handle more than 1000 requests/sec and about 18K active connections. When we reach those numbers, Nginx starts hanging and returning timeout codes. The only way to get a response again is to drastically reduce the number of connections.

I should note that my servers can and do handle this traffic every day; we currently balance it with simple round-robin DNS.

We are using a dedicated server with the following hardware:

  • Intel Xeon E5620 CPU
  • 16GB RAM
  • 2TB SATA disk
  • 1 Gb/s connection
  • OS: CentOS 5.8

We need Nginx to load balance 7 backend servers running Tomcat 6, handling more than 2000 requests/sec and serving both HTTP and HTTPS requests.

While running Nginx, CPU usage is around 15% and RAM usage around 100MB.

My questions are:

  1. Has anyone tried to load balance this kind of traffic with nginx?
  2. Do you think nginx can handle such traffic?
  3. Do you have any idea what could be causing the hangs?
  4. Am I missing something in my configuration?

Below are my configuration files:

nginx.conf:

user nginx; 
worker_processes 10; 

worker_rlimit_nofile 200000; 

error_log /var/log/nginx/error.log warn; 
pid  /var/run/nginx.pid; 


events { 
    worker_connections 10000; 
    use epoll; 
    multi_accept on; 
} 


http { 
    include  /etc/nginx/mime.types; 
    default_type application/octet-stream; 

    log_format main '$remote_addr - $remote_user [$time_local] "$request" ' 
         '$status $body_bytes_sent "$http_referer" ' 
         '"$http_user_agent" "$http_x_forwarded_for"'; 

    #access_log /var/log/nginx/access.log main; 
    access_log off; 

    sendfile  on; 
    tcp_nopush  on; 

    keepalive_timeout 65; 
    reset_timedout_connection on; 

    gzip on; 
    gzip_comp_level 1; 
    include /etc/nginx/conf.d/*.conf; 
} 

servers.conf:

#Set the upstream (servers to load balance) 
#HTTP stream 
upstream adsbar { 
    least_conn; 
    server xx.xx.xx.34 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.36 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.37 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.39 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.40 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.42 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.43 max_fails=2 fail_timeout=15s; 
}  

#HTTPS stream 
upstream adsbar-ssl { 
    least_conn; 
    server xx.xx.xx.34:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.36:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.37:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.39:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.40:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.42:443 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.43:443 max_fails=2 fail_timeout=15s; 
} 

#HTTP 
server { 
    listen xxx.xxx.xxx.xxx:8080; 
    server_name www.mycompany.com; 
    location / { 
     proxy_set_header Host $host; 
     # So the original HTTP Host header is preserved 
     proxy_set_header X-Real-IP $remote_addr; 
     # The IP address of the client (which might be a proxy itself) 
     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; 
     proxy_pass http://adsbar; 
    } 
} 

#HTTPS 
server { 
    listen xxx.xxx.xxx.xxx:8443; 
    server_name www.mycompany.com; 
    ssl on; 
    ssl_certificate /etc/pki/tls/certs/mycompany.crt; 
    # Path to an SSL certificate; 
    ssl_certificate_key /etc/pki/tls/private/mycompany.key; 
    # Path to the key for the SSL certificate; 
    location / { 
     proxy_set_header Host $host; 
     # So the original HTTP Host header is preserved 
     proxy_set_header X-Real-IP $remote_addr; 
     # The IP address of the client (which might be a proxy itself) 
     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; 
     proxy_pass https://adsbar-ssl; 
    } 
} 

server { 
    listen xxx.xxx.xxx.xxx:61709; 
    location /nginx_status { 
     stub_status on; 
     access_log off; 
     allow 127.0.0.1; 
     deny all; 
    } 
} 

sysctl.conf:

# Kernel sysctl configuration file for Red Hat Linux 
# 
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and 
# sysctl.conf(5) for more details. 

# Controls IP packet forwarding 
net.ipv4.ip_forward = 0 

# Controls source route verification 
net.ipv4.conf.default.rp_filter = 1 

# Do not accept source routing 
net.ipv4.conf.default.accept_source_route = 0 

# Controls the System Request debugging functionality of the kernel 
kernel.sysrq = 1 

# Controls whether core dumps will append the PID to the core filename 
# Useful for debugging multi-threaded applications 
kernel.core_uses_pid = 1 

# Controls the use of TCP syncookies 
net.ipv4.tcp_syncookies = 1 

# Controls the maximum size of a message, in bytes 
kernel.msgmnb = 65536 

# Controls the default maxmimum size of a mesage queue 
kernel.msgmax = 65536 

# Controls the maximum shared segment size, in bytes 
kernel.shmmax = 68719476736 

# Controls the maximum number of shared memory segments, in pages 
kernel.shmall = 4294967296 

fs.file-max = 120000 
net.ipv4.ip_conntrack_max = 131072 
net.ipv4.tcp_max_syn_backlog = 8196 
net.ipv4.tcp_fin_timeout = 25 
net.ipv4.tcp_keepalive_time = 3600 
net.ipv4.ip_local_port_range = 1024 65000 
net.ipv4.tcp_rmem = 4096 25165824 25165824 
net.core.rmem_max = 25165824 
net.core.rmem_default = 25165824 
net.ipv4.tcp_wmem = 4096 65536 25165824 
net.core.wmem_max = 25165824 
net.core.wmem_default = 65536 
net.core.optmem_max = 25165824 
net.core.netdev_max_backlog = 2500 
net.ipv4.tcp_tw_recycle = 1 
net.ipv4.tcp_tw_reuse = 1 

Any help, guidance or ideas would be greatly appreciated.

Answers

Answer (3 votes)

nginx should definitely be able to handle more than 1000 req/sec (when playing around with JMeter on my cheap laptop I got about 2800 req/s out of nginx, using one and a half of its two cores).

As far as I can tell you are already using epoll, which is the best choice on current Linux kernels. (Side note: you can also set access_log to buffered mode with a large buffer so that it is written out only after every x KB; that avoids constantly hammering the disk with I/O while still keeping the logs for analysis.)
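For illustration, a minimal sketch of what that buffered access_log could look like (the values are my own examples, not from the answer; the buffer= parameter is standard, and flush= needs a reasonably recent nginx):

# Buffer access-log writes and flush them to disk in larger chunks,
# instead of doing one small write per request.
access_log /var/log/nginx/access.log main buffer=64k flush=5s;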

My understanding is that to get maximum nginx performance you normally set worker_processes equal to the number of cores/CPUs, and then raise worker_connections to allow more concurrent connections (along with the open-file limit). Yet in the data you posted you have a quad-core CPU with 10 worker processes, each allowed 10k connections. On the nginx side I would try something like this:

worker_processes 4; 
worker_rlimit_nofile 999999; 
events { 
    worker_connections 32768; 
    use epoll; 
    multi_accept on; 
} 
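As a rough sanity check on those numbers (my own back-of-the-envelope estimate, not part of the answer; nginx counts upstream connections against worker_connections too, so a proxied request typically occupies two connection slots):

# 4 workers x 32768 worker_connections = 131072 connection slots in total.
# A reverse-proxied request usually holds one client-side and one
# upstream-side connection, so the practical ceiling is roughly:
#   131072 / 2 = 65536 concurrent proxied requests
# comfortably above the ~18K active connections mentioned in the question.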

On the kernel side I would tune the TCP read and write buffers differently: you want a small minimum, a small default and a large maximum.

You have already raised the ephemeral port range.

I would raise the open-file limit further, since you will have a lot of open sockets.

Which gives the following lines to add/change in your /etc/sysctl.conf:

net.ipv4.tcp_rmem = 4096 4096 25165824         
net.ipv4.tcp_wmem = 4096 4096 25165824 
fs.file-max=999999 
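On top of fs.file-max, the per-process descriptor limit of the user running nginx also has to allow that many open sockets (the next answer mentions doing this with 'ulimit -n' or /etc/security/limits.conf). A sketch of the limits.conf entries, assuming the workers run as the nginx user and using an illustrative value:

# /etc/security/limits.conf
nginx soft nofile 200000
nginx hard nofile 200000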

Hope these additions/changes help.

Answer (18 votes)

Here are some good references:

http://dak1n1.com/blog/12-nginx-performance-tuning

Server Fault (nginx tuning on Linux): https://serverfault.com/questions/221292/tips-for-maximizing-nginx-requests-sec

# This number should be, at maximum, the number of CPU cores on your system. 
# (since nginx doesn't benefit from more than one worker per CPU.) 
worker_processes 24; 

# Number of file descriptors used for Nginx. This is set in the OS with 'ulimit -n 200000' 
# or using /etc/security/limits.conf 
worker_rlimit_nofile 200000; 


# only log critical errors 
error_log /var/log/nginx/error.log crit; 


# Determines how many clients will be served by each worker process. 
# (Max clients = worker_connections * worker_processes) 
# "Max clients" is also limited by the number of socket connections available on the system (~64k) 
worker_connections 4000; 


# essential for linux, optimized to serve many clients with each thread 
use epoll; 


# Accept as many connections as possible, after nginx gets notification about a new connection. 
# May flood worker_connections, if that option is set too low. 
multi_accept on; 


# Caches information about open FDs, frequently accessed files. 
# Changing this setting, in my environment, brought performance up from 560k req/sec, to 904k req/sec. 
# I recommend using some variant of these options, though not the specific values listed below. 
open_file_cache max=200000 inactive=20s; 
open_file_cache_valid 30s; 
open_file_cache_min_uses 2; 
open_file_cache_errors on; 


# Buffer log writes to speed up IO, or disable them altogether 
#access_log /var/log/nginx/access.log main buffer=16k; 
access_log off; 


# Sendfile copies data between one FD and other from within the kernel. 
# More efficient than read() + write(), since that requires transferring data to and from user space. 
sendfile on; 


# Tcp_nopush causes nginx to attempt to send its HTTP response headers in one packet, 
# instead of using partial frames. This is useful for prepending headers before calling sendfile, 
# or for throughput optimization. 
tcp_nopush on; 


# don't buffer data-sends (disable Nagle algorithm). Good for sending frequent small bursts of data in real time. 
tcp_nodelay on; 


# Timeout for keep-alive connections. Server will close connections after this time. 
keepalive_timeout 30; 


# Number of requests a client can make over the keep-alive connection. This is set high for testing. 
keepalive_requests 100000; 


# allow the server to close the connection after a client stops responding. Frees up socket-associated memory. 
reset_timedout_connection on; 


# send the client a "request timed out" if the body is not loaded by this time. Default 60. 
client_body_timeout 10; 


# If the client stops reading data, free up the stale client connection after this much time. Default 60. 
send_timeout 2; 


# Compression. Reduces the amount of data that needs to be transferred over the network 
gzip on; 
gzip_min_length 10240; 
gzip_proxied expired no-cache no-store private auth; 
gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml; 
gzip_disable "MSIE [1-6]\."; 

And for more info, from the dak1n1 link, a very well-documented sysctl.conf for system tuning:

# Increase system IP port limits to allow for more connections 

net.ipv4.ip_local_port_range = 2000 65000 


net.ipv4.tcp_window_scaling = 1 


# number of packets to keep in backlog before the kernel starts dropping them 
net.ipv4.tcp_max_syn_backlog = 3240000 


# increase socket listen backlog 
net.core.somaxconn = 3240000 
net.ipv4.tcp_max_tw_buckets = 1440000 


# Increase TCP buffer sizes 
net.core.rmem_default = 8388608 
net.core.rmem_max = 16777216 
net.core.wmem_max = 16777216 
net.ipv4.tcp_rmem = 4096 87380 16777216 
net.ipv4.tcp_wmem = 4096 65536 16777216 
net.ipv4.tcp_congestion_control = cubic 
Comment: Are you making these changes on the load-balancer server, on the backend servers, or on just one of them? –

Comment: Please elaborate. Every nginx server can benefit from these tweaks if it handles a large volume of traffic. – chrislovecnm

Comment: I assume my database load-balancing server (pgpool, not an nginx server) should also get these settings, given that a database connection is used for every request. The connections between pgpool and postgres, by contrast, would not need them, since pgpool keeps persistent connections to postgres and no new TCP connection is set up for each database request. Does that sound right? –

Answer (2 votes)

I found that the least-connections algorithm was causing problems. I switched to

hash $remote_addr consistent; 

and found the service responded faster.
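For context, a sketch of how that directive would slot into the question's upstream block (the hash directive with the consistent parameter needs a reasonably recent nginx; the addresses are the placeholders from the question):

upstream adsbar { 
    # consistent (ketama-style) hashing keyed on the client address,
    # replacing least_conn; requests from one client stick to one backend
    hash $remote_addr consistent; 
    server xx.xx.xx.34 max_fails=2 fail_timeout=15s; 
    server xx.xx.xx.36 max_fails=2 fail_timeout=15s; 
    # ... remaining backends as in the question 
}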