Notes on a Linux Performance Investigation

We run six servers on Tencent Cloud. One day I happened to log in to one of them and ran vmstat to check CPU and memory usage:

[root@VM_26_210_centos ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229160 399080 3666708    0    0     0    10    0    0  1  0 99  0  0
[root@VM_26_210_centos ~]# vmstat 2 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 230096 399080 3666820    0    0     0    10    0    0  1  0 99  0  0
[root@VM_26_210_centos ~]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229880 399080 3666840    0    0     0    10    0    0  1  0 99  0  0
 0  0  41572 229748 399080 3666840    0    0     0    28  791 1221  1  0 99  0  0
 0  0  41572 229616 399080 3666840    0    0     0     0  895 1305  1  1 98  0  0
 0  0  41572 229368 399080 3666840    0    0     0  4542  801 1294  0  0 98  1  0
 0  0  41572 229376 399080 3666848    0    0     0    20  811 1251  1  1 99  0  0
 0  0  41572 229384 399080 3666848    0    0     0     0  745 1206  0  1 99  0  0
 0  0  41572 229376 399080 3666848    0    0     0   110  831 1298  1  0 99  0  0
 0  0  41572 229616 399080 3666852    0    0     0     0 1741 2634  2  1 97  0  0
 0  0  41572 229624 399080 3666852    0    0     0     4  769 1255  1  0 99  0  0
 

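That gave me a scare: the server has 4 cores and 8 GB of memory, yet vmstat showed only a little over 200 MB free. My first reading was that the machine was running out of memory.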

Meanwhile, the monitoring page on Tencent Cloud looked like this:

[Screenshot: Tencent Cloud monitoring chart]

Tencent Cloud's monitoring reported that only 50% of memory was in use, which puzzled me. Something had to be off somewhere, so I ran top to check the machine's current state:

[root@VM_26_210_centos ~]# top
top - 13:32:02 up 659 days,  3:06,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 136 total,   1 running, 135 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.9%us,  1.1%sy,  0.0%ni, 97.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8059448k total,  7826428k used,   233020k free,   399080k buffers
Swap:  2097144k total,    41572k used,  2055572k free,  3668692k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                         
22539 root      20   0 8318m 1.8g  14m S  1.7 23.9 452:18.77 java                                                                                                                                                             
20117 root      20   0  7168 6332  660 S  0.7  0.1 179:56.95 sap1002                                                                                                                                                          
  873 root      20   0  246m 5476  812 S  0.3  0.1   6:46.02 rsyslogd                                                                                                                                                         
10618 root      20   0 37868  17m  984 S  0.3  0.2 153:26.54 secu-tcs-agent                                                                                                                                                   
14448 root      20   0 5590m 477m  12m S  0.3  6.1 107:43.34 java                                                                                                                                                             
16980 root      20   0 39016  22m 5576 S  0.3  0.3 293:10.41 sap1009                                                                                                                                                          
17857 root      20   0 4384m 452m  11m S  0.3  5.8 538:17.84 java                                                                                                                                                             
22349 root      20   0 5569m 467m  12m S  0.3  5.9  84:30.18 java                                                                                                                                                             
27931 root      20   0  427m  13m 2084 S  0.3  0.2 325:52.53 barad_agent                                                                                                                                                      
29121 root      20   0 33468  15m 1052 S  0.3  0.2  83:37.27 sap1005                                                                                                                                                          
    1 root      20   0 19356  932  716 S  0.0  0.0   2:21.78 init                                                                                                                                                             
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd                                                                                                                                                         
    3 root      RT   0     0    0    0 S  0.0  0.0   3:29.40 migration/0                                                                                                                                                      
    4 root      20   0     0    0    0 S  0.0  0.0   4:45.83 ksoftirqd/0                                                                                                                                                      
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0                                                                                                                                                      
    6 root      RT   0     0    0    0 S  0.0  0.0   1:13.64 watchdog/0                                                                                                                                                       
    7 root      RT   0     0    0    0 S  0.0  0.0   3:21.08 migration/1                                                                                                                                                      
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1                                                                                                                                                      
    9 root      20   0     0    0    0 S  0.0  0.0   4:10.62 ksoftirqd/1                                                                                                                                                      
   10 root      RT   0     0    0    0 S  0.0  0.0   0:58.95 watchdog/1                                                                                                                                                       
   11 root      RT   0     0    0    0 S  0.0  0.0   3:07.85 migration/2                                                                                                                                                      
   12 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2                                                                                                                                                      
   13 root      20   0     0    0    0 S  0.0  0.0   4:19.19 ksoftirqd/2                                                                                                                                                      
   14 root      RT   0     0    0    0 S  0.0  0.0   1:00.61 watchdog/2                                                                                                                                                       
   15 root      RT   0     0    0    0 S  0.0  0.0   3:06.14 migration/3                                                                                                                                                      
   16 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3                                                                                                                                                      
   17 root      20   0     0    0    0 S  0.0  0.0   5:30.66 ksoftirqd/3                                                                                                                                                      
   18 root      RT   0     0    0    0 S  0.0  0.0   1:00.14 watchdog/3                                                                                                                                                       
   19 root      20   0     0    0    0 S  0.0  0.0  26:36.90 events/0                                                                                                                                                         
   20 root      20   0     0    0    0 S  0.0  0.0  26:37.71 events/1                                                                                                                                                         
   21 root      20   0     0    0    0 S  0.0  0.0  33:52.49 events/2                                                                                                                                                         
   22 root      20   0     0    0    0 S  0.0  0.0  37:57.76 events/3                                                                                                                                                         
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup                                                                                                                                                           
   24 root      20   0     0    0    0 S  0.0  0.0   0:11.72 khelper      

The Mem line still showed only a little over 200 MB free. Next I looked at memory alone:

[root@VM_26_210_centos ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7870       7625        245          0        389       3575
-/+ buffers/cache:       3660       4210
Swap:         2047         40       2007

Now I was sure: the problem had to be in Tencent Cloud's monitoring.

So I called Tencent Cloud. After an afternoon of back and forth, they replied that their memory usage figure does not count buffer and cache.
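Tencent Cloud's answer can be checked against the free -m output above. The Python sketch below simply redoes the arithmetic; the function name is made up for illustration, the figures are copied from the Mem: line, and nothing here claims to be how Tencent Cloud's agent actually computes its number.

```python
def used_excluding_cache(total_mb, used_mb, buffers_mb, cached_mb):
    """Return (MB used by applications, percent of total), not counting buffers/cache."""
    app_used = used_mb - buffers_mb - cached_mb
    return app_used, 100.0 * app_used / total_mb


# Figures from the "Mem:" line of the free -m output above.
app_used, pct = used_excluding_cache(total_mb=7870, used_mb=7625,
                                     buffers_mb=389, cached_mb=3575)
print(f"{app_used} MB used by applications, {pct:.1f}% of total")
```

This gives 3661 MB, about 46.5% of the total, which is in line with the roughly 50% on the monitoring chart. The 1 MB difference from the 3660 that free prints on its -/+ buffers/cache line is presumably rounding.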

So what exactly are buffer and cache in vmstat?
Here I quote directly from this blog post: http://www.cnblogs.com/chenshoubiao/p/4796664.html

A buffer is something that has yet to be "written" to disk.
A cache is something that has been "read" from the disk and stored for later use.
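In other words, buffer holds data that is waiting to be written out to disk (a block device), while cache holds data that has been read in from disk. Both exist to improve I/O performance and are managed by the OS, and the kernel can reclaim most of this memory when applications need it.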

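So in the vmstat output above, the buffers awaiting write-out come to roughly 390 MB (buff = 399080 KB), and the data cached from disk reads comes to about 3.5 GB (cache = 3666840 KB).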

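That means the memory actually in use is only about 4 GB (free -m reports 3660 MB once buffers and cache are subtracted). Adding up the RES column in top gives about 3.5 GB.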
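The RES total can be approximated with a few lines of Python. This is a minimal sketch that assumes the usual top convention (a bare number is KiB, an m suffix is MiB, a g suffix is GiB); the helper name res_to_mib is hypothetical, and the list covers only the non-zero rows visible in the listing above.

```python
def res_to_mib(value):
    """Convert a top RES field to MiB: a bare number is KiB, 'm' is MiB, 'g' is GiB."""
    units = {"m": 1.0, "g": 1024.0}
    suffix = value[-1].lower()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value) / 1024.0


# Non-zero RES values from the rows visible in the top listing above.
visible_res = ["1.8g", "6332", "5476", "17m", "477m", "22m",
               "452m", "467m", "13m", "15m", "932"]
total_mib = sum(res_to_mib(v) for v in visible_res)
print(f"{total_mib:.0f} MiB = {total_mib / 1024:.2f} GiB")
```

For the visible rows this comes to a bit over 3.2 GiB. The listing is cut off after the first screen, so the 3.5 GB figure presumably includes processes further down. RES also counts shared pages once per process, so the sum is only a rough estimate.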

At this point I had two concerns:

  • 1. Why is the cache so large, and what effect does it have on performance?
  • 2. With only a little over 200 MB free, is JVM performance affected?

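Judging from vmstat, both si and so are normal. si is the amount of memory swapped in from disk per second; a value above 0 suggests that physical memory is running short or that something is leaking memory, in which case the memory-hungry process has to be tracked down and dealt with. so is the amount of memory swapped out to disk per second, and a value above 0 means the same thing. Both are 0 on this machine, so memory is ample and no paging to swap is taking place. This matters because the JVM has to scan the heap during garbage collection, and if parts of the heap have been swapped out, GC performance drops sharply.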
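
The si/so check can also be scripted. The following is a minimal Python sketch, assuming vmstat output in the same column layout as above; the function name swap_activity is hypothetical and the sample rows are copied from the earlier vmstat 2 run.

```python
def swap_activity(vmstat_text):
    """Return (max si, max so) over the data rows of vmstat output."""
    si_idx = so_idx = None
    max_si = max_so = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Column-name header: remember where si and so sit.
            si_idx, so_idx = fields.index("si"), fields.index("so")
        elif si_idx is not None and fields and fields[0].isdigit():
            max_si = max(max_si, int(fields[si_idx]))
            max_so = max(max_so, int(fields[so_idx]))
    return max_si, max_so


sample = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229880 399080 3666840    0    0     0    10    0    0  1  0 99  0  0
 0  0  41572 229748 399080 3666840    0    0     0    28  791 1221  1  0 99  0  0
 0  0  41572 229368 399080 3666840    0    0     0  4542  801 1294  0  0 98  1  0
"""
max_si, max_so = swap_activity(sample)
print("swapping" if (max_si or max_so) else "no swap-in/swap-out observed")
```

Keep in mind that the first data row vmstat prints is an average since boot, so the later rows say more about what is happening right now.
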
For comparison, I printed the GC statistics every two seconds with jstat, and garbage collection looks normal as well:

[root@VM_26_210_centos ~]# jstat -gccause  22539 2000 
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC                 
 87.12   0.00  91.04  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  91.39  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.45  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.57  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.58  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.65  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.83  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.84  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.84  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.86  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.92  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.07  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.08  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.09  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.58  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.65  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  94.36  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  94.37  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
  0.00  84.94  40.97  57.69  95.37  91.61  16913 1148.072     5    1.371 1149.443 Allocation Failure   No GC        
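
To put numbers on "normal", the last two jstat rows can be compared. This is a rough Python sketch rather than a general jstat parser: it assumes the -gccause column order shown above, uses a hypothetical helper parse_row, and expects at least one young GC between the two samples.

```python
NUMERIC_COLUMNS = ["S0", "S1", "E", "O", "M", "CCS",
                   "YGC", "YGCT", "FGC", "FGCT", "GCT"]


def parse_row(row):
    """Parse the numeric columns of one jstat -gccause row into a dict of floats."""
    # The trailing LGCC/GCC text columns contain spaces and are ignored here.
    values = row.split()[:len(NUMERIC_COLUMNS)]
    return dict(zip(NUMERIC_COLUMNS, map(float, values)))


# The last two samples from the jstat output above.
before = parse_row("87.12 0.00 94.37 57.69 95.37 91.61 16912 1148.004 5 1.371 1149.375 Allocation Failure No GC")
after = parse_row("0.00 84.94 40.97 57.69 95.37 91.61 16913 1148.072 5 1.371 1149.443 Allocation Failure No GC")

young_gcs = int(after["YGC"] - before["YGC"])
pause_ms = (after["YGCT"] - before["YGCT"]) * 1000 / young_gcs
avg_ms = after["YGCT"] * 1000 / after["YGC"]
full_gcs = int(after["FGC"] - before["FGC"])
print(f"{young_gcs} young GC taking {pause_ms:.0f} ms; lifetime average {avg_ms:.1f} ms")
print(f"full GCs in this window: {full_gcs}")
```

The one young collection in the window took about 68 ms, which matches the lifetime average of roughly 68 ms per young GC, and no full GC ran. The drop in E from 94.37 to 40.97 and the survivor data moving from S0 to S1 are what a young collection looks like in this output.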

http://www.cnblogs.com/kevingrace/p/5991604.html

http://blog.sina.com.cn/s/blog_9c6f23fb0102x1fg.html

From the operating system's point of view, which factors affect JVM performance?

1. Paging (swapping)
2. Context switching
