Deadlock caused by misusing ThreadPoolExecutor

Background

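  • At 12:08 a.m. on October 2 we received an alert: every request was failing and the service was completely unavailable.
  • The application tier consists of four Resin servers, load-balanced by four nginx instances in front of them.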

Server symptoms and recovery steps

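  • Logged in to a server and looked at the Resin process. At first glance nothing was abnormal and resource usage was normal. The log contained output unrelated to business logic (some scheduled tasks), but no business-logic output.
    • This indicates the Resin server was not processing (new) user requests.
  • Restarted that Resin and watched the log: Resin started processing business requests again and had basically recovered.
    • This indicates that a restart resolves the problem.
  • Went on to restart the remaining three Resin instances one by one, taking a jstack before restarting the last one so that the cause could be analyzed.
    • jstack gives a good picture of everything the process is currently doing and helps locate the problem effectively. It is also quick, usually finishing within 5 s.
    • A more complete way to capture process information is a heap dump, but it takes longer and is less direct to analyze than a jstack (though a jstack can also be obtained from the dump).
    • Restarting a misbehaving process without taking a jstack may well throw away the only chance to locate the problem. At this point we were fairly sure the cause was inside the Resin process, which made taking a jstack all the more necessary.
    • The jstack was taken only when restarting the fourth Resin because restoring the online service comes first, so no time should be spent on anything else before restarting the first three. Once those three are working normally they can carry the online load, so a few quick but important operations can be done on the fourth before restarting it.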
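  • Observed the service: it had basically recovered.
  • Read the jstack file, looking for clues.
  • Suddenly the alert fired again: all four Resin services were back in the original failure state and every request was failing. Restarted the first two Resin instances again.
  • Meanwhile checked the nginx access log: every request was a 502 Bad Gateway or a timeout.
    • This indicates something was wrong with request forwarding between nginx and Resin. Our guess was that requests were either not reaching Resin or Resin was refusing to execute them outright.
  • With no specific cause identified, we tried restarting the four nginx instances one by one, then restarted the remaining Resin instances.
  • The service recovered again, and after more than an hour of observation the problem had not recurred.
  • At this point we considered the incident recovered.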

Open questions

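  • Why did the Resin processes fail again after recovering?
  • The last release was on the afternoon of September 29. If the business implementation was at fault, why had nothing gone wrong in between?
  • What did restarting nginx actually change? Is there some mysterious logical coupling between nginx and Resin?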

Locating and analyzing the problem

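  • Analyzed the jstack taken while the service was failing (using stackAnalysis, a stack analysis script I wrote earlier that is available on the cluster; just run stackAnalysis on the command line). The summary for Resin's business threads is as follows: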

      name=resin-port-*****-*****
      threadsNum=257
      hasUserThread=true
      blockNum=0
      execNum=0
      -------------------------------------------- threads groups with different state, size=4, start --------------------------------------------
      219 threads at (state = WAITING,
      locks_waiting = [0x00000006ed230e60, 0x00000006f0b19db0, ..., 0x00000006ec643d40, 0x00000006ed09ca38, 0x00000006eb0e8600, 0x00000006f1cf2118]) :
      "resin-port-*-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getFilteredUserId(HomeService.java:933)
          at com.xxxxxx.productfront.service.HomeService.requestLatestTrends(HomeService.java:513)
          at com.xxxxxx.productfront.service.HomeService.getNewHomeDataFlow(HomeService.java:140)
          at com.xxxxxx.productfront.web.restful.HomeApiController.newInfo(HomeApiController.java:204)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$FastClassByCGLIB$$5502b6f1.invoke(<generated>)
          at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
          at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
          at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:93)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:622)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$EnhancerByCGLIB$$80fdc5c.newInfo(<generated>)
          at sun.reflect.GeneratedMethodAccessor1384.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:436)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:424)
          at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
          at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
          at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
          at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:159)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:97)
          at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:109)
          at com.xxxxxx.productfront.web.filter.FlagFilter.doFilter(FlagFilter.java:56)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.xxxxxx.productfront.web.filter.UserInfoFilter.doFilter(UserInfoFilter.java:37)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.caucho.server.webapp.WebAppListenerFilterChain.doFilter(WebAppListenerFilterChain.java:114)
          at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:156)
          at com.caucho.server.webapp.AccessLogFilterChain.doFilter(AccessLogFilterChain.java:95)
          at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:289)
          at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:838)
          at com.caucho.network.listen.TcpSocketLink.dispatchRequest(TcpSocketLink.java:1341)
          at com.caucho.network.listen.TcpSocketLink.handleRequest(TcpSocketLink.java:1297)
          at com.caucho.network.listen.TcpSocketLink.handleRequestsImpl(TcpSocketLink.java:1281)
          at com.caucho.network.listen.TcpSocketLink.handleRequests(TcpSocketLink.java:1189)
          at com.caucho.network.listen.TcpSocketLink.handleAcceptTaskImpl(TcpSocketLink.java:985)
          at com.caucho.network.listen.ConnectionTask.runThread(ConnectionTask.java:117)
          at com.caucho.network.listen.ConnectionTask.run(ConnectionTask.java:93)
          at com.caucho.network.listen.SocketLinkThreadLauncher.handleTasks(SocketLinkThreadLauncher.java:169)
          at com.caucho.network.listen.TcpSocketAcceptThread.run(TcpSocketAcceptThread.java:61)
          at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)
          at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)
    
      24 threads at (state = WAITING,
      locks_waiting = [0x00000006e56c9970, 0x00000006e56c6fd8, 0x00000006e28b14a8, 0x00000006e61dc9a0, 0x00000006e620e9c0, 0x00000006e3bf0648, 0x00000006e56d39e0, 0x00000006e6a051a0, 0x00000006e29fd0f0, 0x00000006e3b03728, 0x00000006e613b090, 0x00000006e5a67fc8, 0x00000006e7de7d90, 0x00000006e693a338, 0x00000006e46fc7c0, 0x00000006e28f41a0, 0x00000006e614a990, 0x00000006e61203d0, 0x00000006e45bd5f8, 0x00000006e5a65e90, 0x00000006e611feb0, 0x00000006e51d3b80, 0x00000006e4f23348, 0x00000006e692f150]) :
      "resin-port-*-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getNewHomeDataFlow(HomeService.java:268)
          at com.xxxxxx.productfront.web.restful.HomeApiController.newInfo(HomeApiController.java:204)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$FastClassByCGLIB$$5502b6f1.invoke(<generated>)
          at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
          at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
          at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:93)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:622)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$EnhancerByCGLIB$$80fdc5c.newInfo(<generated>)
          at sun.reflect.GeneratedMethodAccessor1384.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:436)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:424)
          at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
          at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
          at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
          at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:159)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:97)
          at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:109)
          at com.xxxxxx.productfront.web.filter.FlagFilter.doFilter(FlagFilter.java:56)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.xxxxxx.productfront.web.filter.UserInfoFilter.doFilter(UserInfoFilter.java:37)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.caucho.server.webapp.WebAppListenerFilterChain.doFilter(WebAppListenerFilterChain.java:114)
          at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:156)
          at com.caucho.server.webapp.AccessLogFilterChain.doFilter(AccessLogFilterChain.java:95)
          at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:289)
          at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:838)
          at com.caucho.network.listen.TcpSocketLink.dispatchRequest(TcpSocketLink.java:1341)
          at com.caucho.network.listen.TcpSocketLink.handleRequest(TcpSocketLink.java:1297)
          at com.caucho.network.listen.TcpSocketLink.handleRequestsImpl(TcpSocketLink.java:1281)
          at com.caucho.network.listen.TcpSocketLink.handleRequests(TcpSocketLink.java:1189)
          at com.caucho.network.listen.TcpSocketLink.handleAcceptTaskImpl(TcpSocketLink.java:985)
          at com.caucho.network.listen.ConnectionTask.runThread(ConnectionTask.java:117)
          at com.caucho.network.listen.ConnectionTask.run(ConnectionTask.java:93)
          at com.caucho.network.listen.SocketLinkThreadLauncher.handleTasks(SocketLinkThreadLauncher.java:169)
          at com.caucho.network.listen.TcpSocketAcceptThread.run(TcpSocketAcceptThread.java:61)
          at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)
          at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)
    
      13 threads at (state = WAITING,
      locks_waiting = [0x00000006eba80d68, 0x00000006eba87ff0, 0x00000006eb013438, 0x00000006eba88088, 0x00000006eb9ea590, 0x00000006eb851ea8, 0x00000006eb851f40, 0x00000006eaecb790, 0x00000006eb851d20, 0x00000006eaecb190, 0x00000006eba8ea88, 0x00000006eb8b3b58, 0x00000006eb8b3a18]) :
      "resin-port-*-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getFilteredUserId(HomeService.java:933)
          at com.xxxxxx.productfront.service.HomeService.requestLatestTrends(HomeService.java:513)
          at com.xxxxxx.productfront.service.HomeService.getNewHomeDataFlow(HomeService.java:140)
          at com.xxxxxx.productfront.web.restful.HomeApiController.newInfo(HomeApiController.java:204)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$FastClassByCGLIB$$5502b6f1.invoke(<generated>)
          at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
          at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
          at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:93)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:622)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$EnhancerByCGLIB$$80fdc5c.newInfo(<generated>)
          at sun.reflect.GeneratedMethodAccessor1384.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:436)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:424)
          at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
          at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
          at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
          at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:159)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:97)
          at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:109)
          at com.xxxxxx.productfront.web.filter.FlagFilter.doFilter(FlagFilter.java:56)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.xxxxxx.productfront.web.filter.UserInfoFilter.doFilter(UserInfoFilter.java:37)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.caucho.server.webapp.WebAppListenerFilterChain.doFilter(WebAppListenerFilterChain.java:114)
          at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:156)
          at com.caucho.server.webapp.AccessLogFilterChain.doFilter(AccessLogFilterChain.java:95)
          at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:289)
          at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:838)
          at com.caucho.network.listen.TcpSocketLink.dispatchRequest(TcpSocketLink.java:1341)
          at com.caucho.network.listen.TcpSocketLink.handleRequest(TcpSocketLink.java:1297)
          at com.caucho.network.listen.TcpSocketLink.handleRequestsImpl(TcpSocketLink.java:1281)
          at com.caucho.network.listen.TcpSocketLink.handleRequests(TcpSocketLink.java:1189)
          at com.caucho.network.listen.TcpSocketLink.handleAcceptTaskImpl(TcpSocketLink.java:988)
          at com.caucho.network.listen.ConnectionTask.runThread(ConnectionTask.java:117)
          at com.caucho.network.listen.ConnectionTask.run(ConnectionTask.java:93)
          at com.caucho.network.listen.SocketLinkThreadLauncher.handleTasks(SocketLinkThreadLauncher.java:169)
          at com.caucho.network.listen.TcpSocketAcceptThread.run(TcpSocketAcceptThread.java:61)
          at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)
          at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)
    
      1 threads at (state = WAITING,
      locks_waiting = [0x00000006e2b747e0]) :
      "resin-port-*-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getNewHomeDataFlow(HomeService.java:270)
          at com.xxxxxx.productfront.web.restful.HomeApiController.newInfo(HomeApiController.java:204)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$FastClassByCGLIB$$5502b6f1.invoke(<generated>)
          at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
          at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
          at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:93)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
          at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
          at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:622)
          at com.xxxxxx.productfront.web.restful.HomeApiController$$EnhancerByCGLIB$$80fdc5c.newInfo(<generated>)
          at sun.reflect.GeneratedMethodAccessor1384.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:176)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:436)
          at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:424)
          at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
          at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
          at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
          at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:159)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:97)
          at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:109)
          at com.xxxxxx.productfront.web.filter.FlagFilter.doFilter(FlagFilter.java:56)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.xxxxxx.productfront.web.filter.UserInfoFilter.doFilter(UserInfoFilter.java:37)
          at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
          at com.caucho.server.webapp.WebAppListenerFilterChain.doFilter(WebAppListenerFilterChain.java:114)
          at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:156)
          at com.caucho.server.webapp.AccessLogFilterChain.doFilter(AccessLogFilterChain.java:95)
          at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:289)
          at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:838)
          at com.caucho.network.listen.TcpSocketLink.dispatchRequest(TcpSocketLink.java:1341)
          at com.caucho.network.listen.TcpSocketLink.handleRequest(TcpSocketLink.java:1297)
          at com.caucho.network.listen.TcpSocketLink.handleRequestsImpl(TcpSocketLink.java:1281)
          at com.caucho.network.listen.TcpSocketLink.handleRequests(TcpSocketLink.java:1189)
          at com.caucho.network.listen.TcpSocketLink.handleAcceptTaskImpl(TcpSocketLink.java:985)
          at com.caucho.network.listen.ConnectionTask.runThread(ConnectionTask.java:117)
          at com.caucho.network.listen.ConnectionTask.run(ConnectionTask.java:93)
          at com.caucho.network.listen.SocketLinkThreadLauncher.handleTasks(SocketLinkThreadLauncher.java:169)
          at com.caucho.network.listen.TcpSocketAcceptThread.run(TcpSocketAcceptThread.java:61)
          at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)
          at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)
    
      ----------------------------------------------- threads groups with different state, size=4, end -------------------------------------------
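
The grouping idea behind such a summary can be sketched in a few lines of Java. This is not the stackAnalysis script itself, which is not shown here. It is a hypothetical helper (`ThreadGroupSummary`, `groupKey` and `summarize` are names made up for illustration) that inspects the threads of the JVM it runs in, rather than parsing a jstack file, and counts them per name prefix and state. It assumes Java 8 or later.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not the author's stackAnalysis script.
class ThreadGroupSummary {

    // Replaces a trailing run of digits and dashes with "-*", so that
    // "home-service-7" and "home-service-12" share the key "home-service-*".
    static String groupKey(String threadName) {
        return threadName.replaceAll("[-\\d]+$", "-*");
    }

    // Counts threads per "group state" key. The state is read when this
    // method runs, not when the snapshot map was taken.
    static Map<String, Integer> summarize(Map<Thread, StackTraceElement[]> snapshot) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : snapshot.keySet()) {
            String key = groupKey(t.getName()) + " " + t.getState();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one line per (group, state) pair for the current JVM.
        summarize(Thread.getAllStackTraces())
            .forEach((key, n) -> System.out.println(n + " threads at " + key));
    }
}
```

A real analysis tool would additionally group by identical stack frames, as the summary above does. The sketch only groups by name and state.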
    
  • Resin's configuration shows that the maximum number of business threads we configured is 256:

      #Throttle the number of active threads for a port
      port_thread_max   : 256
      accept_thread_max : 32
      accept_thread_min : 4
    
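  • The stack summary above also shows 257 business-logic threads (the thread group whose names start with resin-port-) in the WAITING state. In other words, '''every business thread was waiting, so Resin naturally rejected new business requests and the service became unavailable'''.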

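  • The stacks show that every waiting business thread is serving the client's home-page API (at com.xxxxxx.productfront.web.restful.HomeApiController.newInfo(HomeApiController.java:204)) and is calling FutureTask.get() on a Future task. The top of each stack shows that these threads are parked inside AbstractQueuedSynchronizer, the queue-based synchronizer behind FutureTask.get(), and cannot proceed: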

      - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
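
To make the stack reading concrete, here is a minimal sketch of how a thread ends up in this state. It assumes Java 8 or later, and the class and thread names (`ParkedOnFutureDemo`, `demo-waiter`) are invented for the example. A thread calls get() with no timeout on a FutureTask that nobody runs, and the sketch reports the state it settles into.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Illustrative only: shows what an untimed FutureTask.get() looks like from outside.
class ParkedOnFutureDemo {

    // Starts a thread that waits on a task which is never executed and
    // returns the state observed for that thread.
    static Thread.State stateWhileWaiting() {
        final FutureTask<String> neverRun = new FutureTask<String>(() -> "done");
        Thread waiter = new Thread(() -> {
            try {
                neverRun.get(); // no timeout: parks until the task completes
            } catch (InterruptedException | ExecutionException e) {
                // interrupted by the cleanup below; nothing else to do
            }
        }, "demo-waiter");
        waiter.setDaemon(true);
        waiter.start();

        // Poll for up to 5 s until the waiter has actually parked.
        long deadline = System.currentTimeMillis() + 5000;
        try {
            while (waiter.getState() != Thread.State.WAITING
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State observed = waiter.getState();
        waiter.interrupt(); // release the waiter so the demo can exit
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter state: " + stateWhileWaiting());
    }
}
```

The waiter stays in WAITING for as long as the task it depends on has not run, which is the situation all the resin-port threads above are in.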
    
  • This raises a new question: what exactly are these 257 threads waiting for, and how can they hold their slots without doing any work?!

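  • The business code behind HomeApiController.java (fetching the client's home-page data: people nearby, recommended people, latest activity, and so on) uses the future pattern to run uncoupled pieces of logic in parallel for performance. To avoid the cost of creating and destroying threads, the class creates a static ThreadPoolExecutor that runs all of these future tasks:

      // Shared pool for all future tasks: corePoolSize 20, maximumPoolSize 100,
      // 2-minute keep-alive, LinkedBlockingQueue with no capacity, CallerRunsPolicy.
      protected static final ThreadPoolExecutor executor = new ThreadPoolExecutor(
              20, 100, 2, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>(),
              new ThreadFactory() {
                  private AtomicInteger id = new AtomicInteger(0);

                  @Override
                  public Thread newThread(Runnable r) {
                      // Name the workers home-service-N so they are easy to find in a jstack.
                      Thread thread = new Thread(r);
                      thread.setName("home-service-" + id.addAndGet(1));
                      return thread;
                  }
              }, new ThreadPoolExecutor.CallerRunsPolicy());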
    

The constructor's parameters are documented as follows:

ThreadPoolExecutor
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)
Creates a new ThreadPoolExecutor with the given initial parameters.
Parameters:
corePoolSize - the number of threads to keep in the pool, even if they are idle.
maximumPoolSize - the maximum number of threads to allow in the pool.
keepAliveTime - when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating.
unit - the time unit for the keepAliveTime argument.
workQueue - the queue to use for holding tasks before they are executed. This queue will hold only the Runnable tasks submitted by the execute method.
threadFactory - the factory to use when the executor creates a new thread.
handler - the handler to use when execution is blocked because the thread bounds and queue capacities are reached.
Throws:
IllegalArgumentException - if corePoolSize or keepAliveTime less than zero, or if maximumPoolSize less than or equal to zero, or if corePoolSize greater than maximumPoolSize.
NullPointerException - if workQueue or threadFactory or handler are null.
  • Because many threads in the jstack are involved with queues, and to be clear about how the constructor's workQueue is used, we looked at the ThreadPoolExecutor Javadoc:

      Queuing
      Any BlockingQueue may be used to transfer and hold submitted tasks. The use of this queue interacts with pool sizing:
          * If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
          * If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
          * If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
      There are three general strategies for queuing:
          Direct handoffs. 
              A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. 
              Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. 
              This policy avoids lockups when handling sets of requests that might have internal dependencies. 
              Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. 
              This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
          Unbounded queues. 
              Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. 
              Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) 
              This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. 
              While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
          Bounded queues. 
              A bounded queue (for example, an ArrayBlockingQueue) helps prevent resource exhaustion when used with finite maximumPoolSizes, but can be more difficult to tune and control. 
              Queue sizes and maximum pool sizes may be traded off for each other: Using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. 
              If tasks frequently block (for example if they are I/O bound), a system may be able to schedule time for more threads than you otherwise allow. 
              Use of small queues generally requires larger pool sizes, which keeps CPUs busier but may encounter unacceptable scheduling overhead, which also decreases throughput.
    
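  • A LinkedBlockingQueue created without a capacity is an unbounded queue. According to the "Unbounded queues" description, using it makes the maximumPoolSize setting ineffective, i.e. '''the 100 passed to the constructor is meaningless''': the pool will only ever hold corePoolSize threads, which is the 20 we configured. Clearly this is not what we expected when we configured the pool this way.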
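
The effect is easy to reproduce in isolation. The following sketch assumes Java 8 or later and uses a scaled-down pool (corePoolSize 2, maximumPoolSize 10) instead of the real 20/100. The class name `UnboundedQueueDemo` and its `run` method are made up for the example. It submits more blocking tasks than there are core threads and reports how large the pool became and how many tasks were queued.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: a scaled-down version of the pool configuration above.
class UnboundedQueueDemo {

    // Submits `tasks` blocking jobs and returns {largest pool size, queued task count}.
    static int[] run(int core, int max, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 2, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // All core workers are blocked, so the remaining tasks sit in the queue.
            return new int[] { pool.getLargestPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown(); // let the workers finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run(2, 10, 8);
        System.out.println("largest pool size = " + r[0] + ", queued = " + r[1]);
    }
}
```

With 8 tasks, the pool stays at 2 threads and 6 tasks wait in the queue, even though maximumPoolSize is 10. The queue never refuses a task, so the executor never has a reason to create a thread beyond corePoolSize.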

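  • So we have found a problem, and the earlier question is a little clearer: most likely those 257 threads are waiting for the ThreadPoolExecutor to finish their tasks, but since only 20 threads are doing the work, the futureTasks of those 257 threads were put into the LinkedBlockingQueue. They only get a chance to run after the 20 threads finish what they are working on.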

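  • The question now becomes: what are these 20 threads doing? There are only 20 of them, but when the service is healthy they must be very fast (otherwise the problem would have shown up long ago). Why are they now so slow that they block every business thread? '''What exactly are they doing?'''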

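  • Looking back at the stack summary, there is indeed a thread group named home-service-* with exactly 20 threads, which agrees with the analysis above that the maximumPoolSize of 100 has no effect. Here is the state of the threads in this group: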

      name=home-service-**
      threadsNum=20
      hasUserThread=true
      blockNum=0
      execNum=0
      -------------------------------------------- threads groups with different state, size=3, start --------------------------------------------
      10 threads at (state = WAITING,
      locks_waiting = [0x00000006e59d3fe8, 0x00000006e4e848d0, 0x00000006e598b410, 0x00000006e4f90898, 0x00000006e674a510, 0x00000006e4e841b8, 0x00000006e4e845a0, 0x00000006e674a1e0, 0x00000006e4f90fc0, 0x00000006e4f90c90]) :
      "home-service-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getExtendedTrend(HomeService.java:625)
          at com.xxxxxx.productfront.service.HomeService.buildTrend(HomeService.java:561)
          at com.xxxxxx.productfront.service.HomeService.access$000(HomeService.java:69)
          at com.xxxxxx.productfront.service.HomeService$2.call(HomeService.java:148)
          at com.xxxxxx.productfront.service.HomeService$2.call(HomeService.java:143)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)
    
      8 threads at (state = WAITING,
      locks_waiting = [0x00000006e692a260, 0x00000006e5027360, 0x00000006e5027428, 0x00000006e4f912f0, 0x00000006e4f913d8, 0x00000006e598b740, 0x00000006e598b808, 0x00000006e4f90bc8]) :
      "home-service-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getExtendedRecommendInfo(HomeService.java:740)
          at com.xxxxxx.productfront.service.HomeService.buildRecommend(HomeService.java:453)
          at com.xxxxxx.productfront.service.HomeService$3.call(HomeService.java:242)
          at com.xxxxxx.productfront.service.HomeService$3.call(HomeService.java:237)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)
    
      2 threads at (state = WAITING,
      locks_waiting = [0x00000006e4e844e8, 0x00000006e4e84100]) :
      "home-service-*" daemon prio=* tid=******** nid=******** waiting on condition [********]
         java.lang.Thread.State: WAITING (parking)
          at sun.misc.Unsafe.park(Native Method)
          - parking to wait for  <********> (a java.util.concurrent.FutureTask$Sync)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
          at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
          at java.util.concurrent.FutureTask.get(FutureTask.java:83)
          at com.xxxxxx.productfront.service.HomeService.getExtendedNearByInfo(HomeService.java:876)
          at com.xxxxxx.productfront.service.HomeService.buildNearBy(HomeService.java:476)
          at com.xxxxxx.productfront.service.HomeService$4.call(HomeService.java:258)
          at com.xxxxxx.productfront.service.HomeService$4.call(HomeService.java:253)
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
          at java.util.concurrent.FutureTask.run(FutureTask.java:138)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)
    
      ----------------------------------------------- threads groups with different state, size=3, end -------------------------------------------
    
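  • Looking closely at the stack traces, there is something odd: every one of these threads starts from a FutureTask.run, then invokes another FutureTask and is waiting in FutureTask.get().

  • In other words, our code uses a FutureTask inside a FutureTask, and these 20 threads are waiting for the FutureTask one level below them to finish. So why have the lower-level tasks not finished?

  • HomeApiController.java shows that the code does call a FutureTask from inside a FutureTask, and that all FutureTasks in that class (19 kinds in total) are executed by the same threadPoolExecutor defined above.

  • At this point a worry arises: FutureTasks that call FutureTasks, all executed by the same threadPoolExecutor. Is that reliable? Could something like a deadlock occur?

  • Before running an experiment the answer is not certain; it depends on how well ThreadPoolExecutor handles this case. So I first read the ThreadPoolExecutor Javadoc, looking for anything related to deadlock, and sure enough there is this description of direct handoffs: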

      There are three general strategies for queuing:
          Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups (deadlocks) when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
    
  • This tells us that direct handoffs avoid the deadlock a thread pool can run into because of dependencies between its own threads.
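
To illustrate the direct-handoff strategy quoted above, here is a small sketch. The class `DirectHandoffDemo` and its method `runNested` are made up for illustration, and the tasks are trivial stand-ins for the real business logic. Each parent task submits a child to the same pool and waits for it, which is the pattern seen in the stack traces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: nested tasks on one pool that uses direct handoff.
class DirectHandoffDemo {

    /** Runs `parents` parent tasks, each waiting on a child in the same pool. */
    static long runNested(int parents) throws Exception {
        // A SynchronousQueue holds no tasks: a submit is either handed to an idle
        // worker or makes the pool start a new thread (up to maximumPoolSize).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, Integer.MAX_VALUE, 60, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < parents; i++) {
                final long id = i;
                results.add(pool.submit(() -> {
                    // The child can always start, because nothing is ever parked in a queue.
                    Future<Long> child = pool.submit(() -> id * 10);
                    return child.get();
                }));
            }
            long sum = 0;
            for (Future<Long> f : results) {
                sum += f.get(10, TimeUnit.SECONDS);
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runNested(5)); // 100
    }
}
```

The price, as the Javadoc notes, is that the number of threads is no longer bounded.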

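  • The answer is now fairly clear. In short: the business tasks, after taking all the threads in the pool, submit new tasks to the same pool and only release their threads once those tasks complete, yet the newly submitted tasks never get a chance to run!

  • Let's verify that this really happens: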

    ...
       protected static final ThreadPoolExecutor executor = new ThreadPoolExecutor(2, 100, 2, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>(),
                new ThreadFactory() {
                    private AtomicInteger id = new AtomicInteger(0);

                    @Override
                    public Thread newThread(Runnable r) {
                        Thread thread = new Thread(r);
                        thread.setName("home-service-" + id.addAndGet(1));
                        return thread;
                    }
                }, new ThreadPoolExecutor.CallerRunsPolicy());

       public static void main(String[] args) throws InterruptedException, ExecutionException {
            Future<Long> f1 = executor.submit(new Callable<Long>() {
                @Override
                public Long call() throws Exception {
                    Thread.sleep(1000); // delay so that the second-level f3 is submitted only after the first-level f2 has taken up corePoolSize
                    Future<Long> f3 = executor.submit(new Callable<Long>() {
                        @Override
                        public Long call() throws Exception {

                            return -1L;
                        }
                    });
                    System.out.println("f1.f3" + f3.get());
                    return -1L;
                }
            });
            Future<Long> f2 = executor.submit(new Callable<Long>() {
                @Override
                public Long call() throws Exception {
                    Thread.sleep(1000); // delay
                    Future<Long> f4 = executor.submit(new Callable<Long>() {
                        @Override
                        public Long call() throws Exception {

                            return -1L;
                        }
                    });
                    System.out.println("f2.f4" + f4.get());
                    return -1L;
                }
            });
            System.out.println("here");
            System.out.println("f1" + f1.get());
            System.out.println("f2" + f2.get());
        }
    ...

Running the program prints only one line, "here", and the process does not exit. Its jstack:

    ...
    "home-service-2" prio=5 tid=7fe2aa15b800 nid=0x10dcef000 waiting on condition [10dcee000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <7f371ac80> (a java.util.concurrent.FutureTask$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at com.xxxxxx.productfront.service.HomeService$3.call(HomeService.java:136)
        at com.xxxxxx.productfront.service.HomeService$3.call(HomeService.java:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:695)

    "home-service-1" prio=5 tid=7fe2aa0b8800 nid=0x10dbec000 waiting on condition [10dbeb000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <7f36c5800> (a java.util.concurrent.FutureTask$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at com.xxxxxx.productfront.service.HomeService$2.call(HomeService.java:121)
        at com.xxxxxx.productfront.service.HomeService$2.call(HomeService.java:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:695)
    ...
    "main" prio=5 tid=7fe2ab000800 nid=0x1051c9000 waiting on condition [1051c8000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <7f369cab0> (a java.util.concurrent.FutureTask$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at com.xxxxxx.productfront.service.HomeService.main(HomeService.java:141)
    ...

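So this does cause a '''deadlock'''. If the sleep inside f1 and f2 is removed and a delay is placed before f2 is submitted instead, then f1, f2, f3 and f4 all complete successfully.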
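
The experiment above hangs forever, which makes it awkward to rerun. A possible variant that terminates is sketched below; it is an adaptation, not the original test. The class `StarvationProbe` is hypothetical, a `CyclicBarrier` replaces the `Thread.sleep` calls so that both core threads are guaranteed to be occupied before the children are submitted, and the inner `get` uses a timeout so the starvation shows up as a `TimeoutException`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical probe: same pool settings as the experiment (core=2, max=100, unbounded queue).
class StarvationProbe {

    /** Returns how many of the two parent tasks saw their child finish within waitMillis. */
    static int childrenCompleted(long waitMillis) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 100, 2, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
        CyclicBarrier bothParents = new CyclicBarrier(2);
        Callable<Boolean> parent = () -> {
            bothParents.await();                 // both core threads are now occupied
            Future<Long> child = pool.submit(() -> -1L);
            boolean done;
            try {
                child.get(waitMillis, TimeUnit.MILLISECONDS);
                done = true;
            } catch (TimeoutException e) {
                done = false;                    // the child is still sitting in the queue
            }
            bothParents.await();                 // hold the thread until both parents have checked
            return done;
        };
        try {
            Future<Boolean> f1 = pool.submit(parent);
            Future<Boolean> f2 = pool.submit(parent);
            return (f1.get() ? 1 : 0) + (f2.get() ? 1 : 0);
        } finally {
            pool.shutdown();                     // queued children still run, then the workers exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("children completed in time: " + childrenCompleted(200)); // 0
    }
}
```

Neither child can start while both workers are held by the parents, so the count is 0 regardless of how long the timeout is; without the timeout this is exactly the hang seen in the jstack.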

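Conclusion

  • Because tasks in threadPoolExecutor depend on other tasks in the same pool, the business tasks, after taking all the threads in the pool, submit new tasks to it and only release their threads once those tasks complete, while the newly submitted tasks never get a chance to run. The business threads therefore cannot finish, the web server's business threads are exhausted, no new business requests are accepted, and the whole service becomes unavailable.

How to fix it

  • Resolving the deadlock:

    • Option 1: remove the internal dependency within threadPoolExecutor by giving each layer of futures its own threadPoolExecutor. However, the layering is currently messy, and even once it is sorted out it could easily be misused again later.
    • Option 2: set corePoolSize to maximumPoolSize. Not recommended: it uses too many resources.
    • Option 3: make the queue as small as possible (it seems the capacity cannot be 0, so set it to 1) so that maximumPoolSize takes effect, and set ThreadPoolExecutor.CallerRunsPolicy() so that a task rejected by the pool is run locally by the submitting thread.
  • In addition, the ThreadPoolExecutor.CallerRunsPolicy() passed when the ThreadPoolExecutor was originally initialized is also meaningless: with an unbounded queue, a task is never rejected unless the executor has been shut down. See the Javadoc: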
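
Option 1 can be sketched as follows. This is only an illustration under simplifying assumptions: there are just two layers, the class `LayeredPoolsDemo` and the pool names are invented, and `Executors.newFixedThreadPool` is used for brevity (it is also backed by an unbounded LinkedBlockingQueue).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: one executor per layer, so a waiting parent
// never occupies a thread that its own child needs.
class LayeredPoolsDemo {

    static long run(int parents) throws Exception {
        ExecutorService parentPool = Executors.newFixedThreadPool(2); // first-layer tasks only
        ExecutorService childPool = Executors.newFixedThreadPool(2);  // second-layer tasks only
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < parents; i++) {
                final long id = i;
                results.add(parentPool.submit(() -> {
                    // Children never wait on anything in childPool, so they always make progress.
                    Future<Long> child = childPool.submit(() -> id + 1);
                    return child.get();
                }));
            }
            long sum = 0;
            for (Future<Long> f : results) {
                sum += f.get(10, TimeUnit.SECONDS);
            }
            return sum;
        } finally {
            parentPool.shutdown();
            childPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(4)); // 10
    }
}
```

This only stays safe as long as no task ever waits on another task in its own pool, which is the discipline the text warns is easy to break later.
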

      Rejected tasks
      New tasks submitted in method execute(java.lang.Runnable) will be rejected when the Executor has been shut down, and also when the Executor uses finite bounds for both maximum threads and work queue capacity, and is saturated. In either case, the execute method invokes the RejectedExecutionHandler.rejectedExecution(java.lang.Runnable, java.util.concurrent.ThreadPoolExecutor) method of its RejectedExecutionHandler. Four predefined handler policies are provided:
          * In the default ThreadPoolExecutor.AbortPolicy, the handler throws a runtime RejectedExecutionException upon rejection.
          * In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
          * In ThreadPoolExecutor.DiscardPolicy, a task that cannot be executed is simply dropped.
          * In ThreadPoolExecutor.DiscardOldestPolicy, if the executor is not shut down, the task at the head of the work queue is dropped, and then execution is retried (which can fail again, causing this to be repeated.)
      It is possible to define and use other kinds of RejectedExecutionHandler classes. Doing so requires some care especially when policies are designed to work only under particular capacity or queuing policies.
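
The saturation rule in the Javadoc above can be checked with a small counting handler. This is a sketch with made-up names (`RejectionDemo`, `rejections`); the handler here only counts and drops rejected tasks, which is enough to show when the pool consults it at all.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: counts how often the rejection handler is invoked.
class RejectionDemo {

    static int rejections(BlockingQueue<Runnable> queue, int tasks) throws InterruptedException {
        AtomicInteger rejected = new AtomicInteger();
        // Counts and drops the task; a real handler would do something with it.
        RejectedExecutionHandler counting = (task, executor) -> rejected.incrementAndGet();
        ThreadPoolExecutor pool =
                new ThreadPoolExecutor(1, 2, 2, TimeUnit.MINUTES, queue, counting);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep every worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return rejected.get();
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Unbounded queue: every extra task is queued, the handler is never called.
        System.out.println(rejections(new LinkedBlockingQueue<Runnable>(), 5));  // 0
        // Capacity 1, max 2 threads: 1 running + 1 queued + 1 extra thread, the rest rejected.
        System.out.println(rejections(new ArrayBlockingQueue<Runnable>(1), 5));  // 2
    }
}
```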
    
  • All things considered, the better approach is to keep the internal dependency and configure the ThreadPoolExecutor as follows:

      ThreadPoolExecutor(20, 100, keepAliveTime, unit, new LinkedBlockingQueue<Runnable>(1), threadFactory, new ThreadPoolExecutor.CallerRunsPolicy())
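
A scaled-down sketch of this configuration is shown below, assuming the CallerRunsPolicy from option 3 is passed as the rejection handler. The class `BoundedCallerRunsDemo` is hypothetical and the sizes (2 core, 4 max) are shrunk so that saturation is reached with only a handful of tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: nested tasks on one pool with a queue of capacity 1
// and CallerRunsPolicy as the rejection handler.
class BoundedCallerRunsDemo {

    static int run(int parents) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 2, TimeUnit.MINUTES,
                new LinkedBlockingQueue<Runnable>(1),          // at most one task waits in the queue
                new ThreadPoolExecutor.CallerRunsPolicy());    // saturated: the submitter runs the task
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < parents; i++) {
                final int id = i;
                results.add(pool.submit(() -> {
                    // If the pool is saturated, this child runs right here in the parent's thread.
                    Future<Integer> child = pool.submit(() -> id * 2);
                    return child.get();
                }));
            }
            int sum = 0;
            for (Future<Integer> f : results) {
                sum += f.get(10, TimeUnit.SECONDS);
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(6)); // 30
    }
}
```

Since only one task can ever be parked in the queue, at most one waiting parent depends on a queued child; every other child either gets a new thread or runs in its submitter's thread, so some worker always becomes free.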
    
