We switched the JVM to version 1.5.x. Since the change, the problem has not reappeared (about 3 weeks so far).
My team suspects several possible causes:
Linux epoll + NIO in JVM 1.6
Jetty, MINA
We suspect these because jstack shows what looks like a deadlocked section. Some threads (other than the managed threads) have been alive for a very long time.
"btpool1-1 - Acceptor0 SelectChannelConnector@0.0.0.0:7070" prio=10 tid=0x08874400 nid=0x5f1c runnable [0x637fe000..0x637ff140]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0x759b8948> (a sun.nio.ch.Util$1)
    - locked <0x759b8938> (a java.util.Collections$UnmodifiableSet)
    - locked <0x759b84b0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:406)
    at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:166)
    at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
    at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)

"SocketAcceptor-1" prio=10 tid=0x0880e000 nid=0x5f1a runnable [0x639fe000..0x639fee40]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0x759c3e48> (a sun.nio.ch.Util$1)
    - locked <0x759c3e38> (a java.util.Collections$UnmodifiableSet)
    - locked <0x759c3970> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
    at org.apache.mina.transport.socket.nio.SocketAcceptor$Worker.run(SocketAcceptor.java:220)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)

"SocketAcceptor-0" prio=10 tid=0x0880e800 nid=0x5f19 runnable [0x63be1000..0x63be1ec0]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0x759c4690> (a sun.nio.ch.Util$1)
    - locked <0x759c4680> (a java.util.Collections$UnmodifiableSet)
    - locked <0x759c41c8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
    at org.apache.mina.transport.socket.nio.SocketAcceptor$Worker.run(SocketAcceptor.java:220)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor
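
For anyone trying to narrow this down: a thread shown as RUNNABLE inside epollWait while holding the selector locks is usually just a selector waiting for I/O, so it may be worth confirming whether the JVM itself reports a real monitor deadlock. The following is only a rough sketch, not something taken from Openfire, Jetty or MINA. The class name SelectorThreadCheck and both helper methods are made up for illustration, and the frame match assumes the sun.nio.ch.EPollArrayWrapper.epollWait frame seen in the dump above (other JVM versions may name it differently). It sticks to java.lang.management calls that exist in Java 5, and it has to run inside the server JVM to see that JVM's threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class SelectorThreadCheck {

    // True if the top stack frame is the native epoll wait seen in the dump.
    public static boolean isInEpollWait(StackTraceElement[] stack) {
        return stack.length > 0
                && "sun.nio.ch.EPollArrayWrapper".equals(stack[0].getClassName())
                && "epollWait".equals(stack[0].getMethodName());
    }

    // Names of live threads whose top frame is currently epollWait.
    public static List<String> threadsInEpollWait(ThreadMXBean mx) {
        List<String> names = new ArrayList<String>();
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds(), Integer.MAX_VALUE);
        for (ThreadInfo info : infos) {
            // An entry is null if the thread died between the two calls.
            if (info != null && isInEpollWait(info.getStackTrace())) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked on object monitors.
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        System.out.println("monitor-deadlocked threads: "
                + (deadlocked == null ? 0 : deadlocked.length));
        System.out.println("threads in epollWait: " + threadsInEpollWait(mx));
    }
}
```

If the deadlock count stays at 0 while the selector threads sit in epollWait, the dump alone probably does not prove a deadlock, and comparing CPU usage of those threads over time might tell more.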
We tried switching to Java 1.5, but unfortunately the same problem occurred again once we increased the number of users. We also upgraded to the latest Openfire just in case, but no luck.
For reference, we are running Solaris 10 x86 on Dell hardware.
I don’t know if this has anything to do with it, but we’ve been sporadically having the same problem with our server. It turns out that restarting the webserver without first shutting down Openfire causes this behaviour as well.
We only noticed this today and will keep monitoring to see whether it resolves our problems.