Connection drops for some users due to an unknown cause

Hi,

We’re currently beta-testing our brand new Openfire and StropheJs-based AJAX Chat and some of our users are expierencing problems with the connection.

They tell us, that they have every few minutes connection drops (some stay longer, some stay only some minutes) and they can sometimes still send messages, but do not receive any msgs anymore. After they reconnect, they see their previously sent but not received messages.

We have no clue what this could cause.

We currently have Openfire 3.7.0 running on one machine (no clustering, no connection managers etc.).

The following appears in our warn log:

2011.06.09 13:46:09 Client provided invalid session: 157d91ed. [192.168.189.95]
2011.06.09 13:46:10 Packets could not be found for session 222b74ec cannotbe delivered to client
2011.06.09 13:46:12 Client provided invalid session: d815c53a. [192.168.189.95]
2011.06.09 13:46:31 Client provided invalid session: 50fc92e. [192.168.189.78]
2011.06.09 13:46:49 Client provided invalid session: 5694e3a5. [192.168.189.95]
2011.06.09 13:46:59 Client provided invalid session: ca988449. [192.168.189.95]
2011.06.09 13:47:10 Packets could not be found for session 2dcef0c1 cannotbe delivered to client
2011.06.09 13:47:33 Client provided invalid session: 105fe10b. [192.168.189.95]
2011.06.09 13:47:34 Client provided invalid session: 2a059ccc. [192.168.189.95]
2011.06.09 13:47:40 Packets could not be found for session 132bdb68 cannotbe delivered to client
2011.06.09 13:47:44 Client provided invalid session: 9e1fbd8e. [192.168.189.78]
2011.06.09 13:48:13 Client provided invalid session: fd01bb10. [192.168.189.95]
2011.06.09 13:48:21 Client provided invalid session: 6e78ac39. [192.168.189.95]
2011.06.09 13:48:45 Packets could not be found for session ddffdd77 cannotbe delivered to client
2011.06.09 13:48:45 Client provided invalid session: 2d1449e2. [192.168.189.95]
2011.06.09 13:49:19 Client provided invalid session: 4f0800c7. [192.168.189.95]
2011.06.09 13:50:10 Packets could not be found for session 813299e2 cannotbe delivered to client
2011.06.09 13:50:16 Packets could not be found for session bfc76623 cannotbe delivered to client
2011.06.09 13:50:31 Client provided invalid session: 8c816f79. [192.168.189.95]
2011.06.09 13:50:35 Client provided invalid session: 91a6a2d2. [192.168.189.95]
2011.06.09 13:53:16 Client provided invalid session: 9f170508. [192.168.189.95]
2011.06.09 13:53:33 Client provided invalid session: 699f020c. [192.168.189.95]
2011.06.09 13:53:33 Client provided invalid session: 699f020c. [192.168.189.95]
2011.06.09 13:53:39 Client provided invalid session: e9032580. [192.168.189.95]
2011.06.09 13:53:41 Client provided invalid session: 132bdb68. [192.168.189.95]
2011.06.09 13:54:10 Packets could not be found for session 94755721 cannotbe delivered to client
2011.06.09 13:55:40 Packets could not be found for session 2f1b09d3 cannotbe delivered to client
2011.06.09 13:55:40 Packets could not be found for session d84b9f67 cannotbe delivered to client
2011.06.09 13:55:58 Client provided invalid session: 2c03ec3f. [192.168.189.95]
2011.06.09 13:56:10 Packets could not be found for session fa70eceb cannotbe delivered to client
2011.06.09 13:56:27 Packets could not be found for session 6eb4b53c cannotbe delivered to client
2011.06.09 13:56:28 Client provided invalid session: 9c60c5d0. [192.168.189.95]
2011.06.09 13:56:46 Client provided invalid session: aa469198. [192.168.189.95]
2011.06.09 13:57:15 Client provided invalid session: e3541560. [192.168.189.95]
2011.06.09 13:57:25 Client provided invalid session: e3541560. [192.168.189.95]
2011.06.09 13:57:40 Packets could not be found for session a0bb2219 cannotbe delivered to client
2011.06.09 13:57:45 Client provided invalid session: 3e01d4bc. [192.168.189.78]
2011.06.09 13:58:03 Client provided invalid session: f01d646a. [192.168.189.78]
2011.06.09 13:58:19 Client provided invalid session: 325de39a. [192.168.189.78]
2011.06.09 13:58:41 Client provided invalid session: 925a8cb. [192.168.189.95]
2011.06.09 13:58:48 Client provided invalid session: fc56e0e2. [192.168.189.95]
2011.06.09 13:58:49 Client provided invalid session: fc56e0e2. [192.168.189.95]
2011.06.09 13:58:58 Client provided invalid session: 5470e04e. [192.168.189.95]
2011.06.09 13:59:40 Client provided invalid session: a5b48041. [192.168.189.95]
2011.06.09 13:59:49 Client provided invalid session: 12f21bbe. [192.168.189.95]
2011.06.09 14:00:20 Client provided invalid session: 17fda9a2. [192.168.189.95]
2011.06.09 14:00:33 Client provided invalid session: 21b19ece. [192.168.189.95]
2011.06.09 14:00:56 Client provided invalid session: dd23c4bc. [192.168.189.95]
2011.06.09 14:01:13 Client provided invalid session: 7cfa04b. [192.168.189.95]
2011.06.09 14:01:56 Client provided invalid session: 43da9088. [192.168.189.95]

Sometimes also this appears:

Error for /http-bind/

java.lang.InternalError: Could not locate connection: 1026551200                                 at org.jivesoftware.openfire.http.HttpSession.getResponse(HttpSession.java:583)                                 at org.jivesoftware.openfire.http.HttpBindServlet.handleSessionRequest(HttpBindServlet.java:285)                                 at org.jivesoftware.openfire.http.HttpBindServlet.parseDocument(HttpBindServlet.java:168)                                 at org.jivesoftware.openfire.http.HttpBindServlet.doPost(HttpBindServlet.java:137)                                     at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)                                     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)                                     at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:530)                                     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1216)                                     at org.eclipse.jetty.continuation.ContinuationFilter.doFilter(ContinuationFilter.java:110)                                     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)                                     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:425)                                     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:931)                                     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:362)                                     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867)                                     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)                                     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:245)                                     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)                                     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)                                     at org.eclipse.jetty.server.Server.handle(Server.java:334)                                     at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:559)                                     at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:1007)                                     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:747)                                     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:209)                                     at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:406)                                     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:462)                                     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)                                     at java.lang.Thread.run(Thread.java:662)

We already found similar issues from other users of this forum but the threads are all very old or do not have any solution.

For testing we configured div. things for the httpbind settings, take a look at our current setup:

            xmpp.client.idle               

360000

            xmpp.client.idle.ping               

true

xmpp.httpbind.client.idle

300

            xmpp.httpbind.client.requests.polling               

0

            xmpp.httpbind.client.requests.wait               

30

            xmpp.httpbind.scriptSyntax.enabled               

true

Does anyone have an idea what the problem of this could be?

I would really appreciate if someone could give me any hints or ideas how to fix or debug this.

We also thought that it could be a Problem on the user’s pcs (too much CPU in use or something like that) but we don’t really know…

Thanks,

Michael

Try also setting xmpp.httpbind.client.requests.ignoreOveractivity to true.

There are many bugs here with BOSH, figuring out how to fix them is a huge challenge for me

daryl

Hi,

Thank you very much for the reply, I added this property and I’ll test it now. I hope it works

Yeah I can imagine this (about fixing the bugs). Maybe I could help, I do not have much experience with the openfire code but I can do java

So if you have any idea about what’s going wrong, tell me… Maybe I could take a look.

Cheers,

Michael

Hi Michael,

Thanks, we are desperate for more folks helping with openfire’s code. The problem is with the session logic within src/java/org/jivesoftware/openfire/http . If you have a ‘perfect’ internet connection, it works just fine, but if you get any blips or transient things, the connection is dropped. Openfire also does not like getting two requests for the same request_id. One trick is to be less strigent at about line 689 of HttpSession.java . There was a guy on the support chat that said he had some patches about this, but I have yet to hear back from him.

daryl

After activating the overactivity-setting (I don’t know if there are less users which are having the problem, I’ll see that today evening maybe), I receive the following errors:

2011.06.09 14:32:37 /http-bind/                                java.lang.IllegalStateException: IDLE,initial                                     at org.eclipse.jetty.server.AsyncContinuation.suspend(AsyncContinuation.java:280)                                     at org.eclipse.jetty.server.AsyncContinuation.suspend(AsyncContinuation.java:776)                                 at org.jivesoftware.openfire.http.HttpConnection.waitForResponse(HttpConnection.java:202)                                 at org.jivesoftware.openfire.http.HttpConnection.getResponse(HttpConnection.java:144)                                 at org.jivesoftware.openfire.http.HttpSession.getResponse(HttpSession.java:589)                                 at org.jivesoftware.openfire.http.HttpSession.getResponse(HttpSession.java:573)                                 at org.jivesoftware.openfire.http.HttpBindServlet.isContinuation(HttpBindServlet.java:181)                                 at org.jivesoftware.openfire.http.HttpBindServlet.doPost(HttpBindServlet.java:133)                                     at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)                                     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)                                     at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:530)                                     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1216)                                     at org.eclipse.jetty.continuation.ContinuationFilter.doFilter(ContinuationFilter.java:110)                                     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)                                     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:425)                                     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:931)                                     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:362)                                     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867)                                     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)                                     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:183)                                     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)                                     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)                                     at org.eclipse.jetty.server.Server.handleAsync(Server.java:377)                                     at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:564)                                     at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:390)                                     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:462)                                     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)                                     at java.lang.Thread.run(Thread.java:662)                                2011.06.09 14:32:55 Client provided invalid session: ba3cae82. [192.168.189.78]                                2011.06.09 14:33:13 Packets could not be found for session d60386cf cannotbe delivered to client                                2011.06.09 14:33:13 Packets could not be found for session d60386cf cannotbe delivered to client                                2011.06.09 14:33:13 Request 1285868961 > 1285868960, ending session.                                2011.06.09 14:33:43 Client provided invalid session: 4f20b531. [192.168.189.78]

I have the feeling that it’s a little bit better now, but I cannot confirm that really at the moment…

  • Michael

Ok, that points me somehow a little bit further… Maybe i could debug it a little bit… But as you said, it’s hard to figure out a fix for it…

Yeah, see those too Another, desperate option, is to look at punjab and use it as a ‘connection manager’ to openfire. I have not tried it, but some have suggested it to me. Hopefully we can figure out what’s wrong with openfire and fix it

daryl

Because of the

Request 1285868961 > 1285868960, ending session.

Error, I set now

“xmpp.httpbind.client.requests.max”

to 5 because of line 690 in HttpSession.java

Maybe this helps somehow.

Daryl, do you know if I need to restart my server in order to activate those settings or does openfire this reload for every session created?

I’m asking because I’m testing this on our beta test with real users

I believe it is set for new connections, restart should not be needed.

Ok, thanks.

The increase of requests.max does also not help in fixing the problem

We will further test things, I hope we will figure it out… If you have any further ideas, please tell me

Do you know why the log.httpbind.enabled-settings do print the logging-statements to standard out instead of printing it into the debug log?

  • Michael

Hi Daryl

I’m a co-worker of Michael and he’s in holiday right now.

Do you have any news about this issue? Is there a change to get the patches you mentioned in this thread?

Greetings

Patrick

Hello,

I’ve been working on this for a few months now, it is slow going because I barely have a clue

daryl

Daryl,

I know it’s not a real solution but we’re now using punjab and disabled the integrated HTTP binding module. Looks very good at the moment!

Greetings

Patrick

Hi Patrick,

That’s great to hear. Please let me know how it works out. I haven’t figured out that path yet, but should look into it soon.

daryl

Daryl,

Yesterday we had a peak of 150+ users. Not one complaint so far. Punjab seems to be very efficient and stable. The server load and memory usage have been low the whole time. Definitely the way to go.

You should give it a try. If you need some more details let me know.

Regs,

Patrick

Thanks Patrick,

I got it running yesterday as well and have 500 some users on it now, looks good so far.

daryl

That’s great. We had a peak of about 200 users yesterday (all in one MUC room by the way) and absolutely no problems.

I have a good feeling about our current setup, but it’s good to know about its limits, so please tell me when something goes wrong. I will do the same of course.

Cheers

Hey guys,

I’ve got this error bellow using Jappix Mini with JsJaC on a server with Openfire + Punjab… I don’t know what is happening in this case…

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] HEADERS 1315243321.04:

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] {‘origin’: ‘http://prisma’, ‘content-length’: ‘264’, ‘via’: ‘1.1 10.1.1.10 (IIRF v2.1)’, ‘accept-language’: ‘pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4’, ‘accept-encoding’: ‘gzip,deflate,sdch’, ‘x-forwarded-host’: ‘prisma’, ‘x-forwarded-for’: ‘10.1.1.27’,‘host’: ‘prisma’, ‘accept’: ‘/’, ‘user-agent’: ‘Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.220 Safari/535.1’, ‘accept-charset’: ‘ISO-8859-1,utf-8;q=0.7,*;q=0.3’, ‘connection’: ‘keep-alive’, ‘referer’: ‘http://prisma/nopcommercestore_chatatheneum/Default.aspx’, ‘cookie’: ‘.ASPXANONYMOUS=RXa54IGizAEkAAAANTE0YTBhYWQtYWJmNC00NWM4LTk2ZDAtOTk3Mjc4MTA0N2N kp_aaBeFasBLVRpBRjNDRRzdFC4syOuqAvM5Hfwv7cVk1; ASP.NET_SessionId=ziuxnu2oigdu2yuvfrppcd50; NOPCOMMERCE.AUTH=6EE583DE9475FC05BC0209121573AD1023CB9439FE52DF22C4D3397B973070 EF09E0E313940BFB564E7C868D6DB6C9423C1F02BDD2F225B361E0582B669FBAA7E9588D9E933492 445F7550E64904ED6D172F8D96E5C420544642684E91701F26706478797276B3AF845506434BC8A5 21A19E7D1603BD4911B890BC11FF8933D7395335245A5FE640367C1722A41AF666; Nop.CustomerSessionGUIDCookie=ca15bbed-e363-4029-889d-e25f24978c12’, ‘x-forwarded-server’: ‘prisma’, ‘content-type’: ‘text/xml; charset=UTF-8’}

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] HTTPB POST :

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10]

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] Session Created : 15afc413e7ece8f7015df56130a25453e474103c 1315243321.04

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] ================================== 1315243321.04 connect to prisma:5222 ==================================

2011-09-05 14:22:01-0300 [HTTPChannel,1,10.1.1.10] Starting factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:01-0300 [-] <class ‘twisted.names.dns.DNSDatagramProtocol’> starting on 64433

2011-09-05 14:22:01-0300 [-] Starting protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604150>

2011-09-05 14:22:01-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] <punjab.session.XMPPClientConnector instance at 0x01609030> will retry in 2 seconds

2011-09-05 14:22:01-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] Stopping factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:01-0300 [-] (Port 64433 Closed)

2011-09-05 14:22:01-0300 [-] Stopping protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604150>

2011-09-05 14:22:03-0300 [-] Starting factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:03-0300 [-] <class ‘twisted.names.dns.DNSDatagramProtocol’> starting on 5566

2011-09-05 14:22:03-0300 [-] Starting protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604F50>

2011-09-05 14:22:03-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] <punjab.session.XMPPClientConnector instance at 0x01609030> will retry in 6 seconds

2011-09-05 14:22:03-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] Stopping factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:03-0300 [-] (Port 5566 Closed)

2011-09-05 14:22:03-0300 [-] Stopping protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604F50>

2011-09-05 14:22:10-0300 [-] Starting factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:10-0300 [-] <class ‘twisted.names.dns.DNSDatagramProtocol’> starting on 64985

2011-09-05 14:22:10-0300 [-] Starting protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604170>

2011-09-05 14:22:10-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] Abandoning <punjab.session.XMPPClientConnector instance at 0x01609030> after 3 retries.

2011-09-05 14:22:10-0300 [twisted.names.dns.DNSDatagramProtocol (UDP)] Stopping factory <punjab.session.Session object at 0x016043B0>

2011-09-05 14:22:10-0300 [-] (Port 64985 Closed)

2011-09-05 14:22:10-0300 [-] Stopping protocol <twisted.names.dns.DNSDatagramProtocol object at 0x01604170>

2011-09-05 14:22:30-0300 [-] ================================== 15afc413e7ece8f7015df56130a25453e474103c 1315243350.3 startup timeout ==================================

2011-09-05 14:22:30-0300 [-] HTTPB Error 200

2011-09-05 14:22:30-0300 [-] HTTPB Return Error: 200 ->

2011-09-05 14:22:30-0300 [-] 10.1.1.10 - - [05/Sep/2011:17:22:29 +0000] “POST /http-bind HTTP/1.1” 200 105 “http://prisma/nopcommercestore_chatatheneum/Default.aspx” “Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.220 Safari/535.1”

Anyone know what cause this error (HTTPB Error 200)? I reviewed configuration Punjab and Openfire and I searched on web about this error but not found anything that help me…

Please, I need urgent help!

Thank you for your attention,

Gregory

HTTPB Error 200** **

Is there any news concerning this issue?

We are using Openfire 3.8.1 and having apparently the same issue.

We get random connection drops with BOSH.

Our client says “IllegalStateExeption: Not connected to server”

In the Openfire logs there are warnings, e.g.:

org.jivesoftware.openfire.http.HttpBindServlet - Client provided invalid session: 2f2d549. [10.0.49.84]

Any help is appreciated.

Please try 3.8.2 release as there were a number of BOSH improvements included.