Possible resource leak?

Perhaps SwingWorker or java.util.concurrent.ScheduledExecutorService may help? With the executor, you can set an interval to execute the runnable…
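For reference, a minimal sketch of the interval idea with ScheduledExecutorService. The class and method names (PeriodicTaskSketch, runPeriodically) are made up for illustration and are not Spark code; the counter stands in for whatever work would actually be scheduled.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicTaskSketch {

    /** Runs a counting task every periodMillis for roughly totalMillis and returns the tick count. */
    static int runPeriodically(long periodMillis, long totalMillis) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger ticks = new AtomicInteger();
        try {
            // First run immediately, then once per period, always on the same worker thread.
            scheduler.scheduleAtFixedRate(ticks::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);
            Thread.sleep(totalMillis);
        } finally {
            // Without a shutdown the worker thread stays alive, which is a leak of its own.
            scheduler.shutdownNow();
        }
        return ticks.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("ticks: " + runPeriodically(100, 550));
    }
}
```

Note that the task runs on the executor's thread, not on the event-dispatch thread, so anything touching Swing components would still need to be handed over to the EDT.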

do we want things running on the main spark swing ui thread? won’t this cause blockage?

Sorry, I meant the event-dispatch thread. SwingWorker can run background work, but it is usually recommended for long-running tasks. For UI things that you want executed after a delay or repeatedly, javax.swing.Timer is recommended.

We are using SwingWorker in spark, but for some UI things we were using java.util.Timer - I changed this to use javax.swing.Timer.

Some documentation:

SwingWorker: http://docs.oracle.com/javase/tutorial/uiswing/concurrency/worker.html

javax.swing.Timer: http://docs.oracle.com/javase/tutorial/uiswing/misc/timer.html

for anything updating the UI, I’d say you’re right, the swing concurrency/thread classes are the right choice to avoid having invokeLater’s all over the place. otherwise, if it’s a non-ui related thing, use the java.util.concurrent.* classes. - can’t believe there were some old-school java.util.Timers updating UI elements!
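A small sketch of that split using SwingWorker: the slow part runs in doInBackground() on a worker thread, and done() is called on the event-dispatch thread, so the UI update needs no explicit invokeLater. BackgroundWorkSketch and fetchThenUpdateUi are hypothetical names, and the Thread.sleep is only a stand-in for real work such as a network call.

```java
import javax.swing.SwingWorker;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class BackgroundWorkSketch {

    /** Starts slow work off the EDT and passes the result to uiUpdate on the EDT. */
    static SwingWorker<String, Void> fetchThenUpdateUi(Consumer<String> uiUpdate) {
        SwingWorker<String, Void> worker = new SwingWorker<String, Void>() {
            @Override
            protected String doInBackground() throws Exception {
                Thread.sleep(200); // placeholder for slow, non-UI work
                return "contacts loaded";
            }

            @Override
            protected void done() {
                // done() is invoked on the event-dispatch thread once doInBackground() finishes.
                try {
                    uiUpdate.accept(get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (ExecutionException e) {
                    uiUpdate.accept("failed: " + e.getCause());
                }
            }
        };
        worker.execute();
        return worker;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch updated = new CountDownLatch(1);
        fetchThenUpdateUi(text -> {
            System.out.println("UI update: " + text);
            updated.countDown();
        });
        updated.await(5, TimeUnit.SECONDS);
    }
}
```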

Just an update. I have more and more reports of the Spark 618 build locking up with black or just unresponsive windows. Weird, as we have tested this build for some time without issues. And after the deployment it has locked up even on me once. Haven’t seen Spark freezes for many years… I have updated myself and some other affected users to 624 to see if it helps. Well, i had a lockup today, but… i had unplugged my LAN cable and then plugged it back in. Some other apps also locked up, so maybe it wasn’t just Spark. If this issue persists we will have to roll back to 610.

looks like 618 was mid november build? it’s sort of sounding like a blocked thread or something…

Yes. Though the first memory leak fixes by Mircea were in 616 (we just wanted a few other fixes done in 617 and 618 included).

So, today i had to go to a meeting. i disconnected my laptop from the docking station (so LAN was down) and went to a conference room. Opened the lid. Every app was ok, but Spark once again locked up. Then i installed the 610 build again, and after disconnecting from the network, going to sleep and waking up, it works fine.

Thanks wroot. I tested your scenario and reproduced the Spark lock issue with build 624. I will investigate and provide feedback.

Can you please attach logs? I cannot reproduce it every time. I unplugged the network cable many times and Spark locked up only once (build 624). Thanks

There are no logs from today, only from yesterday, and i’m not sure they are related:

Nov 20, 2013 1:58:54 PM org.jivesoftware.spark.util.log.Log error
SEVERE:
not-acceptable(406)
    at org.jivesoftware.smack.Roster.removeEntry(Roster.java:332)
    at org.jivesoftware.spark.ui.ContactList$26$2.run(ContactList.java:1857)
    at java.awt.event.InvocationEvent.dispatch(Unknown Source)
    at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
    at java.awt.EventQueue.access$000(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)

Dec 02, 2013 9:17:18 AM org.jivesoftware.spark.util.log.Log error
SEVERE:
not-acceptable(406)
    at org.jivesoftware.smack.Roster.removeEntry(Roster.java:332)
    at org.jivesoftware.spark.ui.ContactList$26$2.run(ContactList.java:1860)
    at java.awt.event.InvocationEvent.dispatch(Unknown Source)
    at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
    at java.awt.EventQueue.access$000(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)

Dec 02, 2013 9:17:18 AM org.jivesoftware.spark.util.log.Log error
SEVERE:
not-acceptable(406)
    at org.jivesoftware.smack.Roster.removeEntry(Roster.java:332)
    at org.jivesoftware.spark.ui.ContactList$26$3.run(ContactList.java:1898)
    at java.awt.event.InvocationEvent.dispatch(Unknown Source)
    at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
    at java.awt.EventQueue.access$000(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.awt.EventQueue$3.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)

One more thing. It looks like one proven way to reproduce this is to let the PC go to sleep, then wake it. Spark hangs for my users every time in this scenario.

looks like a threading issue with org.jivesoftware.spark.ui.ContactList. Mircea may have discovered one of the hidden landmines buried in spark

@Mircea

Hmm… looking at some of the recent commits in ContactList.java – I think we’re ok with SwingUtilities.invokeLater() since it internally calls out to EventQueue.invokeLater(), so they are the same basically.
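A quick way to convince yourself of that equivalence: queue one task through each method and check which thread runs it. This is only an illustration (InvokeLaterSketch and bothRunOnEdt are made-up names), not code from ContactList.java.

```java
import java.awt.EventQueue;
import javax.swing.SwingUtilities;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class InvokeLaterSketch {

    /** Queues one task through each API and reports whether both ran on the event-dispatch thread. */
    static boolean bothRunOnEdt() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(2);
        boolean[] onEdt = new boolean[2];

        SwingUtilities.invokeLater(() -> {
            onEdt[0] = SwingUtilities.isEventDispatchThread();
            done.countDown();
        });
        EventQueue.invokeLater(() -> {
            onEdt[1] = EventQueue.isDispatchThread();
            done.countDown();
        });

        // The latch makes the writes from the EDT visible to the calling thread.
        boolean finished = done.await(5, TimeUnit.SECONDS);
        return finished && onEdt[0] && onEdt[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("both ran on the EDT: " + bothRunOnEdt());
    }
}
```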

Wondering if there is a timing problem here? The code that was replaced seemed to wait some seconds before it ran its routine, but the invokeLater() i think just runs when the queue gets to it… so there is no guarantee of any time passing between the object being put into the queue and it actually being executed. Perhaps there’s not much in the queue at that point, and the code is getting run too quickly upon sleep resume?

I think that may be why the Thread.sleep() was tried? If we need to delay execution, maybe ScheduledExecutorService would help?
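If a guaranteed delay were needed without sleeping on any thread, one option would be to let a ScheduledExecutorService do the waiting and only then hand the work to the EDT. A rough sketch, with hypothetical names (DelayedEdtHandoffSketch, runOnEdtAfter) and no claim that this is what Spark should do:

```java
import javax.swing.SwingUtilities;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class DelayedEdtHandoffSketch {

    /** Waits delayMillis on the scheduler thread, then queues uiTask on the event-dispatch thread. */
    static ScheduledFuture<?> runOnEdtAfter(ScheduledExecutorService scheduler,
                                            long delayMillis, Runnable uiTask) {
        return scheduler.schedule(() -> SwingUtilities.invokeLater(uiTask),
                delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // One shared scheduler for the whole application, not one per call.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(1);
        long start = System.nanoTime();
        runOnEdtAfter(scheduler, 300, () -> {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("ran after about " + elapsedMs + " ms, on EDT: "
                    + SwingUtilities.isEventDispatchThread());
            done.countDown();
        });
        done.await(5, TimeUnit.SECONDS);
        scheduler.shutdown();
    }
}
```

The delay is a lower bound only: after it elapses the task still has to wait its turn in the event queue.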

Jason, thanks for looking into this.

Actually I removed that approach and replaced it with javax.swing.Timer. Originally java.util.Timer was used to introduce a 0.3 second delay, but that was a mistake for code that changes the UI: each java.util.Timer creates a new thread, and timer tasks accumulate because that code is called repeatedly (inside the ContactList.moveToOffline routine). UI objects may remain referenced from there and never be garbage collected.

The reason for the delay is that in the UI the user’s contact item turns red for 0.3 seconds before moving to the offline group. In the current codebase (build 624) I just used javax.swing.Timer instead of java.util.Timer (no Thread.sleep calls anymore) and kept the 0.3 second delay. So there is no change in functionality; javax.swing.Timer is simply recommended because it does not create a new thread per timer and executes the code on the Swing event-dispatch thread (just like invokeLater).
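For readers following along, this is roughly what a one-shot javax.swing.Timer with a 0.3 second delay looks like. It is an illustrative sketch, not the actual ContactList code; DelayedUiUpdateSketch and runOnceAfter are hypothetical names, and the action here just records which thread it ran on.

```java
import javax.swing.SwingUtilities;
import javax.swing.Timer;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class DelayedUiUpdateSketch {

    /** Runs action once, on the event-dispatch thread, after delayMillis. */
    static Timer runOnceAfter(int delayMillis, Runnable action) {
        Timer timer = new Timer(delayMillis, event -> action.run());
        timer.setRepeats(false); // fire a single time instead of every delayMillis
        timer.start();
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        AtomicBoolean onEdt = new AtomicBoolean();

        // 300 ms matches the delay described above; the body stands in for the real UI change.
        runOnceAfter(300, () -> {
            onEdt.set(SwingUtilities.isEventDispatchThread());
            done.countDown();
        });

        done.await(5, TimeUnit.SECONDS);
        System.out.println("ran on EDT: " + onEdt.get());
    }
}
```

Because the action runs on the EDT, it should stay short; anything slow or blocking in there would freeze the UI for as long as it takes.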

I understood from wroot that build 624 also locks up when the network goes off/on. I tested the scenario all day and reproduced it only once.

wroot, thanks for help

I was able to reproduce the lock only once, and I tested the scenario all day.

You are saying that this is reproduced regularly with build 624 when the network goes off/on.

Are you testing on igniterealtime.org server? Do you have a test account that I can use to see if I can reproduce the locking regularly…thanks

It happened only twice to me with the network disconnect. I can’t reproduce this when i’m trying, it only happens in real life usage… What about sleep and wake? Our new machines with Windows 7 have a very low sleep time setting (30 minutes on AC). So for some users their PCs are going to sleep very often and they find Spark frozen every time they come back. I have my sleep time increased to a few hours, so i’ve never experienced this myself. Also i think their PCs are losing network when going to sleep.

I’m not using igniterealtime server and i can’t provide a test account. We use Openfire 3.8.2.

Same here, we use Openfire 3.8.2

I would like to identify the commit that is causing this and revert it.

So we know that build 618 is locking up. Could you ask one of your users to test with build 616 to see if there is anything different? Thank you so much for your help. I will set a 30 minute sleep timeout on my Windows 7 laptop and try to reproduce. Thanks

I’m reverting my users to 610 for now. I’m testing with 616 and a 30 minute sleep time myself. Though i have noticed that Spark shows a black window for a second even when bringing up a chat window which was minimized for some time. I was using 610 for a few days and haven’t noticed this. So i’m not 100% sure, but it looks like 616 has some issues too.

hmm, i’m unable to replicate this myself. i’ve tried profiling it and i can’t see any runaway threads or anything that would obviously cause this. but then again, my home dev setup is not great for testing this, since i only have myself in my contact list and no groups, etc. so not a lot of actual activity… (i’m still trying to get time to build a new in-house spark so i can test it here).
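Since the freeze is hard to catch in a profiler, a thread dump taken while the window is unresponsive might show what the event-dispatch thread is stuck on. The JDK's jstack tool can do this from outside; the sketch below shows the same idea from inside the JVM using the standard java.lang.management API. ThreadDumpSketch and dumpThreads are hypothetical names, and nothing like this exists in Spark as far as this thread shows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Returns a text dump of all live threads, preceded by the number of deadlocked threads. */
    static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // findDeadlockedThreads() returns null when no deadlock is detected.
        long[] deadlocked = threads.findDeadlockedThreads();
        out.append("deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length)
           .append('\n');

        // false/false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpThreads());
    }
}
```

In a frozen Swing application the thread to look for is usually named AWT-EventQueue-0: if it is BLOCKED or sitting in a network call, that would point to the cause.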

Wroot, can you list the plugins you are using? Number of groups, approx. number of contacts, etc?

Default plugins that come with Spark, but i have some of them disabled (like OTR, Translator, Spellchecker, transfer guard and probably sip phone too, will have to check on Monday to be sure). Around 15 groups. 3 of them have about 50 users each, the rest have 3-6 users. About 200 users total, ~150 online daily.