NPE during hazelcast init

Lars_Krog-Jensen1 · January 28, 2016, 9:25am

I am trying to upgrade a local Openfire (OF) to version 4.0.1 (with the Hazelcast plugin 2.2), but during startup I sometimes get this NPE:

2016.01.28 09:46:41 INFO  [ClusterManager events dispatcher]: com.hazelcast.partition.InternalPartitionService - [169.254.80.80]:5701 [openfire] [3.5.1] Initializing cluster partition table first arrangement...
2016.01.28 09:46:41 WARN  [ClusterManager events dispatcher]: org.jivesoftware.openfire.cluster.ClusterManager - Null value is not allowed!
java.lang.NullPointerException: Null value is not allowed!
  at com.hazelcast.util.Preconditions.checkNotNull(Preconditions.java:41)
  at com.hazelcast.map.impl.proxy.MapProxySupport.putAllInternal(MapProxySupport.java:862)
  at com.hazelcast.map.impl.proxy.MapProxyImpl.putAll(MapProxyImpl.java:309)
  at org.jivesoftware.openfire.plugin.util.cache.ClusteredCache.putAll(ClusteredCache.java:129)
  at org.jivesoftware.util.cache.CacheFactory.joinedCluster(CacheFactory.java:738)
  at org.jivesoftware.openfire.cluster.ClusterManager$2.run(ClusterManager.java:95)

After some trial and error I discovered that this happens only when I restart OF while clients are active on it: the clients reconnect before OF hits this exception. I managed to catch this in my debugger.

The cache named ‘Locked Out Accounts’ has cached some null values, and these cause the NPE later on.

If all clients are started after OF has initialized the clustering, this exception is not happening.
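For illustration, here is a minimal Java sketch of the failure mode described above, not the actual Openfire or Hazelcast code. It assumes a local map that tolerates null values and uses `ConcurrentHashMap` purely as a stand-in for a clustered map that rejects them (as the `checkNotNull` frame in the stack trace suggests). The class and method names (`NullValueMigrationSketch`, `withoutNullValues`) and the sample entries are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NullValueMigrationSketch {

    // Returns a copy of the source map without the entries whose value is null.
    static <K, V> Map<K, V> withoutNullValues(Map<K, V> source) {
        Map<K, V> filtered = new HashMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            if (entry.getValue() != null) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Hypothetical local cache filled by clients that reconnected early.
        // HashMap accepts null values, so the null entry is stored silently.
        Map<String, String> localCache = new HashMap<>();
        localCache.put("alice", "locked");
        localCache.put("bob", null);

        // Stand-in (assumption) for the clustered map: ConcurrentHashMap
        // throws NullPointerException when asked to store a null value.
        Map<String, String> clusteredCache = new ConcurrentHashMap<>();

        try {
            clusteredCache.putAll(localCache);
        } catch (NullPointerException e) {
            System.out.println("putAll failed on a null value");
        }

        // Copying only the non-null entries avoids the exception.
        clusteredCache.clear();
        clusteredCache.putAll(withoutNullValues(localCache));
        System.out.println("entries copied: " + clusteredCache.size());
    }
}
```

Whether dropping the null entries during the copy is the right fix for Openfire is not something this sketch settles; it only shows why a cache populated before clustering starts can break the later `putAll`.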

Regards

Lars

Levi_McPhetridge · May 6, 2016, 3:10pm

I’m seeing this exact issue in version 4.0.2

Iqbal1 · May 18, 2016, 12:46am

Please make sure you have configured all the member servers in your hazelcast-cache-config.xml file.

This file can be found in the classes directory of hazelcast.jar. Add the members to this file as shown below:

...
<join>
     <multicast enabled="false"/>
     <tcp-ip enabled="true">
          <member>of-node-a.example.com:5701</member>
          <member>of-node-b.example.com:5701</member>
          <member>of-node-c.example.com:5701</member>
     </tcp-ip>
     <aws enabled="false"/>
</join>
...

This should resolve the NPE and make the servers available in the cluster.

To make these changes in hazelcast.jar, you have to unpack it, edit the hazelcast-cache-config.xml file, repack it, and deploy it as the plugin.

Hope this helps you. Thanks.

Chinmay · August 24, 2016, 5:04pm

Hi Iqbal,

I am also seeing this NPE in my cluster since I upgraded to 4.0.2. However, I don’t understand how configuring the member servers would resolve the NPE. FYI, I am using AWS EC2 hosts, and I think they are configured correctly.

I was thinking, if Hazelcast maps do not allow null values, then shouldn’t the Openfire Cache also disallow putting null values in them?
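To make the idea concrete, here is a rough Java sketch of a local cache that enforces the same no-null contract at `put` time, so a bad entry fails immediately instead of surfacing later during the cluster join. This is not Openfire's `Cache` implementation; the class `StrictLocalCache` and its methods are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class StrictLocalCache<K, V> {

    private final Map<K, V> entries = new HashMap<>();

    // Rejects null keys and null values up front, mirroring the
    // restriction that the clustered map applies later.
    public V put(K key, V value) {
        Objects.requireNonNull(key, "Null key is not allowed!");
        Objects.requireNonNull(value, "Null value is not allowed!");
        return entries.put(key, value);
    }

    public V get(K key) {
        return entries.get(key);
    }

    // Copy of the current entries, safe to hand to a null-rejecting map.
    public Map<K, V> snapshot() {
        return new HashMap<>(entries);
    }

    public static void main(String[] args) {
        StrictLocalCache<String, String> cache = new StrictLocalCache<>();
        cache.put("alice", "locked");
        try {
            cache.put("bob", null);
        } catch (NullPointerException e) {
            System.out.println("rejected at put time: " + e.getMessage());
        }
    }
}
```

The trade-off is that callers which currently cache null to mean "looked up, nothing found" would need a different marker, such as a sentinel object.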

Thanks,

Chinmay

speedy · August 24, 2016, 5:22pm

I believe this is currently being tracked with https://issues.igniterealtime.org/browse/OF-1156

Chinmay · August 24, 2016, 5:25pm

That’s good to know. Thanks!

Nathan_Neulinger · October 4, 2016, 5:37pm

Is it really necessary to make these changes in the jar itself as opposed to just in the generated ./plugins/hazelcast/classes/hazelcast-cache-config.xml file?

I’m definitely seeing the same issues on my install - overall the Hazelcast clustering seems incredibly flaky for restarts, failure events, etc. - and I’m not 100% certain, but it looks like the weirdness in it may also be affecting the behavior of S2S connections.

Nathan_Neulinger · October 4, 2016, 7:40pm

Moving to the latest master build seems to have significantly improved things - much more stable for restarts and Hazelcast.

Iqbal1 · October 4, 2016, 7:46pm

Chinmay,

The caches do not allow null keys, but they do allow null values, so make sure you do not put null values into the caches.

Cheers,

Iqbal