Login to admin web UI

Yeah. I’ve done exactly that. No?

_xmpp-client is what you need (and not _xmpps-client).

I have all three that the page requires. Is that messing it up?
This is what it wants:
image
And this is what our DNS server has:
image
It was a copy and paste setup, so I do not see WTF is OF’s problem, pardon my Italian. Also, seeing how it takes exactly as long for the admin UI to log me in and to open that page after clicking on the above message, it sounds like the admin UI checks DNS every time I log in. There should be a way to satisfy the DNS check algorithm or to disable it if it is insatiable.
Since the OF host’s name is in DNS, I tried to point the SRV records at its name, FQDN, and IP address. No dice.

Please watch your tone. I appreciate that you’re having a hard time, but that’s no excuse for inappropriate behavior.

Remember that everyone here is a volunteer. Those that are helping you do that in their free time. The product that you’re using has been developed and is provided to you free of charge, and without any conditions.

If you’re unable or unwilling to spend the time in maintaining the server, then you can try to reach out to a number of commercial providers for support, but I do not think they’d appreciate this tone of voice either - even if you do pay them.

Where?

Huh? I’m maintaining it 24/7/365, spending time that other areas of my infrastructure just as desperately need, but bloated logs, an obscene number of IP connections, and problems logging in take it away.
Are you willing to provide help? So far you’ve provided the least amount of all. I appreciate every tip that grutto and danc offered. You only asked me questions to which I could not possibly have answers. So, right back at ya: watch your tone.

You may have something rogue going on with your install or configuration. Have you tried to install a fresh instance on a test/lab server? I installed OF and it works as expected, loads quickly even with LDAP integration. Logs are normal (although I get a little spam about LDAP not being secure).

Hello again,

I know this sounds obvious, but hear me out.

Have you tried restarting the server to ensure all connections are dropped? Does the connection spam return immediately after, or do the connections slowly appear over time?

It would be useful to know if this is an issue on startup or if this is an issue when Openfire has run for a couple of days.

As for your frustration about the logs, Java is known to be very verbose when logging, and the stack traces are logged to allow diagnosis of problems and where they originate within the code. This does mean that Openfire logs can be huge; however, you can periodically wipe these without any issues.

Maybe there is room for improvement when it comes to what log levels are used, especially when dumping huge exception stack traces into the logs, but sometimes it's better to log too much and have all the info needed to diagnose an issue than to log too little and be scratching your head for hours trying to figure out what is going wrong.

I have checked my install against this issue. I do not see any bindings from localhost ↔ localhost on ephemeral ports, so it is likely an issue with your install, and reinstalling might be the lesser evil here, as suggested above.

I assume this is Windows. Could this potentially be a platform-specific issue? ss doesn’t return anything weird on Linux.
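For anyone wanting to repeat that check on Windows, here is a minimal Python sketch that counts established loopback-to-loopback TCP connections in saved `netstat -an` output. It assumes the usual four-column layout (protocol, local address, foreign address, state); the sample lines are made up for illustration and are not taken from the server in this thread. Output from `ss` on Linux uses different columns and would need a different parser.

```python
# Sketch: find localhost <-> localhost TCP connections in `netstat -an` text.
# Assumption: each connection line looks like
#   TCP    127.0.0.1:49701    127.0.0.1:49702    ESTABLISHED
# The SAMPLE data below is invented for the example.

LOOPBACK_PREFIXES = ("127.", "[::1]")


def is_loopback(endpoint: str) -> bool:
    """True if an 'address:port' endpoint is on the loopback interface."""
    return endpoint.startswith(LOOPBACK_PREFIXES)


def loopback_pairs(netstat_output: str) -> list:
    """Return (local, remote) pairs where both ends are loopback and ESTABLISHED."""
    pairs = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP lines, and listeners.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state == "ESTABLISHED" and is_loopback(local) and is_loopback(remote):
            pairs.append((local, remote))
    return pairs


SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    127.0.0.1:49701        127.0.0.1:49702        ESTABLISHED
  TCP    127.0.0.1:49702        127.0.0.1:49701        ESTABLISHED
  TCP    192.168.1.10:5222      192.168.1.20:51000     ESTABLISHED
  TCP    0.0.0.0:9090           0.0.0.0:0              LISTENING
"""

if __name__ == "__main__":
    for local, remote in loopback_pairs(SAMPLE):
        print(local, "<->", remote)
```

Running the same function on output captured right after a restart and again a few hours later would show whether the count grows over time.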

I am not able to babysit the server for extended intervals, but yes, hundreds of connections appear very soon after restart. I don’t know how soon but it is within a few hours meaning less than 2-3. This is a server that only 3 of my test IM clients hit, nobody else. So, only a handful of test IMs are being exchanged, no outside world.

It is too late to reinstall. I’ve already beaten my head against the wall with 4.7.3 for long enough to run out of allotted time, having not known that it was a bugged release. I only came here, having exhausted every note left by the admin of the old 4.6 install about config items they had to put, so enough time had been wasted as it stands.

Windows or not, my web, application, DB or SMTP/POP3 servers or whatnot only listen on the specified ports and make no rogue connections to themselves, needless to say not in the hundreds. Apparently, 4.6 had not done it, otherwise there would have been a note to that effect in our ITIL.

Another mystery:

A certificate for the domain of this server is missing. Click [here] to generate a self-signed certificate or [here] to import a signed certificate and its private key.

The ‘domain’ of this server is domain.com. Its host name is mail.
The certificate for domain.com is listed right under that message, in the Identity table.
So what is missing? Does the message actually mean that OF expects a certificate for mail.domain.com hostname?

This IS a fresh install on a test lab to be prod soon.

I’m at a loss. Seems like you may be the only one that is having the issue (or has reported it). It could be unique to your environment. Since it's a lab, it might be worth blowing it away and trying again.

An XMPP server uses a few domain names, depending on which services the owner intends to provide to its users:
example.com
proxy.example.com
httpupload.example.com (I am not sure about the spelling of this one)
pubsub.example.com
“conference”.example.com (the quotation marks mean that this can be customized; the default is conference)
search.example.com

I think this should cover them all.
I do agree with Speedy. It sounds like there is a problem with either your installation or your environment.

by the way since you said this:

If your server is exposed to the internet, it is probably compromised, because that version of Openfire is old and had vulnerabilities (fixed in newer versions since they were discovered about two years ago). That would explain some of your server's behavior.
And if this theory is correct, you can't just reinstall your Openfire server. You have to start with a clean, uncompromised machine.

Could you please check your DNS server settings?

Your log shows a DNS SRV lookup issue with your server address domain.com.

It could be that some DNS records for domain.com are wrongly configured in your DNS server.

  • Does the SRV record point to the right hostname (should be mail.domain.com in your case)?
  • Does the A record for mail exist and does it point to the right IP address?
  • Are there any mismatches between old and new server?
  • In Openfire, have you entered the correct Server Host Name (FQDN) (should be mail.domain.com in your case)?
  • Use this sequence of commands for performing a DNS SRV lookup on your Openfire server:

nslookup
set type=srv
_xmpp-client._tcp.domain.com

This command should return an output similar to this:

_xmpp-client._tcp.domain.com service = 0 0 5222 mail.domain.com.

Are you also looking into setting up S2S federation? If so, check the following SRV records …
_xmpp-server._tcp.domain.com
_xmpp-server._tcp.conference.domain.com
_xmpp-server._tcp.pubsub.domain.com
_xmpp-server._tcp.httpupload.domain.com (not 100% sure, XEP-0363 suggests just upload)
_xmpp-server._tcp.search.domain.com
_xmpp-server._tcp.proxy.domain.com

This returns an output similar to this:

_xmpp-server._tcp.domain.com service = 0 0 5269 mail.domain.com.
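To compare lookup results against what is expected, a small Python sketch like the following can parse an answer line in the one-line form quoted above and flag a wrong port or target. The helper names (`parse_srv_line`, `check_srv`) are made up for this example, and other nslookup builds may lay the fields out differently (for instance one field per line), in which case the pattern would need adjusting. The ports 5222 and 5269 and the host mail.domain.com are the values already used in this post.

```python
import re
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    name: str
    priority: int
    weight: int
    port: int
    target: str


# Matches e.g. "_xmpp-client._tcp.domain.com service = 0 0 5222 mail.domain.com."
SRV_LINE = re.compile(
    r"^(\S+)\s+service\s*=\s*(\d+)\s+(\d+)\s+(\d+)\s+(\S+?)\.?$"
)


def parse_srv_line(line: str) -> Optional[SrvRecord]:
    """Parse one SRV answer line; return None if the line has another shape."""
    match = SRV_LINE.match(line.strip())
    if match is None:
        return None
    name, priority, weight, port, target = match.groups()
    return SrvRecord(name, int(priority), int(weight), int(port), target.lower())


def check_srv(line: str, expected_port: int, expected_target: str) -> List[str]:
    """Return a list of problems; an empty list means the record looks right."""
    record = parse_srv_line(line)
    if record is None:
        return ["not an SRV answer line"]
    problems = []
    if record.port != expected_port:
        problems.append(f"port is {record.port}, expected {expected_port}")
    wanted = expected_target.lower().rstrip(".")
    if record.target != wanted:
        problems.append(f"target is {record.target}, expected {wanted}")
    return problems


if __name__ == "__main__":
    client = "_xmpp-client._tcp.domain.com service = 0 0 5222 mail.domain.com."
    server = "_xmpp-server._tcp.domain.com service = 0 0 5269 mail.domain.com."
    print(check_srv(client, 5222, "mail.domain.com"))
    print(check_srv(server, 5269, "mail.domain.com"))
```

This only checks the text that nslookup printed. It does not show which resolver Openfire's Java process uses, so a clean result here narrows the problem down without ruling out a difference on the Openfire side.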

I mentioned several times that it is not exposed. It is also not compromised.

Yes to all. I’ve covered that above.

From this ordeal, it is clear that OF dev team misunderstands the way log levels are supposed to be used. This is how it is usually done:

Critical logs messages from errors that prevent the server serving its clients.
Error logs messages from all errors.
Warning logs additional messages considered minor.
Info logs additional messages considered even less important.
Debug, finally, dumps stack traces on top of Critical and Error messages.

Doing otherwise defeats the purpose of setting log levels because now, with Debug = false in the server settings and Info in log4j2.xml, I still am being flooded with stack traces that I do not want to see.
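To make the convention I am describing concrete, here is a minimal sketch using Python's standard logging module. It is only an illustration of the idea (a one-line error at every level, the stack trace added only when Debug is on). It is not how Openfire or Log4j 2 is implemented or configured, and the helper names are made up.

```python
import io
import logging


def log_failure(logger: logging.Logger, message: str, exc: BaseException) -> None:
    """Log an error as one line; attach the stack trace only at DEBUG level."""
    show_trace = logger.isEnabledFor(logging.DEBUG)
    logger.error(message, exc_info=exc if show_trace else None)


def demo(level: int) -> str:
    """Trigger one failure at the given log level and return what was written."""
    stream = io.StringIO()
    logger = logging.getLogger(f"demo.level{level}")
    logger.setLevel(level)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.handlers = [handler]
    try:
        raise ConnectionError("socket closed")
    except ConnectionError as exc:
        log_failure(logger, "error closing connection", exc)
    return stream.getvalue()


if __name__ == "__main__":
    print("--- INFO ---")
    print(demo(logging.INFO))
    print("--- DEBUG ---")
    print(demo(logging.DEBUG))
```

At INFO the output is the single error line. At DEBUG the same line is followed by the traceback.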

To recap:
Errors from closing connections still should be looked after.
SRV records warning still should be looked after. If nslookup can query them fine, then OF should as well.
Encryption cert/key warning should be looked after.
Seeing how OF complains about items clearly present, I can only think of one reason for those warnings: the actual queries in the code do not match the information shown on the front end.
If I did not paste enough logs here you should let me know and I’ll provide, but I have neither knowledge nor time to investigate all of that on my own. I am not able to go into every code base that I have to admin.

I notice this thread has been noisy, so I will respond to it all in a single post to prevent it from getting any noisier.

I am not able to babysit the server for extended intervals, but yes, hundreds of connections appear very soon after restart. I don’t know how soon but it is within a few hours meaning less than 2-3. This is a server that only 3 of my test IM clients hit, nobody else. So, only a handful of test IMs are being exchanged, no outside world.

Unfortunately this is a lot of what being a sysadmin is… you've got to have the time to sit there, watch what is going on, and monitor the server. If you can’t do this, then maybe you need to consider outsourcing the hosting to someone else who can monitor the server. Servers aren’t just set up and left; they must be maintained.

It is too late to reinstall. I’ve already beaten my head against the wall with 4.7.3 for long enough to run out of allotted time, having not known that it was a bugged release. I only came here, having exhausted every note left by the admin of the old 4.6 install about config items they had to put, so enough time had been wasted as it stands.

I haven’t got experience with the older versions; however, is there any reason you couldn’t give 4.8.1 a shot? 4.8.x includes the Netty rewrite, which heavily modifies how Openfire handles network connections, and seeing as you are having issues with network connections, this might be a good idea. I understand it is time-consuming, but there are not many alternatives.

Windows or not, my web, application, DB or SMTP/POP3 servers or whatnot only listen on the specified ports and make no rogue connections to themselves, needless to say not in the hundreds. Apparently, 4.6 had not done it, otherwise there would have been a note to that effect in our ITIL.

Wait, are these applications on the same server? Could you unplug the network from the server, stop all the other services on it, and see whether it is actually Openfire making these connections to itself, or some other program on the same server?

The ‘domain’ of this server is domain.com. Its host name is mail.
The certificate for domain.com is listed right under that message, in the Identity table.
So what is missing? Does the message actually mean that OF expects a certificate for mail.domain.com hostname?

This warning appears when one or more FQDNs are missing from the certificate. For whatever reason, Openfire requires all the FQDNs to be within a single certificate, which means using a SAN certificate or a wildcard certificate (if you don’t want to mess about, maybe just stick a wildcard certificate in and call it a day).

Which FQDNs you are missing depends on your configuration.
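As a rough illustration of that kind of check, the Python sketch below compares a certificate's SAN entries against a list of names the server would need. The required names here are only an example built from the domain and host name mentioned in this thread plus the default conference service. Which names Openfire actually checks is an assumption I have not verified, and the function names are made up.

```python
from typing import Iterable, List


def san_covers(san: str, fqdn: str) -> bool:
    """True if a SAN entry (exact name or '*.' wildcard) covers the FQDN."""
    san = san.lower().rstrip(".")
    fqdn = fqdn.lower().rstrip(".")
    if san.startswith("*."):
        # A wildcard stands in for exactly one left-most label,
        # so "*.domain.com" covers "mail.domain.com" but not "domain.com".
        head, _, tail = fqdn.partition(".")
        return bool(head) and tail == san[2:]
    return san == fqdn


def missing_names(sans: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required names that no SAN entry covers, in input order."""
    san_list = list(sans)
    return [
        name
        for name in required
        if not any(san_covers(san, name) for san in san_list)
    ]


if __name__ == "__main__":
    # Example names only; adjust to the services enabled on your server.
    required = ["domain.com", "mail.domain.com", "conference.domain.com"]
    print(missing_names(["domain.com"], required))
    print(missing_names(["domain.com", "*.domain.com"], required))
```

With only domain.com in the certificate, the host name and the conference subdomain come back as missing, which would match a warning appearing even though a certificate for domain.com is listed.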

This IS a fresh install on a test lab to be prod soon.

Possible issue with your configuration? By “fresh”, do you mean the very base installation with no further configuration, all default values (apart from the domain name etc.)?

From this ordeal, it is clear that OF dev team misunderstands the way log levels are supposed to be used. This is how it is usually done:
Critical logs messages from errors that prevent the server serving its clients.
Error logs messages from all errors.
Warning logs additional messages considered minor.
Info logs additional messages considered even less important.
Debug, finally, dumps stack traces on top of Critical and Error messages.

Openfire is open source, so if there is an issue you are free to patch it. The codebase is old and the logging decision was made decades ago. Going through the codebase and changing which messages belong at which log level is a lengthy process, and none of the developers are paid to do so. The logging works, even if it is possibly too verbose, and there are bigger fish to fry.

Secondly, nobody here is paid to support you, and they have no obligation to help. Continuing to be aggressive and flinging mud will not yield a solution, and will likely result in your messages simply being ignored.

To recap:
Errors from closing connections still should be looked after.
SRV records warning still should be looked after. If nslookup can query them fine, then OF should as well.
Encryption cert/key warning should be looked after.
Seeing how OF complains about items clearly present, I can only think of one reason for those warnings: the actual queries in the code do not match the information shown on the front end.
If I did not paste enough logs here you should let me know and I’ll provide, but I have neither knowledge nor time to investigate all of that on my own. I am not able to go into every code base that I have to admin.

A recap is not required. People are more than able to scroll through the thread history and see what has been attempted, and what the results of each attempt were.

Also, please stop making the logging your enemy. The logging isn’t what is causing your issues, and you are making a huge deal out of a small problem.

Please consider some of the suggestions above.

They suggest things that are already covered in this long thread, so the recap is necessary.

It is one of them.

They’ve been considered prior to my posting here.
I’ve set up SRV records by copying and pasting the items required by the OF admin UI, yet it is still not satisfied. I only do what it requires of me and report back what it shows me, whereas the suggestions above are for new, different items.

I never have. I only want to control it as the OF specifications offer, and having worked with software that logs for three long decades, I know when one logs enough or more than enough. OF logs profusely and way more than is reasonable, and that cannot be controlled. It is totally up to the dev team whether they accept this as valid feedback or push back and oppose it. I’ve done my part and raised this issue. Their call.

The firewall rule is not addressed/looked after. The solution was found elsewhere.
The certificate message is not addressed/looked after.
The error closing connections is not addressed/looked after.
If no one is interested or willing to help then just say so, and I will stop wasting my and your time. We are all people, not robots, and I understand ‘yes’ and ‘no’.

Many have tried to help and are willing to. Yes, the logs can be a little busy. We all agree on that. But that's not a blocker by any means.
Again, given that no one has been able to reproduce your issue, it suggests something unique to your setup.

This thread has gotten off topic, and is not currently productive. I’m going to close it. Please feel free to start a new one if you feel like it's warranted.
