Invalid XML characters in incoming XML stream

Guys:

I’m using the Smack API with a Java client to subscribe to and receive XMPP data from an XMPP service (a pubsub node).

I’m using:

import org.jivesoftware.smack.*;

import org.jivesoftware.smackx.pubsub.*;

It works fine for about 90% of the messages, but some of the incoming XML data is not well-formed: sometimes it includes characters that are invalid in XML when left unescaped, such as a bare “&”. The service provider (sending the data) claims that the messages run through an XML validator before getting to the stream, and that the service would choke on invalid XML data before I receive it.

Now I’m trying to figure out the cause of the issue: is it caused by the service provider, does it happen somewhere in the pipeline before I receive the message, or is it caused by the Smack API when receiving the message?
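To narrow this down, I could capture the payload at different points and run each copy through a plain JDK parser. Below is a minimal sketch of such a check. It uses only standard JAXP classes; `WellFormedCheck` and `firstError` are names I made up for illustration, and the payload string is assumed to be whatever text I can get hold of at a given point (for example the item payload as Smack hands it to me). If the same message parses cleanly on the wire but not after Smack has handed it over, that would point at the client side rather than the provider.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /**
     * Returns null if the given text parses as well-formed XML,
     * otherwise a short description of the first parse error.
     * (Hypothetical helper, not part of Smack.)
     */
    public static String firstError(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Made-up sample payloads, not real data from the service.
        System.out.println(firstError("<entry><title>Tom &amp; Jerry</title></entry>"));
        System.out.println(firstError("<entry><title>Tom & Jerry</title></entry>"));
    }
}
```

The first sample should report no error and the second should report a parse error at the bare “&”. The exact error text depends on the parser implementation.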

Are you guys aware of any similar issues with Smack?

The other option is to fix the incoming XML myself. I can try to use regular expressions to replace these unescaped characters, but I’m sure I would end up replacing some characters that should not be touched and missing others, so it is not a solid solution.
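To make the regex idea concrete, here is a rough sketch that only handles the “&” case. It escapes an ampersand unless it already starts something that looks like an entity or character reference. `AmpersandEscaper` and `escapeBareAmpersands` are made-up names, and the sample string is invented. It shows the weakness I’m worried about: it will also rewrite an “&” inside a CDATA section or comment, where a bare ampersand is legal, and it does nothing for other problem characters such as a stray “<”.

```java
import java.util.regex.Pattern;

public class AmpersandEscaper {

    // "&" NOT followed by a named (&amp;), decimal (&#38;) or hex (&#x26;) reference.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    /** Replaces every bare "&" with "&amp;" and leaves existing references alone. */
    public static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String raw = "<title>Tom & Jerry &amp; friends &#38; co</title>";
        // prints: <title>Tom &amp; Jerry &amp; friends &#38; co</title>
        System.out.println(escapeBareAmpersands(raw));
    }
}
```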

Let me know if you have any suggestions…

Thanks!