Bug in IRC -> XMPP encoding/translation

There appears to be a small bug in how the IRC transport translates text from IRC into XMPP. I’m not sure exactly what’s happening, but it seems the transport takes bold text from IRC and, when converting it, adds � to the beginning and the end of the bolded text. As far as I can tell, this isn’t a valid XML entity. XML has only 5 predefined entities, and any others are supposed to be declared in the DTD. If it’s meant as HTML, HTML 4 defines only 252 entities and you can’t define new ones.

I found this while using mcabber, whose XML parser trips over this entity. Other clients don’t seem to have the same problem, probably because they just ignore it. I’d assumed an invalid entity would be less serious than an invalid element, but strictly speaking a reference to an undeclared entity is a well-formedness error in XML, so a strict parser is entitled to reject it.

That said, if � is a valid entity, does anyone know what it’s supposed to be? The excerpt from my XML logs looks something like this:

0IN:

This happened after I registered the nick I use via the IRC transport on Freenode. You get this message at login when you’re using a registered nick but haven’t identified yet. I wouldn’t have noticed it if mcabber hadn’t gotten upset about it. So it’s likely minor, but it’s a bug nonetheless.

I found a reference to this:

http://www.oreilly.com/pub/h/1953

It does indeed mean “bold” in IRC land.
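For what it's worth, here is a rough Python sketch of the kind of cleanup a gateway could do before putting IRC text into a stanza. It is not the transport's actual code. The function names are made up for illustration, and I'm assuming the stray sequence in the log was a character reference to U+0002 (IRC bold), something like `&#2;`. XML 1.0 does not allow that character at all, even as a character reference, which would explain why a strict parser complains.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# IRC colour code: 0x03 optionally followed by "fg" or "fg,bg" digits.
IRC_COLOR = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")
# Toggle codes: bold (0x02), reset (0x0F), reverse (0x16),
# italic (0x1D), underline (0x1F).
IRC_TOGGLES = re.compile(r"[\x02\x0f\x16\x1d\x1f]")
# Any other C0 control characters that XML 1.0 forbids
# (everything below 0x20 except tab, LF and CR).
XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def irc_to_xml_text(line: str) -> str:
    """Hypothetical helper: drop IRC formatting, then escape for XML."""
    line = IRC_COLOR.sub("", line)
    line = IRC_TOGGLES.sub("", line)
    line = XML_ILLEGAL.sub("", line)
    return escape(line)  # escapes &, < and >


def is_well_formed(xml: str) -> bool:
    """Return True if a strict parser (expat) accepts the document."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    raw = "\x02NickServ\x02: this nickname is registered"
    print(irc_to_xml_text(raw))
    # The assumed broken form versus the cleaned-up form.
    print(is_well_formed("<body>&#2;bold&#2;</body>"))
    print(is_well_formed("<body>" + irc_to_xml_text("\x02bold\x02") + "</body>"))
```

Stripping the codes loses the bold, of course. A gateway that wanted to keep it would have to map it to proper markup instead, but just dropping the control characters is enough to keep strict clients happy.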

GATE-342

Cool beans. Now we know what it is.