Josh (the blog)

@joahua

WordPress comments and numeric entity codes

I received an email from Matt Thommes (matthom) today about comments he’d left on a post a few days back that never showed up. I thought that was odd: I don’t think I’ve ever moderated one of his comments (they’re always relevant and on topic, or at least said in good fun and good taste), and people who have posted here previously should be automatically authenticated and allowed to post.

His problem arose, he says, when trying to include certain character entities, or when posting twice on the same topic. I think I’ve disproved the posting-twice theory, but the entity concern is valid when using numeric entity codes. (Those are the ones that take the form &#xxxx; where xxxx is a number.)
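To make the form concrete, here is a minimal Python sketch that finds numeric entity codes in a piece of text and shows what they decode to. The helper name `numeric_entity_code_points` is made up for illustration, and the pattern also accepts the hexadecimal variant (&#x3C;), which the post doesn't mention but which HTML allows. It has nothing to do with WordPress internals.

```python
import html
import re

# Matches decimal (&#60;) and hexadecimal (&#x3C;) numeric character references.
NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def numeric_entity_code_points(text):
    """Return the code point of every numeric entity found in text."""
    code_points = []
    for hex_digits, dec_digits in NUMERIC_ENTITY.findall(text):
        # Exactly one of the two groups is non-empty for each match.
        code_points.append(int(hex_digits, 16) if hex_digits else int(dec_digits))
    return code_points


sample = "Use &#60;em&#62; for emphasis"
print(numeric_entity_code_points(sample))  # [60, 62]
print(html.unescape(sample))               # Use <em> for emphasis
```

Named entities such as &lt; are deliberately not matched, since they aren't numeric codes.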

There’s not much documentation on this, but it appears to be a built-in WordPress anti-spam measure, there to stop spammers from encoding their entire message this way and thus avoiding detection. (An encoded message would have negligible SEO benefit, but carefully crafted messages can entice users, and spam is so easy to distribute that even a small chance of someone clicking through makes it worthwhile for them.)

In fact, the only near-official word I could find on the matter was this comment on Matt Mullenweg’s (WP lead developer) weblog, in which he states:

I do block comments with numeric entities lower than a certain number.

Whether or not this holds true for the WordPress platform as a whole, I can’t say conclusively, though that does seem to be the symptom here.
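As a rough illustration of what a rule like that might look like, here is a Python sketch that flags any comment containing a numeric entity below some cutoff. To be clear, this is a guess at the behaviour described in the quote, not WordPress code: the function name `should_block` is invented, and the threshold of 128 is purely an assumption, since the quote doesn't say what the number is.

```python
import re

# Matches decimal (&#60;) and hexadecimal (&#x3C;) numeric character references.
NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

# Hypothetical cutoff: the quoted comment does not say what the number is.
ASSUMED_THRESHOLD = 128


def should_block(comment, threshold=ASSUMED_THRESHOLD):
    """Return True if any numeric entity in the comment is below threshold."""
    for hex_digits, dec_digits in NUMERIC_ENTITY.findall(comment):
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if code_point < threshold:
            return True
    return False


print(should_block("Wrap it in &#60;code&#62;"))  # True
print(should_block("Wrap it in &lt;code&gt;"))    # False
print(should_block("Caf&#233;"))                  # False
```

Under this assumed cutoff, a comment that spells out ordinary ASCII characters as numeric entities gets caught, while named entities and higher code points (like the é in the last example) pass through.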

In Matthom’s case, the concern was marking up an HTML tag for display in a comment, using the entity codes &#60; (<) and &#62; (>).

I usually use &lt; and &gt; for this purpose, so I hadn’t noticed the problem until now. That works fine, but I think WordPress deleting the comment outright is a little extreme… Not entirely sure what the problem is here.
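If you’d rather not type the named entities by hand, they’re easy to generate. A small Python sketch using the standard library’s `html.escape` follows; the wrapper name `escape_for_comment` is just for illustration.

```python
import html


def escape_for_comment(snippet):
    """Replace &, < and > with named entities so markup shows up as text."""
    # quote=False leaves " and ' alone; only &, < and > are converted.
    return html.escape(snippet, quote=False)


print(escape_for_comment("<blockquote>"))  # &lt;blockquote&gt;
```

Pasting the output into a comment box should display the tag as text, without using any numeric codes.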