Who knows prefixed XHTML from a hole in the ground?

In a post called “Who knows an XML document from a hole in the ground?”, Aristotle Pagaltzis wrote about problems with the namespace handling in aggregators that claim to support Atom 1.0. He concludes that NetNewsWire is a “working aggregator” and, in a later follow-up, that NetNewsWire is “conformant”. Unfortunately, it is not that simple.

The XHTML elements appear unprefixed in the test cases that were used to draw the conclusion. Therefore, the test cases do not reveal whether an aggregator is able to properly handle XHTML elements with a namespace prefix. After all, both reserializing the content in a namespace-unaware way and grabbing the raw markup from underneath the XML parser would appear to work when the unprefixed markup is pasted into a text/html template.
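To illustrate why unprefixed test cases prove little, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The two sample fragments and the helper name `expanded_names` are made up for this illustration; they are not taken from Aristotle's test cases. To a namespace-aware parser, the prefixed and unprefixed forms are the same elements, so only the prefixed form can expose a consumer that looks at raw element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample fragments: the same content, once with a default
# namespace declaration and once with the XHTML namespace bound to "h".
UNPREFIXED = '<div xmlns="http://www.w3.org/1999/xhtml"><p>Hello</p></div>'
PREFIXED = '<h:div xmlns:h="http://www.w3.org/1999/xhtml"><h:p>Hello</h:p></h:div>'


def expanded_names(xml_text):
    """Return the expanded names ({namespace}local) in document order."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter()]


unprefixed_names = expanded_names(UNPREFIXED)
prefixed_names = expanded_names(PREFIXED)

# A namespace-aware consumer sees identical names for both fragments.
print(unprefixed_names == prefixed_names)
```

Pasting the raw text of the first fragment into a text/html template happens to work, because `<p>` is also a valid HTML tag name. Pasting the raw text of the second fragment leaves `<h:p>` in the output, which a text/html parser does not treat as a paragraph.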

Earlier this week I upgraded fiMUG’s Atom feed to Atom 1.0. I took the minimal steps that were necessary in order to make the code produce Atom 1.0 instead of 0.3. The feed binds the XHTML namespace to a prefix. All the XHTML elements have colonified names.

I loaded the new feed into NetNewsWire Lite 2.0.1. As previously with prefixed element names in Atom 0.3, NetNewsWire Lite showed the text, but the links did not show up as links and the paragraph breaks did not appear.

So what’s going on? It appears that NetNewsWire sticks the XHTML content into a text/html template without removing the namespace prefixes. There are two problems here:

  1. The content is not reserialized in a namespace-aware way.
  2. The XHTML content is rendered through WebKit using WebKit’s text/html code path instead of using the XML code path.

The first is an obvious flaw. The second is short-sighted, considering that WebKit is expected to support embedded SVG on the XML side in the future and that Web Applications 1.0 will allow XHTML constructs that will be incompatible with text/html parsers.
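For the first problem, namespace-aware reserialization could look roughly like the sketch below. It is only an illustration, not how NetNewsWire or any other aggregator works. The function name `to_html`, the sample markup, the `http://example.org/fooml` namespace URI, and the shortened list of void elements are all assumptions. So is the policy of dropping the tags of non-XHTML elements while keeping their text.

```python
import xml.etree.ElementTree as ET
from html import escape

XHTML_PREFIX = "{http://www.w3.org/1999/xhtml}"
# Elements that take no end tag in text/html (shortened list, an assumption).
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


def to_html(el):
    """Serialize an element tree as unprefixed markup for a text/html template."""
    parts = []
    is_xhtml = el.tag.startswith(XHTML_PREFIX)
    name = el.tag[len(XHTML_PREFIX):] if is_xhtml else None
    if is_xhtml:
        # Keep only attributes that are in no namespace.
        attrs = "".join(
            ' %s="%s"' % (key, escape(value, quote=True))
            for key, value in el.attrib.items()
            if not key.startswith("{")
        )
        parts.append("<%s%s>" % (name, attrs))
    parts.append(escape(el.text or "", quote=False))
    for child in el:
        parts.append(to_html(child))
        parts.append(escape(child.tail or "", quote=False))
    if is_xhtml and name not in VOID:
        parts.append("</%s>" % name)
    return "".join(parts)


# Hypothetical content: prefixed XHTML plus one element from a made-up
# FooML namespace that must not be treated as XHTML.
SAMPLE = (
    '<h:div xmlns:h="http://www.w3.org/1999/xhtml" xmlns:f="http://example.org/fooml">'
    '<h:p>See <h:a href="http://example.org/?a=1&amp;b=2">this</h:a>.</h:p>'
    '<f:p>not XHTML</f:p>'
    '</h:div>'
)

html_fragment = to_html(ET.fromstring(SAMPLE))
print(html_fragment)
```

The decision is made on the namespace URI of each element, never on the prefix or the local name alone, so the FooML `p` does not become an HTML paragraph even though its local name matches.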

Update: There is a modified version of Aristotle’s test case with the XHTML namespace bound to a prefix. Here’s another test case that also has some FooML markup which should not be treated as XHTML.