So far, I haven’t had an Atom feed for this site. My excuse has been that I would add an Atom feed only once the spec reached RFC status. With the publication of RFC 4287, I (like Anne van Kesteren and Arve Bersvendsen) have reached the milestone of having my name in an RFC. So now I had to get coding. Herewith some notes.
My site is not database-driven. I have a cron job that runs a Jython script that generates my RSS and (now) Atom feeds by scraping the site.
Like last time, I avoided cargo cult programming and based my implementation on the spec. I did not view the source of someone else’s feed.
Adding Atom support was not hard as such. I spent the most time wrangling the CLASSPATH (as usual) and tracking down a couple of bugs in my Java code.
java.lang.NoClassDefFoundError is the single most unpleasant thing about Java.
One of the bugs involved using
!= where I should have used
The other bug was expecting non-existent attributes to appear as
I got carried away with implementing proper transfer of language metadata from the site into the feed. Creeping elegance.
The feed contains the full content of the entries published within the last 48 hours at the time the feed is generated and at least the latest entry. However, the feed is regenerated only when the modification date of at least one entry changes.
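The selection rule above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Jython script: the entry dictionaries, their `id`, `published` and `modified` keys, and the function names `select_entries` and `needs_regeneration` are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def select_entries(entries, now, window=timedelta(hours=48)):
    """Return entries published within `window` of `now`, newest first.

    If nothing falls inside the window, fall back to the single latest
    entry so that the feed is never empty.
    """
    ordered = sorted(entries, key=lambda e: e["published"], reverse=True)
    recent = [e for e in ordered if now - e["published"] <= window]
    if not recent and ordered:
        recent = ordered[:1]
    return recent


def needs_regeneration(entries, last_seen_modified):
    """Return True when any entry's modification date differs from the
    snapshot recorded at the previous run (hypothetical id -> date map)."""
    current = {e["id"]: e["modified"] for e in entries}
    return current != last_seen_modified
```

A cron-driven generator would call `needs_regeneration` first and only rebuild the feed from `select_entries` when it returns true.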
TagSoup rc1 does not map
xml:lang. And it appears that there is now rc3.
Sometimes the RELAX NG productions in the spec are easier to read than the prose.
All my text constructs use XHTML. TagSoup—not the infamous Appendix C—future-proofs my content.
I still don’t like the namespace div, and I feel guilty, because my remarks caused the issue to be revisited, which led to the namespace div requirement.
The Feed Validator warns about my title Tag Soup: How Mac IE 5 and Safari handle <x> <y> </x> </y>. One of the major reasons why Atom exists was to make titles like that unambiguous. (Although spec-wise, the title is unambiguous in RSS 0.91 as well.)
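To illustrate why such a title is unambiguous in Atom, here is a small Python sketch (not the code that generates this feed) that writes the title as a plain-text construct with `type="text"` and reads it back. The angle brackets are escaped on the way out and recovered as literal characters on the way in, so a consumer has no reason to treat them as markup. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialize Atom elements in the default namespace instead of ns0:.
ET.register_namespace("", ATOM_NS)

title_text = "Tag Soup: How Mac IE 5 and Safari handle <x> <y> </x> </y>"

# type="text" says the content is literal text, not markup.
title = ET.Element("{%s}title" % ATOM_NS, {"type": "text"})
title.text = title_text

# The serializer escapes < and > as &lt; and &gt;.
serialized = ET.tostring(title, encoding="unicode")

# A conforming XML parser hands back the original characters.
round_tripped = ET.fromstring(serialized).text
```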
I have a Pilgrimesque future feature idea of adding namespace prefixes to the XHTML elements just to be able to reveal aggregators that do not process namespaces properly.
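The idea works because a namespace-aware consumer compares expanded names, not prefixes. The following Python sketch shows two spellings of the same XHTML fragment; the `h` prefix and the helper `expanded_names` are arbitrary choices for the illustration. An aggregator that matches on the literal string `div` would miss the prefixed version.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Same content, once with a default namespace and once with a prefix.
default_form = '<div xmlns="%s"><p>Hello</p></div>' % XHTML_NS
prefixed_form = '<h:div xmlns:h="%s"><h:p>Hello</h:p></h:div>' % XHTML_NS


def expanded_names(xml_text):
    """Return the {namespace}local names of all elements in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


# A namespace-aware parser reports identical names for both forms.
same = expanded_names(default_form) == expanded_names(prefixed_form)
```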
The feed is guaranteed to be well-formed.
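For readers who want to check this kind of claim for a feed of their own, a well-formedness test can be as simple as running the document through any XML parser. The Python sketch below is only an illustration of such a check, not a description of how this site guarantees it; `is_well_formed` is a name invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

Generating the feed through an XML serializer rather than by string concatenation is the usual way to make a check like this pass by construction.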
Here is the feed.