An Unofficial Q&A about the Discontinuation of the XHTML2 WG
So the W3C finally announced
that the XHTML2 WG will be taken off life support at the end of 2009.
I’m annoyed that Zeldman used the F-laden TLA “WTF” instead of “AFT”
in the title of his post about the announcement. Moreover, many of the
comments on Zeldman’s
post indicate that there are people who are badly misinformed about
the matters surrounding this announcement. To help remedy that,
here’s some quick Q&A for getting informed.
- First off, is there anything to disclose?
-
I’ve been working on HTML5 (the main competitor to XHTML2) for
a long time. I get paid for it. I wrote this Q&A on my own time
and it wasn’t vetted by anyone prior to publication. (And yes,
this question is copied from Steve
O’Grady.)
-
What was announced?
-
The W3C management has decided
to allow the charter of the XHTML2 Working Group to expire at the
end of 2009 and not to renew it.
-
What’s XHTML?
-
There are two meanings to XHTML: technical and marketing. The technical
kind (XHTML served using the application/xhtml+xml MIME type) is a
formulation of HTML as an XML vocabulary. The marketing kind (XHTML
served using the text/html MIME type) is processed just like HTML by
browsers, but its authors attempt to observe slightly different syntax
rules in order to make it seem that they are doing something newer and
shinier compared to HTML.
(Note added 2009-07-07: I apologize to authors who are using XHTML and got offended. The “in order to” part of the previous sentence was meant as a jibe at gurus who used XHTML as a marketing platform but gave pseudo-technical reasons, e.g. about parsing and mobile clients, and not at people who listened to them.)
-
What’s XHTML2?
-
XHTML2 was a new language similar to XHTML but incompatible with it.
It wasn’t an XML formulation of any HTML spec.
-
Was XHTML2 being implemented?
-
Not in any of the top 5 browsers. There was an experimental
implementation of an old draft years ago in a research browser.
-
What was the XHTML2 WG working on?
-
The XHTML2 WG was working on several things:
-
The XHTML2 language
-
Editorial revisions to previously published XHTML 1.0, XHTML 1.1,
XHTML Basic, XHTML Print and XHTML Modularization specifications
-
XSD schemas for XHTML 1.1, XHTML Basic and XHTML Print
-
RDFa
-
A specification for an attribute called role
-
XML Events (though it seems the WG hasn’t actively worked on it
recently)
-
XFrames (though it seems the WG hasn’t actively worked on it
recently)
- Did the W3C kill XHTML2?
-
No, XHTML2 was already dead for all practical purposes due to its
failure to be backwards compatible and its failure to deliver
compelling new features. The W3C just announced they will take it
off life support.
-
What happens to XHTML 1.x specs?
-
If the XHTML2 WG completes its editorial revisions by the end of the
year, it may publish them as new Editions of the previous
Recommendations.
-
What happens to RDFa?
-
RDFa (in XHTML but not in HTML!) is a W3C Recommendation and, as such,
doesn’t need any terminating action per the W3C Process. It’s unclear
whether another WG will develop further RDFa specs.
-
What happens to the role attribute?
-
Most likely an ARIA-only, CURIEless incarnation of the role attribute
will find its way into HTML 5 as the ARIA specs mature.
-
What happens to the other XHTML2 WG deliverables?
-
According to the FAQ published by the W3C, the XML Events spec will
likely end up in the Forms Working Group (the group working on XForms),
the Access module will likely end up in the HTML WG and the remaining
deliverables will end up as Working Group Notes. Personally, I doubt
that the Access module will be supported by consensus at the HTML WG.
-
What’s the HTML WG?
-
The HTML WG is another W3C
working group that is working on HTML 5 together with the WHATWG.
-
What’s the WHATWG?
-
The WHATWG is a group that individuals from Apple, Mozilla and Opera
founded outside the W3C in order to evolve HTML when the W3C told
them that work wasn’t welcome within the W3C. Later, the W3C changed
its mind, renamed the previous HTML WG to the XHTML2 WG and formed a
new HTML WG.
-
If the remaining deliverables of the XHTML2 WG are going to be
published as Notes, does it mean the W3C endorses them after all?
-
No. The W3C
Process doesn’t allow a document to be simply abandoned once
it has been published as a First Public Working Draft. The documents
have to end up as either Recommendations or Notes. Groups that are
still within charter can stall their abandoned deliverables
indefinitely, but the upcoming expiration of the XHTML2 WG charter
will force adherence to the W3C Process on this point.
-
Is the W3C dropping work on XHTML?
-
No. The HTML WG is defining XHTML5, which is an XML serialization for
HTML5.
-
I’ve published Web pages using XHTML 1.0 or XHTML 1.1. Do I need
to rewrite them now?
-
No. They will continue to function as before.
-
What’s the upgrade path from XHTML 1.x?
-
For the technical kind of XHTML 1.x (that is, XHTML served as
application/xhtml+xml), the upgrade path is to XHTML5. For the
marketing kind of XHTML 1.x (that is, XHTML served as text/html), the
upgrade path is to HTML5. Moreover, “HTML5” replaces “XHTML” (and
“Ajax”!) as the coolest marketing buzzword.
-
What’s HTML5?
-
HTML5 is a new level of the Web’s most significant markup
language. New features provide better support for Web applications,
for video and audio and for expressing document structure. This
language is defined in a specification called HTML 5. “HTML5” is also
used as a marketing buzzword for all the new cool features in the
browser platform, even for features that have never been in the HTML 5
spec or that have been spun off from it.
-
Video? Wasn’t video removed from HTML5 recently?
-
No. That’s a bogus rumor. (What was removed was some placeholder
text about codecs.)
-
Is HTML5 being implemented?
-
Yes. Firefox, Opera, Safari, Chrome and IE implement bits and pieces
of HTML5—even more so in nightly builds than in releases. The
future is already here. It just isn’t evenly
distributed yet.
-
If I upgrade from XHTML-served-as-text/html to HTML5, do I need to
revise all my empty tags?
-
No. HTML5 permits both the XHTML-style syntax (<br/>) and the HTML
4-style syntax (<br>) for void elements (elements that never take any
content).
-
Is XHTML5 more semantic than HTML5?
-
No.
-
Can I serve XHTML5 as text/html?
-
You can’t. HTML5 and XHTML5 are defined in terms of MIME type, so
text/html isn’t XHTML5 by definition.
-
Can a document be both HTML5 and XHTML5 if I serve it as text/html to
IE and as application/xhtml+xml to other browsers?
-
It is possible to construct documents that are valid HTML5 when
labeled as text/html and valid XHTML5 when labeled as
application/xhtml+xml. Doing so is much harder than it first appears
and is most often useless, so you’d probably spend your time better by
not trying.
-
Can HTML5 be validated?
-
Yes. With an HTML5 validator.
-
Which one should I use: HTML5 or XHTML5?
-
In most cases, the answer is HTML5. XHTML5 doesn’t work in IE.
(Just like technical XHTML has never worked in IE. Only the
marketing kind of XHTML has worked in IE.)
-
What if I want to include SVG or MathML inline?
-
You will be able to use SVG and MathML inline in text/html once
browsers upgrade their parsers. You can test this today by downloading
a nightly build of Firefox, going to about:config and flipping the
preference html5.enable to true. (Demo page.) However, for the time
being, to use inline SVG or MathML with released versions of Firefox,
Opera, Safari or Chrome you need to use application/xhtml+xml instead.
-
What’s the doctype for HTML5 documents?
-
Simply:
<!DOCTYPE html>
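As a rough illustration, the Python sketch below checks whether the first declaration in a text/html document is that doctype. It relies on html.parser from the Python standard library, which is not a conforming HTML5 parser, and the names DoctypeSniffer and has_html5_doctype are made up for this example.

```python
from html.parser import HTMLParser


class DoctypeSniffer(HTMLParser):
    """Records the first declaration (such as a doctype) seen in the input."""

    def __init__(self):
        super().__init__()
        self.first_decl = None

    def handle_decl(self, decl):
        # Called with the text between "<!" and ">", e.g. "DOCTYPE html".
        if self.first_decl is None:
            self.first_decl = decl


def has_html5_doctype(markup):
    """Return True if the first declaration is the plain HTML5 doctype."""
    sniffer = DoctypeSniffer()
    sniffer.feed(markup)
    sniffer.close()
    if sniffer.first_decl is None:
        return False
    # The doctype is matched case-insensitively.
    return sniffer.first_decl.lower().split() == ["doctype", "html"]
```

Calling has_html5_doctype on a page that still carries an XHTML 1.0 doctype returns False, which may be a handy first check when migrating old pages.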
-
What’s the doctype for XHTML5 documents?
-
application/xhtml+xml documents don’t need a doctype. XHTML5 can use
any doctype (or none), because any other requirement would reach into
the XML layer and violate the clean layering of XHTML5 and XML. For
simplicity, I suggest you use no doctype for XHTML5. (Yes, the XHTML
1.0 specification violates clean layering.)
-
If I can use any doctype for XHTML5, how can browsers tell XHTML 1.0
and XHTML5 apart?
-
They can’t and they don’t need to. By design, a user agent that
implements XHTML5 will process inputs authored as XHTML 1.0
appropriately.
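To illustrate this at the XML layer only: the Python sketch below parses the same markup once with an XHTML 1.0 doctype and once with no doctype, using xml.etree.ElementTree from the standard library, and gets the same element names either way. This is a toy demonstration and not a browser; the helper element_names and the sample documents are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

BODY = (
    '<html xmlns="%s"><head><title>Hi</title></head>'
    "<body><p>Hello</p></body></html>" % XHTML_NS
)

# The same document, once with an XHTML 1.0 doctype and once with none.
WITH_XHTML1_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">' + BODY
)
WITHOUT_DOCTYPE = BODY


def element_names(markup):
    """Parse as XML and list every element name in document order."""
    root = ET.fromstring(markup)
    return [element.tag for element in root.iter()]
```

Comparing element_names(WITH_XHTML1_DOCTYPE) with element_names(WITHOUT_DOCTYPE) shows that, for this sample, the doctype makes no difference to the tree the XML parser produces.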
-
I’m using XML tools to consume content. What do I do with HTML5?
-
When your application receives content labeled as
application/xhtml+xml, instantiate an XML parser. When your application
receives content labeled as text/html, instantiate an HTML5 parser.
There are now off-the-shelf HTML5 parsers (such as the Validator.nu
HTML Parser) that expose an XML API so your application sees an
infoset that looks just like the infoset from an XML parser parsing
the equivalent XHTML5 document.
-
I’m using XML tools to generate content. What should I do?
-
If you don’t care about IE, you can use an XML serializer and
serialize to XHTML5 (application/xhtml+xml). However, if you do care
about IE, you can use an HTML5 serializer and serialize from an XML
pipeline to text/html. In this case, you must avoid constructs that
aren’t supported in text/html (e.g. div as a child of p).
-
But XSLT and XPath don’t work with HTML!
-
Incorrect. As mentioned above, HTML5 parsers expose an infoset
equivalent to an XML parser parsing XHTML5. The Validator.nu
HTML Parser comes with a sample application for using the JDK XSLT engine with HTML5 inputs.
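Choosing a parser by MIME type and then running one path query over the result could be sketched in Python as follows. Only the XML branch uses the standard library. The text/html branch takes a caller-supplied html5_parse callable, a hypothetical stand-in for a real HTML5 parser that returns an ElementTree-style tree in the XHTML namespace (Python’s own html.parser is not such a parser). ElementTree supports only a limited XPath subset, so treat this as a sketch of the idea and not of a full XSLT/XPath pipeline.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parse_by_media_type(body, content_type, html5_parse=None):
    """Pick a parser from the Content-Type label, not from the content."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return ET.fromstring(body)
    if media_type == "text/html":
        if html5_parse is None:
            raise NotImplementedError("supply an HTML5 parser for text/html")
        # Assumption: html5_parse returns a tree in the XHTML namespace.
        return html5_parse(body)
    raise ValueError("unsupported media type: " + media_type)


def paragraph_texts(root):
    """Run the same namespace-aware path query on either kind of tree."""
    return [p.text for p in root.findall(".//{%s}p" % XHTML_NS)]
```

Because both branches are expected to yield elements in the same namespace, paragraph_texts does not need to know which parser produced the tree.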
-
Do semantics round-trip in an HTML5 to XHTML5 to HTML5 conversion?
-
Yes, provided that the first HTML5 input is valid and you don’t
ascribe semantics to characters that aren’t allowed in XML (such
as form feed or U+FFFF). Note that RDFa isn’t valid in either
HTML5 or XHTML5 as currently drafted.
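The character restriction in this answer can be checked mechanically. The Python sketch below follows the Char production of XML 1.0; the function names are invented for the example.

```python
def is_xml_char(code_point):
    """True if the code point matches the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def non_xml_chars(text):
    """Return the characters in text that cannot appear in an XML document."""
    return [ch for ch in text if not is_xml_char(ord(ch))]
```

Running non_xml_chars over the text content of an HTML5 document before conversion would flag a form feed or U+FFFF, the two examples named in this answer.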
-
What about XHTML5 to HTML5 to XHTML5?
-
Not if namespace-based extensibility is used. However, in the common case, the conversion chain does round-trip if the input is valid XHTML5 + SVG 1.1 + MathML 2.0 (this excludes RDFa), doesn’t use namespaces from outside those specs (it’s debatable whether the previous condition already covers this), xml:space on HTML elements is not considered to affect semantics, and relative URLs are rewritten so that xml:base attributes can be removed without breaking links. (Answer clarified/corrected 2009-07-07.)
-
What’s the namespace for HTML5?
-
HTML elements are in the http://www.w3.org/1999/xhtml namespace. You
don’t need to declare this namespace in text/html. An HTML5 parser puts
HTML elements in the namespace automatically.
-
So does HTML5 support namespaces?
-
There’s no syntax for declaring namespaces in text/html. Syntax that
looks like a namespace declaration has no effect. However, the HTML5
parsing algorithm automatically assigns stuff to namespaces
appropriately.
-
Does XHTML5 support namespaces?
-
Yes. XHTML5 is layered on top of XML plus Namespaces.
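A small Python sketch of what that layering looks like to an XML tool: the sample document below mixes XHTML and inline SVG, and a namespace-aware parser reports each element in its own namespace. The sample markup and the helper namespaces_used are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>A circle:</p>"
    '<svg xmlns="http://www.w3.org/2000/svg"><circle r="10"/></svg>'
    "</body></html>"
)


def namespaces_used(markup):
    """Return the set of namespace URIs of all elements in the document."""
    found = set()
    for element in ET.fromstring(markup).iter():
        # ElementTree spells a namespaced name as "{uri}local".
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found
```

For DOC, namespaces_used returns the XHTML and SVG namespace URIs, because the xmlns declarations are honored by the XML parser.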
-
Are the semantics of HTML5 extensible?
-
Yes. With microdata.
-
Is it true that HTML5 has fewer accessibility features than XHTML 1.x?
-
No. HTML5 has a larger number of accessibility features, but it
isn’t obvious that they are accessibility features, because they
haven’t been designed solely for accessibility but provide
opportunities for enhanced accessibility as a side effect of
something else.
-
Will Zeldman now just do s/XHTML/HTML5/ in all his books and
republish?
-
No. He says that conjecture is “Wrong as prohibition”.