about:blank
about:blank
is probably the hardest Web page to load. In fact, it is so hard that
in order to turn the HTML5 parser on by default in Firefox last year,
we decided to special-case about:blank
to use the old parser in Firefox 4.
In Firefox, about:blank
is sometimes
parsed from a stream and sometimes its DOM is generated without
running a parser. The problem that prevented us from using the HTML5
parser for about:blank
is that a bunch
of test cases assume that when about:blank
is parsed from a stream, the whole operation happens as a single
event loop task. These tests aren’t really testing about:blank
behavior, but since the test cases are Gecko-specific they inevitably
have accidental dependencies on delicate Geckoisms that real Web
pages wouldn’t depend on.
The HTML5 parser parses streams off the main thread, so getting
anything parsed involves at least two event loop tasks on the main
thread and a spin in between. First a task for setup and later
another task for handling the data that the parser thread handed back
to the main thread. This is OK for parsing streams from the network,
because data from the network takes multiple event loop spins to
arrive anyway. However, it was a major problem with data:
URLs in test cases and it still is a problem with about:blank
in test cases.
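The timing difference can be sketched with a toy model of the main-thread event loop. This is only an illustration, not Gecko code: `EventLoop`, `loadInSingleTask`, `loadInTwoTasks` and `bodyPresentAfterOneSpin` are hypothetical names, and the parser thread is simulated by queuing the second task from inside the first.

```javascript
// Toy model of a main-thread event loop: a FIFO queue of tasks.
class EventLoop {
  constructor() {
    this.queue = [];
  }
  queueTask(task) {
    this.queue.push(task);
  }
  // Run exactly one task; returns false if the queue was empty.
  spinOnce() {
    const task = this.queue.shift();
    if (!task) return false;
    task();
    return true;
  }
}

// Old-parser style: the whole load happens inside a single task.
function loadInSingleTask(loop, doc) {
  loop.queueTask(() => {
    doc.body = "<body>";
    doc.loaded = true;
  });
}

// Off-main-thread style: a setup task, then a later task that handles
// the data handed back by the (here simulated) parser thread.
function loadInTwoTasks(loop, doc) {
  loop.queueTask(() => {
    doc.parserStarted = true;
    loop.queueTask(() => {
      doc.body = "<body>";
      doc.loaded = true;
    });
  });
}

// What a test sees if it assumes the load completes within one task.
function bodyPresentAfterOneSpin(load) {
  const loop = new EventLoop();
  const doc = { body: null, loaded: false };
  load(loop, doc);
  loop.spinOnce();
  return doc.body !== null;
}

console.log(bodyPresentAfterOneSpin(loadInSingleTask)); // true
console.log(bodyPresentAfterOneSpin(loadInTwoTasks)); // false
```

A test written against the single-task behavior observes a missing body in the two-task model, which is the kind of accidental dependency described above.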
We want to remove the old HTML parser from the code base entirely
after Firefox 4, so special-casing about:blank
to use the old parser is not a reasonable long-term solution. Since
Gecko’s behavior differs in subtle ways from other browsers, it’s
probably a bad idea to implement a special pseudo-parser for
replicating the old Gecko behavior exactly. Instead, it would make
sense to see what other browsers are doing, standardize the least
bizarre but still Web-compatible behavior and implement that.
Unfortunately, it’s not clear what the least bizarre but still
Web-compatible behavior is. It seems that IE has had special behavior
for about:blank
in the window.open
case practically forever while IE’s iframe
behavior is refreshingly reasonable. Other browsers appear to have
generalized the special behavior to apply to all browsing contexts.
However, Gecko has done it differently from the others. WebKit used to
handle things more like iframes in IE but has moved in a more complex
direction.
I tested the behavior of loading about:blank
on one hand and another (same-origin) URL on the other hand into an
iframe
and into a window.open-created
browsing context (with pop-up blocker turned off). Since I was aware
of significant implementation differences between the traditional
window.open
case that opens a window and
browser prefs that target window.open
into a new tab, I tested both cases.
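A probe along the following lines could be used for the iframe case. It is a minimal sketch, not the harness actually used for these tests: `probeIframe` and the field names in its result are hypothetical, and the document is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Insert an iframe and record what is observable synchronously,
// that is, before the current event loop task ends.
function probeIframe(doc, url) {
  const frame = doc.createElement("iframe");
  let loadFired = false;
  frame.addEventListener("load", () => {
    loadFired = true;
  });
  frame.src = url;
  doc.body.appendChild(frame);

  // contentDocument is null when the browsing context has no
  // (accessible) document yet.
  const inner = frame.contentDocument;
  return {
    hasDocumentSync: Boolean(inner),
    hasBodySync: Boolean(inner && inner.body),
    loadFiredSync: loadFired,
  };
}
```

In a browser page this would be called as `probeIframe(document, "about:blank")`. Observing what happens in later tasks (for example, whether a second document replaces the first) would need additional listeners and timers, which are left out here.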
I identified eight different behaviors. For ease of presentation, I first list the behaviors and then show a table of which browser has which behavior in which case.
Sync: about:blank document is created synchronously into the browsing context. A load event is fired for it synchronously (or not observable in the window.open case).
Sync plus single-task: about:blank document is created synchronously into the browsing context. A load event is not fired for it. A task is queued for loading another about:blank document into the browsing context. This second document has its DOM built during the one task, so no bodyless state of the DOM is observable. A load event fires for this second about:blank.
Sync plus async other: about:blank document is created synchronously into the browsing context. A load event is not fired for it. Later task queue tasks incrementally build the DOM for the non-about:blank document into the browsing context as data arrives from the network. A load event fires for this second document.
Sync plus async other no load: about:blank document is created synchronously into the browsing context. A load event is not fired for it. Later task queue tasks incrementally build the DOM for the non-about:blank document into the browsing context as data arrives from the network. No load event fires for this second document in a way observable from the outside.
window.open returns, the browsing context already has the DOM of the target document in it.
window.open returns, the browsing context already has the DOM of the target document in it. No load event fires.
Browser | iframe about:blank | iframe other URL | window.open about:blank into tab | window.open other URL into tab | window.open about:blank into window | window.open other URL into window |
---|---|---|---|---|---|---|
Firefox 4 | Sync plus single-task | Sync plus async other | Sync plus single-task | Sync plus async other | Sync plus single-task | Sync plus async other |
Chrome 10 | Sync | Sync plus async other | Sync | Sync plus async other | ||
Safari 5 | Sync | Empty plus async | Sync | Empty plus async | Sync | Empty plus async |
Opera 11 | Sync | Sync plus async other | Sync | Sync plus async other no load | Sync | Sync plus async other no load |
IE6 | Empty plus async | Empty plus async | Racy | Empty plus async | ||
IE9 | Empty plus async | Empty plus async | Sync | Cache-dependent premature load | Racy | Cache-dependent no load |
I hope this research can be used for assessing whether the current HTML5 draft makes sense on this topic. Even though the behavior “Sync” seems to be prevalent in the table, I’m rather worried about ever making the load event fire synchronously, so I’m hoping we don’t end up having to do that.
My own judgment is leaning towards the following:
When loading any URL in a window.open-created browsing context, initially create the DOM of about:blank there synchronously, since all browsers seem to do roughly that.
Since synchronous events are bad and Firefox gets away with not firing load synchronously for about:blank, never fire the load event synchronously.
Fire a load event asynchronously for the document that is the destination of the navigation.
If about:blank is the destination of navigation for window.open, use the initial synchronously-created DOM without overwriting it.
If about:blank is the destination of navigation for an iframe, create the DOM synchronously ASAP as with window.open.
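The rules above can be sketched as a small model. This is a hypothetical illustration rather than browser code: `openContext` and `runAll` are made-up names, `tasks` stands in for the event loop's task queue, and the incremental, multi-task DOM building for a non-about:blank destination is collapsed into a single task for brevity.

```javascript
// Model of the proposed behavior for a newly created browsing context.
function openContext(tasks, url, onLoad) {
  // Always start with a synchronously created about:blank DOM.
  const context = { document: { url: "about:blank", body: {} } };

  if (url === "about:blank") {
    // about:blank is the destination: keep the initial DOM and
    // fire load asynchronously, never synchronously.
    tasks.push(() => onLoad(context.document));
  } else {
    // Another destination: a later task replaces the placeholder
    // document, and load fires for the destination document.
    tasks.push(() => {
      context.document = { url: url, body: {} };
      onLoad(context.document);
    });
  }
  return context;
}

// Drain the modeled task queue in FIFO order.
function runAll(tasks) {
  while (tasks.length > 0) {
    tasks.shift()();
  }
}

const tasks = [];
const loads = [];
const ctx = openContext(tasks, "about:blank", (d) => loads.push(d.url));
const initialDocument = ctx.document;

console.log(loads.length); // 0, since nothing fires synchronously
runAll(tasks);
console.log(loads); // [ 'about:blank' ]
console.log(ctx.document === initialDocument); // true
```

The model covers window.open with any URL and an iframe whose destination is about:blank. It takes no position on an iframe with another destination.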
I am not sure what to think of the case where about:blank is
not the initial destination of navigation for an iframe. Creating a
synchronous placeholder about:blank DOM would make the browsing context
code the same as in the window.open case and would provide an early
document.body that scripts could accidentally poke without poking null.
On the other hand, setting up such a DOM first and then letting the DOM
be blown away by a later network task means that the presence of element
nodes in the DOM would be racy relative to the network and to script
getting appended to the DOM in the same parser task that appended the
iframe. In that sense, the IE/Safari behavior would be nicer.