Henri Sivonen’s pages


Activating Browser Modes with Doctype
A document about the essentials of the layout modes of newer browsers. (An old version is available in Italian.)
HOWTO Avoid Being Called a Bozo When Producing XML
Dos and don’ts about producing XML programmatically.
The Sad Story of PNG Gamma “Correction”
Why you might not want to use PNG images when you want image colors and CSS colors to match.
An HTML5 Conformance Checker
My master’s thesis
Assembling Web Pages Using Document Trees
A paper about a template engine that operates on XML document trees. (Source code available.)
Tag Soup: How Mac IE 5 and Safari handle <x> <y> </x> </y>
What happens with the DOM in Safari and Mac IE 5 when the nesting of the markup is broken?
Thoughts About a Print UI for Mozilla
Some thoughts about printing from a Web browser.
Digitaalisesta arkistoinnista
Documents about archiving digital documents (in Finnish)
Can Anti-DRM Clauses in Content Licenses be Free?
Are anti-DRM clauses a good idea? Are the current clauses merely badly drafted, so that an anti-DRM clause could in general be free? Or is any anti-DRM clause inherently non-free?
Älä käytä Creative Commons 1.0 -lisenssejä – käytä 2.5-sarjaa
The Finnish version of the Creative Commons suite of licenses is still at 1.0. The 1.0 series of CC licenses has three serious known bugs. (In Finnish.)


Validation 2.0.
The Validator.nu HTML Parser
An implementation of the HTML5 parsing algorithm in Java.
Photo and Metadata Backup for Flickr
This is a photo and metadata backup utility for Flickr written as a self-contained Java command line tool. The metadata is written to an XML file whose format is an aggregation of the response data from the Flickr API.
Autozoom Extension for Firefox®
When Autozoom is activated, the current document is analyzed for the dominant font size and the view is zoomed by the factor that makes the dominant size match your font size preference.
Photo Group Feed
Flickr doesn’t provide feeds for private groups. It doesn’t provide feeds for comments on photos in a group, either. It is reasonable to want such feeds, so here’s a script that generates them on your HTTP server.
View Original Bookmarklet
It takes way too many clicks to get from a Flickr photo page to the original JPEG file. I wrote a bookmarklet that does it with just one click.
Miscellaneous Java Code
Utility code.
CMS Stuff
Papers and code related to a CMS project.
SaxCompiler is a tool for recording SAX ContentHandler events as Java code that can play back the events without parsing XML.
A two-player asteroid shooting network game written in Java.
HTML Syntax Checker in PHP
An HTML linter written in PHP.
UTF-8 to Code Point Array Converter in PHP
This package contains a PHP include file which provides two functions for converting between UTF-8 strings and arrays of ints representing Unicode code points.
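To illustrate the kind of conversion the package above describes, here is a minimal sketch in Python. It is not the PHP package's code: the function names utf8_to_code_points and code_points_to_utf8 are hypothetical, and the strict rejection of overlong forms, surrogates and out-of-range values is an assumption about what a careful converter would do.

```python
def utf8_to_code_points(data):
    """Decode UTF-8 bytes into a list of ints (Unicode code points)."""
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        # Pick payload bits, continuation count and smallest legal value
        # for this sequence length (the minimum catches overlong forms).
        if b < 0x80:
            cp, extra, minimum = b, 0, 0
        elif 0xC2 <= b <= 0xDF:
            cp, extra, minimum = b & 0x1F, 1, 0x80
        elif 0xE0 <= b <= 0xEF:
            cp, extra, minimum = b & 0x0F, 2, 0x800
        elif 0xF0 <= b <= 0xF4:
            cp, extra, minimum = b & 0x07, 3, 0x10000
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + extra >= n and extra:
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, extra + 1):
            c = data[i + k]
            if (c & 0xC0) != 0x80:
                raise ValueError("bad continuation byte at offset %d" % (i + k))
            cp = (cp << 6) | (c & 0x3F)
        if cp < minimum or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("ill-formed sequence at offset %d" % i)
        out.append(cp)
        i += extra + 1
    return out


def code_points_to_utf8(code_points):
    """Encode an iterable of ints (Unicode code points) as UTF-8 bytes."""
    out = bytearray()
    for cp in code_points:
        if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: %r" % (cp,))
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            out += bytes([0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
    return bytes(out)
```

Under these assumptions, the two functions are inverses of each other for well-formed input, and ill-formed input raises ValueError instead of being silently repaired.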

Blogish Notes

Julkisesti luotettu varmenne ikidomainille TLS:ää (SSL:ää) varten
Previously, it was impractical to get a publicly trusted TLS certificate for an iki domain (e.g. hsivonen.iki.fi). Because the new non-profit certificate authority Let’s Encrypt validates control on a per-hostname basis, it is now practical to get a publicly trusted certificate for an iki domain. (In Finnish.)
If You Want Software Freedom on Phones, You Should Work on Firefox OS, Custom Hardware and Web App Self-Hostability
To achieve full-stack Software Freedom on mobile phones, I think it makes sense to focus on Firefox OS, commission custom hardware and develop self-hostable Free Software Web apps and an easy deployment platform for them.
Character Encoding Menu in 2014
This post is about a UI feature that I wish no one would have to use. Happily, it is indeed almost unused. Still, I made it more usable in the case when it is used. (The change was more driven by code removal than usability, though.)
Thoughts on HTML5 Becoming a W3C Recommendation
Since I’ve participated in the development of HTML5 for a decade now (since before it was commonly called “HTML5”), I’ve been asked for my thoughts about HTML5 becoming a W3C Recommendation. Hence, I figured I’d post something here.
Four Finnish Banks Training Users to Give Banking Credentials to Another Site
A person who turns to me for technical advice was logging in to a government service using banking credentials for a bank called Handelsbanken. However, the page that was asking for the Handelsbanken login credentials was not served from https://*.handelsbanken.fi/! After investigating what was going on, I decided to review how other banks in Finland handle this. Here are my findings.
What is EME?
It was suggested at the Mozilla Summit that there isn’t good information around about what Encrypted Media Extensions (EME) actually is. Since I’m on the HTML working group and have been reading the email threads about EME there, I thought that I could provide an introduction that explains things that may not be apparent from the specification itself.
Accept-Charset Is No More
Now that Firefox 10 has been released, only Chrome sends the Accept-Charset HTTP header.
WebM-Enabled Browser Usage Share Exceeds H.264-Enabled Browser Usage Share on Desktop (in StatCounter Numbers)
Looking at StatCounter stats, it occurred to me that they might not match the common narrative about H.264 market share. I decided to run some numbers using StatCounter stats.
Vendor Prefixes Are Hurting the Web
I think vendor prefixes are hurting the Web. I think we (people developing browsers and Web standards) should stop hurting the Web.
HTML5 Parser-Based View Source Syntax Highlighting
A new implementation of the View Source HTML and XML syntax highlighting has landed in Firefox.
The html5.parser.enable Pref is Gone
Just a quick note to Firefox nightly testers and bug triagers: I pushed a patch that makes Firefox no longer honor the html5.parser.enable pref.
Windows 8 App Support Matrix
Over the last few days, there’s been quite a bit of speculation about whether Windows 8 on ARM will ship the desktop environment and allow recompiled code written to the legacy Win32 APIs to run.
The Old HTML Fragment Parser is Gone
Just a quick note to Firefox nightly testers and bug triagers.
Schema.org and Pre-Existing Communities
I have been reading tweets and blog posts expressing various levels of disappointment and unhappiness about schema.org not using RDFa, not using Microformats or not having been developed in the open with the community. Since other people’s perspectives differ from mine, I feel compelled to write down my take.
What Could Microsoft Do about IE6?
Microsoft has started a campaign to drive down the market share of IE6. Getting rid of IE6 is a righteous goal. Microsoft’s proposed solution isn’t righteous, though.
The Joy of about:blank
about:blank is probably the hardest Web page to load. In fact, it is so hard that in order to turn the HTML5 parser on by default in Firefox last year, we decided to special-case about:blank to use the old parser in Firefox 4.
Sergeant Semantics
So the W3C launched a logo for HTML5. And not just for HTML5-the-spec but for HTML5-the-buzzword. Regardless of the logo itself or what it stands for, I find the choice of the ancillary visual elements weird.
Vihreiden tekijänoikeuslinja ja teosten tekijöiden eläketurva
The Finnish Green Party recently published a copyright policy paper. It is positive that the party pays enough attention to the topic to publish a separate policy paper on it. However, I’m unhappy that the paper suggests that authors of copyrighted works should get royalties for the commercial use of their works long after creating them in order to have income in retirement. (In Finnish.)
HTML5 Script Execution Changes in Firefox 4 Beta 7
In Firefox 4 beta 7, script execution changed to be more HTML5-compliant than before. This means that in some cases sites that sniff for Firefox or Gecko may break. If your site/app works cross-browser without browser sniffing, you don’t need to read further. (However, if you triage bugs on bugzilla.mozilla.org, you might still want to read on.)
The spacer Element Is Gone
Today, I landed a patch that made the HTML5 parser in Gecko unaware of the HTML spacer element.
Apple took some of their Safari Technology Demos from their developer site and published them at http://www.apple.com/html5/ as an “HTML5 Showcase”. Christopher Blizzard's blog post about the subject says almost everything I'd have to say, so please read Blizzard's post. I'm posting just my diffs here.
SVG and MathML in text/html in Firefox and Validator.nu
I enabled SVG and MathML-related stuff recently on both mozilla-central and on Validator.nu.
HTML5 Parser Improvements
As mentioned earlier, there is an ongoing project for replacing Gecko’s old HTML parser with an HTML5 parser. Significant improvements have landed lately, so if you’ve previously tried the HTML5 parser and turned it off due to crashiness or Web compatibility issues, now is a good time to turn it back on.
Thou Shalt Not Spec a Feature that Might Inadvertently Compete with RDF when Used Contrary to How It Is Designed to Be Used
From the minutes of the TAG meeting on November 2nd 2009.
Speculative HTML5 Parsing Landed
As mentioned earlier, there is an ongoing project for replacing Gecko’s old HTML parser with an HTML5 parser. Today, a significant milestone landed: off-the-main-thread speculative HTML5 parsing.
Help Test HTML5 Parsing in Gecko
The HTML5 parsing algorithm is meant to demystify HTML parsing and make it uniform across implementations in a backwards-compatible way. The algorithm has had “in the lab” testing, but so far it hasn’t been tested inside a browser by a large number of people. You can help change that now!
An Unofficial Q&A about the Discontinuation of the XHTML2 WG
Many of the comments on Zeldman’s post indicate that there are people who are badly misinformed about the matters surrounding this announcement. To help remedy that, here’s some quick Q&A for getting informed.
Browser Technology Stack
I made a quick attempt at drawing a stack for Web browsing.
The Last of the Parsing Quirks
I implemented a single quirk for HTML5 parsing yesterday.
Testing HTML5 Parsing
I have been using a browser with an HTML5 parser for both my work and leisure browsing for a bit over a week now. I think in-browser HTML5 parsing is now ready to be tested by others as well.
Extended Uncertainty
I use myvidoop as my OpenID delegate. They used to have an EV certificate. Yesterday, they didn’t.
Out of Context
Last week on W3C mailing lists.
A Lecture about HTML5
I was invited to give a lecture about HTML5 on a course titled WWW Applications at the Department of Media Technology of Helsinki University of Technology.
SVG Filter Effects in HTML without External References
The project of putting an HTML5 parser inside Gecko has progressed. I merged in code from the trunk in order to experiment with cool new stuff such as SVG filter effects for HTML.
HTML5 Parsing in Gecko: A Build
The effort of putting an HTML5 parser inside Gecko takes a step out of the vaporware land.
I Want an Affordable Snapshot-Saving Crypto-Backupping RAID NAS
This week, I lost over one potential work day to HFS+. And it wasn’t the first time I’ve lost time to HFS+. I want to make arrangements to avoid losing time to HFS+ in the future.
Access Blocked
I followed a link from a message to a spec in the /TR/ space on www.w3.org.
Not Part of the Technology Stack
At XTech 2006, I got a W3C brochure entitled Leading the Web to its Full Potential that had a diagram visualizing the W3C technology stack(s).
Browser Sniffing History in the Chrome UA String
Google Chrome has the following cruft in the HTTP User-Agent header.
Introducing SAX Tree
I chose to write yet another XML tree package.
Lowering memory requirements by replacing Schematron
For a long time, I’ve said that the Schematron schema in the HTML5 facet of Validator.nu was merely a rapid prototype that should be replaced with custom Java code.
The Performance Cost of the HTML Tree Builder
I’ve been thinking about the performance gap between the Validator.nu HTML Parser and Xerces. What can be attributed to the “extra fix-ups” that an HTML parser has to do and what can be attributed to my code being worse than the Xerces code?
Performance Mistake
In the spirit of documenting one’s mistakes…
Validator.nu Gets Out of the Java Trap
This week, I upgraded the operating system on the Xen virtual machine that powers validator.nu and html5.validator.nu to Ubuntu Hardy.
Validator.nu Downtime
Validator.nu was down last week.
NVDL Support in Validator.nu
I enabled NVDL today.
ARIA in HTML5 Integration: Document Conformance (Draft, Take Two)
Now a runnable suggestion.
Security Quote of the Day
Cluelessness and incompetence of epic proportions.
ARIA in HTML5 Integration: Document Conformance (Draft)
This is not a spec and has not been endorsed by anyone.
Reality Distortion Fields
Where Joel Spolsky’s analysis of the IE version targeting issue goes wrong.
Almost Precedent
Why the Gecko Almost Standards Mode shouldn’t be used to justify IE engine version targeting.
Regular Expressions, Computer Science and Practice
Disregard of computer science can crash your app.
Unimpressed by Leopard
Sadly, Leopard is not a clear improvement over Tiger.
Built-in Accessibility Roles in HTML5
A quick table of WAI-ARIA roles and what HTML 5 provides natively for each role as of July 2007.
Printing Web Apps 1.0
This is a quick guide for getting a dead-tree version of the Web Applications 1.0 spec.
Speaking at XTech
I’ll be speaking at XTech.
IM Logs
Quote of the week.
EFFI’s Day in Court
As mentioned earlier, Electronic Frontier Finland (EFFI) was suspected of illegal fundraising. The case was tried today. I went to the court house to observe the proceedings.
XHTML and Mobile Devices
Simon Pieters’ mobile XHTML test results need more publicity.
Social Media Impression Management
I asked if they had researched the image formation of social media sites. They hadn’t.
DTDs Don’t Work on the Web
Last weekend, Slashdot linked to an article that observed that Netscape had removed the RSS 0.91 DTD. I hope this episode has a silver lining and helps in making people realize that DTDs don’t belong on the Web.
Thesis Defense on XForms
On Friday 2007-01-12, I went to listen to the thesis defense of Mikko Honkala.
Maemo Source Code
To save others the trouble of requesting the source, here are the contents of the package called “2.2006.39-14-srcs”.
Validator Web Service Interface Ideas
I am just writing this down so I don’t forget it.
Three Styles
Well, four styles if you count the original.
Charmod Norm Checking
Charmod Norm is still in the Working Draft state, but if it were to become a normative part of (X)HTML5, it would belong to the area of the conformance checking service that I am working on now, so I prototyped Charmod Norm enforcement as well.
Charmod Checking
Here’s how I have addressed the requirements of Charmod that apply to content (marked as [C] in Charmod).
Table Integrity Checker
The first non-schema checker prototype is a table integrity checker.
Openmind 2006
I attended Openmind 2006 last week. Here are some notes.
ISO Opens Up a Little
It turns out that ISO now has some standards on the Web. That’s good, but putting all of them there in a Web-friendly format would be even better.
Natural Hazards Again
Looking across the street, I can see that there’s something extra in the air between where I sit and the house on the other side of the street.
The Scientific Method According to Hixie
Quote of the week from the topic of #developers on irc.mozilla.org
What to Do with All These Photos?
I have a lot of photos that aren’t shared properly, which makes them less useful than they could be. Considering that it has been possible to publish photos on the Web for over a decade, I find it interesting and annoying how many unsolved problems there still are.
Aula 2006
Yesterday, I went to listen to the public speeches that were part of Aula 2006 – Movement.
HOWTO Establish a 100% Literacy Rate
This is one of my favorite pieces of West Wing script writing.
Need a Taxi at a Taxi Station? You Lose!
A taxi station is the worst place to be in Helsinki when you need a taxi (unless there’s one already there).
XTech 2006
I went to the XTech 2006 conference last week.
Europe Day
Tuesday 2006-05-09 was Europe Day. I traveled to Tampere for a show debate.
So the Makasiinit burned today.
Comedy is the Real News
An observation I made last year when watching TV in the U.S.
Unused Icons
Unhelpful Microsoft wizardiness
Lists in Attribute Values
Whitespace-separation is good.
How Not to Advertise an Election Candidate
On Sunday and Monday elections were held at the local congregation in order to select a new vicar. I didn’t like the campaigning.
Bureaucracy Meets the Web
Three things from the past week happened to be related to bureaucracy and the Web…
Who knows prefixed XHTML from a hole in the ground?
Remember to test prefixed XHTML as well.
Atom Feed
I now have an Atom 1.0 feed.
RFC 2119 Key Words in Management Textbooks
Just a random observation about the vocabulary of management textbooks.
Big Brother EU
On Tuesday 2005-11-22, I went to a public discussion event titled “Big Brother EU”.
Thoughts on Using SSL/TLS Certificates as the Solution to Phishing
Comments on Staying Safe From Phishing With Firefox.
An Idea About Intermediate Language Trees and Web UI Generation
An idea about Web UI generation I had when I was studying compiler technology.
Natural Hazards: NA
Thoughts about nuclear power plants in stormy situations.
Names of Browser Engines
A table of browser names, engine names and script engine names.
HOWTO Spot a Wannabe Web Standards Advocate
I have seen this too often. (Also available in French, German and Polish.)
ISO-8859-15 on haitallinen
UTF-8 is the way to go. (In Finnish.)
10 Safari 1.0 issues
Hyatt requested lists like this.
Is Atom What We Really Need?
Atom (formerly known as Pie, Echo and Necho) has been created as a cleaner and better-defined alternative to RSS 2.0, which is underspecified. But is a reformulated version of RSS 2.0 really what we need?
Outlining the “Ultimate” Blogging Server
I’ve been thinking about what a really good blogging system or a news site content management system would be like. Here’s my attempt at outlining the “ultimate” blogging server.


Kesäkoodi Wrap-Up – 2006-09-19
The last week of Kesäkoodi stretched to two sparse weeks.
On Clipboard Formats – 2006-09-15
This stuff is so underdocumented that it isn’t even funny. This document is written so that others might find something when they search the Web.
Week 35
The weekly report for week 35.
Week 34
The weekly report for week 34.
Speaking Gig – 2006-08-28
I have been booked to speak at the Openbyte pre-conference of the Openmind 2006 event in Tampere Hall on 2006-10-24.
Week 33
The weekly report for week 33.
Week 32
The weekly report for week 32.
Week 31
The weekly report for week 31.
Week 30
The weekly report for week 30.
Week 27
The weekly report for week 27.
Builds, Take Two – 2006-07-07
The builds have been respun with fixes for interrupting Expat properly.
Builds! – 2006-07-06
Now there is something to test. I am providing builds with my preliminary patches for four target platforms.
Oops! I broke MathML – 2006-07-05
Or, well, one could argue that it was already broken but my content sink changes and a suitably crafted test case just exposed the layout issues that were already there.
Week 26
The weekly report for week 26.
The Content Sink Inheritance Diagram – 2006-06-30
I have discovered that my previous diagram showed only a part of the inheritance graph below nsIContentSink. There is more.
Eclipse CDT – 2006-06-27
After working in TextWrangler (and a bit in XCode) for a couple of weeks, I really started to miss Eclipse.
Week 25
The weekly report for week 25.
Week 24
The weekly report for week 24.
Week 23
The weekly report for week 23.
Planning the XML Content Sink Incrementalization Work – 2006-06-10
I’ve been researching the problem area of bug 18333.
Week 22
The weekly report for week 22.
Week 21
The weekly report for week 21.
DOM Traversal Performance – 2006-05-26
But there is a problem. My JavaScript implementation is slow.
Kesäkoodi Starting – 2006-05-23
So what’s this Kesäkoodi thing about?


An Introduction to Unicode
PDF slides about Unicode.
W3C DOM -esittely
An introduction to the W3C DOM (in Finnish).

Lex Karpela

These documents are related to the amendments to the Copyright Act and the Criminal code which were passed in order to implement the EUCD in Finland and have been dubbed “Lex Karpela”.

Karpelan lukkovertaus ontuu
Anti-circumvention legislation does not make sense, and it is fallacious to compare circumventing DRM to breaking into an apartment. (In Finnish)
Mustaa valkoisella
A document request to the Ministry of Education. (In Finnish)

Articles in Need of Updating

Mac OS X Browser Comparison
This document is a rough yes/no feature comparison of the Web browsers that run natively on Mac OS X. It does not cover browsers that run on the Classic VM or require an implementation of the X11 windowing system. Severely out of date. For historical reference only!
Writing Structural Stylable Documents in Mozilla Editor
The Mozilla Editor is designed around HTML 4 Transitional. If special steps aren’t taken, it is easy to produce presentational documents that lack stylable structure. This document describes some basic good authoring practices for the purpose of writing structural and stylable documents.
About Points and Pixels as Units
A document about points often being mistakenly thought of as pixel units. Points are not pixel units. Defining the font size in points on Web pages is considered harmful. This document needs to be updated.
About the Hiragino Fonts with CSS
A short document about a couple of observations on using the Hiragino fonts with CSS. (The Hiragino fonts come with Mac OS X.)
XHTML—What’s the Point? (Draft, incomplete)
This document is incomplete, but I put it on the Web in order to avoid retyping the same thing over and over again in newsgroup discussions.
Things to Take into Account When Moving to Standards-Compliant HTML and CSS Authoring
This is a mixed collection of a few issues that are worth taking into account when writing Web pages according to the W3C Recommendations.


Imitating Reflective Caustics in POV-Ray
A tutorial on imitating reflective caustics in the official distribution of POV-Ray
Yet another ray tracing gallery page.