April 26, 2012

Learning and Grammatical Forgiveness

HTML is a very interesting machine-readable language because, as with human languages, most things that interpret it are very forgiving.

For instance, did you know that the following HTML is technically invalid?

<video>
  <source src="movie.mp4"></source>
</video>

It’s invalid because <source> is a so-called void element: since it can’t have any content inside it, it isn’t allowed to have a closing tag at all. The <img> tag works the same way. The technically correct way to write the above HTML snippet is as follows:

<video>
  <source src="movie.mp4">
</video>

However, in practice, all Web browsers will interpret both of these snippets in exactly the same way. When a browser sees the closing </source> tag in the first snippet, it realizes the “mistake” the author has made, and simply pretends it isn’t there.

What’s interesting to me is the way this mirrors human languages, and what it means for teaching. For instance, the following sentence is grammatically incorrect:

The dog loves it's owner.

However, no one who knows English will actually be confused by the meaning of the statement.

When I was trained as an adult literacy tutor several years ago, one of the most important principles we were taught was that fostering a love for writing was vastly more important than grammatical correctness. The “red pen” commonly used by school teachers for correcting grammatical errors was seen as anathema to this: when we found a grammatical error in a novice writer’s work, we were encouraged to ignore it unless it actually made the piece confusing or ambiguous for readers in a way that the author didn’t intend. Otherwise, the novice writer would become quickly distracted and discouraged by all their “mistakes” and view writing as a minefield rather than a way to communicate their thoughts and ideas.

We’re running into similar issues in the design of the Webpage Maker. On one hand, the fact that Web browsers are so forgiving when interpreting HTML enables us to follow a philosophy similar to that of progressive adult literacy tutors.

But sometimes, the forgiving nature of Web browsers backfires: they actually render a document that is vastly different from the author’s intent, which is just as frustrating as a pedantic nitpicker. We’ve created a library called Slowparse—soon to be renamed—which attempts to assist with this, providing the logic needed for a user interface to gently inform users of potential ways their HTML and CSS code might be misinterpreted by machines. A full specification of errors and warnings is also available, as is an interactive demo that uses the library to provide real-time feedback to users.
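
To make that concrete, here is a hypothetical illustration (not taken from the Slowparse specification) of forgiveness backfiring. The only mistake in the snippet is a missing closing quotation mark after guide.html; the file name and the text are made up for the example:

```html
<p>Read the <a href="guide.html>guide</a> before you start.</p>
<p>Then open the "editor" tab.</p>
```

A browser won’t complain about this. Instead, it treats everything up to the next quotation mark, the one just before editor, as part of the link’s address, and then reads the remaining text as attributes rather than content. The likely result is a page that shows little more than “Read the”, with no hint as to where the rest of the author’s words went.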

It’s been interesting to see how different Slowparse is from an HTML/CSS validator, whose goal is not one of learning, but of ensuring conformance to a specification. From a learning perspective, a validator is like the pedantic teacher who loves their red pen: some of its feedback is quite useful, but the remainder is likely to confuse and intimidate a newcomer.

Partly as a result of its learning goals, Slowparse actually “warns” the user of things that are technically valid HTML/CSS, but which likely don’t reflect the intent of the author. One current example of this is the use of unquoted attributes in HTML5, though that particular example is still subject to change.
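
As a sketch of why a learning tool might flag unquoted attributes (this is an assumed rationale, not one stated in the Slowparse specification, and the file name and text are invented), consider these two tags, neither of which uses quotes:

```html
<!-- Works as intended: the value contains no spaces. -->
<img src=cat.png alt=Cat>

<!-- Probably not what the author meant: alt becomes just "A",
     while "sleepy" and "cat" are read as separate, empty attributes. -->
<img src=cat.png alt=A sleepy cat>
```

The unquoted syntax is allowed in both tags, but a value silently stops at the first space, so the second one quietly changes meaning. Quoting the value, as in alt="A sleepy cat", avoids the surprise.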

At this point, I think the challenge will be to work with our learning team and user-test our interface until we strike a good balance between being a pedantic nitpicker and providing useful feedback that helps users as quickly as possible. In my opinion, if we do things right, we’ll help people develop a love for HTML and CSS, even if what they write may technically be “grammatically incorrect.”

© Atul Varma 2021