URL Encoding and Quoting

I've just committed a change to the Trac [[Image]] macro that allows more flexible input for URL locations (see trac:changeset:6413).

Having to accept and handle a wider range of input made me start thinking about the security implications and how this could be abused. That got me into a vicious circle of pinpointing possibilities: what to encode and quote and how, and how to handle some inconsistencies between the various inputs and types. Not good - and way out of scope for something that in theory was a simple change to the macro.

I won't bore you with the details here, but instead skip right to the conclusion:

  • Quoting of URLs is done by browsers automagically, and for HTML it is generally not needed anymore.
  • HTML escaping all content is plenty - it ensures that a double quote (") in a URL is rendered as &quot; and does not actually close the attribute.
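
To make the second point concrete, here is a minimal sketch using only the Python 3 standard library. The `img_tag` helper and the example URL are made up for illustration; this is not how Trac or Genshi build the tag, it only shows what escaping does to a stray double quote.

```python
from html import escape


def img_tag(url):
    # Hypothetical helper (not Trac code): HTML-escape the URL before
    # placing it inside a double-quoted attribute value.
    return '<img src="%s" />' % escape(url, quote=True)


# A URL that tries to close the src attribute and add its own attribute.
hostile = 'http://example.org/a.png" onload="alert(1)'
tag = img_tag(hostile)
print(tag)
```

The only literal double quotes left in the result are the two that delimit the attribute; the ones from the input arrive as &quot; and stay inside the attribute value.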

Knowing that, the implementation is as simple as taking the input and sending it off to rendering. However, it turned out that the Trac URL builder (trac.web.href.Href) quotes its input, so the simple solution was to unquote it and let HTML escaping look after it, as Genshi (which Trac uses for rendering) does by default.

In the end, the most 'complicated' line turned out to be as simple as it gets:

        # use href, but unquote to allow args (use default html escaping)
        raw_url = url = desc = unquote(formatter.href(filespec))
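
For anyone wondering why the unquote matters, the sketch below shows the round trip with the Python 3 standard library. Note the assumptions: `urllib.parse.quote` is only a stand-in for a URL builder that percent-encodes its input (it is not a reproduction of what trac.web.href.Href does), and the file spec is invented.

```python
from html import escape
from urllib.parse import quote, unquote

# Invented input: an image location that carries query arguments.
filespec = 'http://example.org/image.png?size=100&format=raw'

# Stand-in for a URL builder that quotes its input (assumption, not Href):
# the '?', '=' and '&' that make up the query string get percent-encoded.
quoted = quote(filespec, safe='/:')

# Unquoting restores the arguments...
restored = unquote(quoted)

# ...and HTML escaping at render time keeps the result attribute-safe.
rendered = escape(restored, quote=True)

print(quoted)
print(restored)
print(rendered)
```

With the quoted form the query string is no longer a query string, which is why arguments would stop working; after unquoting, the only change escaping makes to this URL is turning & into &amp;.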

Learned something (again).

  • Posted: 2008-01-24 13:54 (Updated: 2008-01-26 00:31)
  • Categories: trac

