Joho the Blog » Free the space!

Free the space!

I’m sure I’m heading for a D’oh! moment, but, much as I enjoy the ambiguity of the URL www.lumberjacksexchange.com, why aren’t spaces allowed in Web addresses? URLs are already delimited by quotation marks in HTML markup, as in <a href="http://www.lumberjacksexchange.com">. In fact, couldn’t we make a rule that whatever is the first character after the “href=” is the delimiter, a tactic I learned about when I worked at Interleaf? That way, you could even include quotation marks in the address, as in <a href=|http://www.lumberjacks exchange.com/call me “Carla”.html|>.
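To make the first-character-is-the-delimiter idea concrete, here is a minimal Python sketch. It is not how HTML actually parses attributes, and `parse_href` is a hypothetical helper made up for illustration; it simply treats whatever follows "href=" as the delimiter and reads up to the next occurrence of that character.

```python
def parse_href(tag: str) -> str:
    """Return the href value, using the first character after 'href=' as the delimiter.

    Hypothetical rule for illustration only; raises ValueError if 'href=' or the
    closing delimiter is missing.
    """
    marker = "href="
    start = tag.index(marker) + len(marker)   # position of the delimiter character
    delimiter = tag[start]
    end = tag.index(delimiter, start + 1)     # next occurrence closes the value
    return tag[start + 1:end]


# A pipe as delimiter leaves spaces and quotation marks free for the address itself.
print(parse_href('<a href=|http://www.lumberjacks exchange.com/call me “Carla”.html|>'))

# Ordinary quotation marks still work, since they are just another first character.
print(parse_href('<a href="http://www.lumberjacksexchange.com">'))
```

The obvious cost is that the delimiter character itself can no longer appear inside the address without some escaping scheme.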

Allowing spaces and flexible delimiters would let us express URL’s in ways humans can more easily understand. After all, should Web pathnames be harder to read than Windows pathnames?

In fact, when we need to make it clear that we’re expressing a path and not a space-delimited series, we could learn from Windows’ conventions: Use quotes as delimiters for paths such as “C:\My Programs\Whirligig Anti Virus Pro\read me.txt.” Having to use explicit delimiters on paths on occasion seems to me a small price for being able to use spaces as delimiters between words.
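The trade-off shows up in even a toy tokenizer. The following is a minimal Python sketch under a made-up rule: a double-quoted run counts as one token, and everything else splits on whitespace. `split_args` is a hypothetical helper, not how Windows really parses command lines, and the `type` command is only there to give the path something to be an argument of.

```python
import re

# One token is either a double-quoted run (quotes dropped) or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_args(line: str) -> list[str]:
    """Split a space-delimited line, keeping quoted runs together."""
    tokens = []
    for match in TOKEN.finditer(line):
        quoted, bare = match.groups()
        tokens.append(quoted if quoted is not None else bare)
    return tokens


quoted_line = r'type "C:\My Programs\Whirligig Anti Virus Pro\read me.txt"'
bare_line = r'type C:\My Programs\Whirligig Anti Virus Pro\read me.txt'

print(split_args(quoted_line))  # 2 tokens: the command and the whole path
print(split_args(bare_line))    # 7 tokens: the path falls apart at each space
```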

Now, what is the big point I’m missing that’s so obvious that I’m about to go D’oh!? [Tags: everything is miscellaneous, html]

9 Responses to “Free the space!”

  1. “what is the big point I’m missing”: Worldwide infrastructure.

    Right now, ICANN is dealing with the fact that despite the Internet being “international,” tens of thousands of characters, ideograms, and so on in most non-Roman languages cannot be represented in domain names. Browsers allow the use of hundreds of characters in domain names by using something called Punycode, which rewrites certain characters into a special format (the 26 Roman letters plus hyphen plus numerals) that DNS servers can interpret and that registrars must conform to.

    So you have multiple plumbing challenges. If you want the space character to be legal in the domain name part of a URI, you need to be part of the ICANN process and make sure that it gets in there. Otherwise, you’d have to use special browser extensions to do weird mapping and such on the browser side, and only browsers with that extension (or browsers that adopt the approach) would handle it.

    On the local side of the URI, after the domain name, you can use a whole slew of characters; your browser will encode them. But the Web server has to interpret them correctly. It’s possible to name local files with quotation marks and spaces in them (depending on platform), and the browser can encode that information into a path that can then be read.

  2. Special characters need to be encoded:

    http://www.lumberjacks%20exchange.com/call%20me%20%22Carla%22.html

    And, what Glenn said above – though that relates to non ASCII characters.

  3. You could enter the encoded space (%20), but you would still be precluded from typing the URL with just the space character. If the browser simply assumed that spaces should be encoded, then all would be well.

  4. Propose enabling the use of the underscore, and then propose that browsers display the underscore as a space in the address bar (optionally disabled). Also that users entering spaced URLs have the URLs rewritten with underscores.

  5. URLs are not just used in HTML but also in *lots* of other places. Take email as an example. Isn’t it nice that your email client makes URLs clickable in emails you receive? That wouldn’t be possible if spaces were allowed in URLs, the application wouldn’t know how to decide if the next word after the URL still belongs to the URL or not.

    The other thing is that changing the basic specifications of the internet and web is practically impossible because of all the existing investment. Same reason why we still use QWERTY keyboards.

  6. And I thought the point was to make the whole thing more obscure to the hoi polloi!

  7. “Isn’t it nice that your email client makes URLs clickable in emails you receive?”

    No, it’s annoying. I’d rather it left the damn things alone. I could select and drop ‘em on the tab bar if I wanted to open ‘em.

    And wouldn’t the “first character after ‘href=’ is the delimiter” rule fix that for email clients anyway?

  8. I don’t think forcing people to learn the concept of delimiters, and then to start using them, makes things easier.

    “Allowing spaces and flexible delimiters would let us express URL’s in ways humans can more easily understand. After all, should Web pathnames be harder to read than Windows pathnames?”

    Can I get paid for every hour of work that the easy-to-read Windows pathnames have cost me?

    If all weblinks were associative queries there wouldn’t be a problem. Rather than linking to benandjerries.com you’d link to google.com?s=ben%20and%20jerries. As it is, already a lot of people use their Google input field instead of their address bar to get places. But sometimes you just want a specific document; having the least amount of ambiguity helps in such cases.

    (Isn’t the plural of URL URLs, by the way? Just curious — I am not a native speaker of English.)

  9. “why aren’t spaces allowed in Web addresses”

    In short: because Sir Tim said so, and the rest of the world agreed.

    “In fact, couldn’t we make a rule that whatever is the first character after the ‘href=’ is the delimiter”

    We could, but there already is such a rule, so why bother?

    “Allowing spaces and flexible delimiters would let us express URL’s in ways humans can more easily understand.”

    As I said in my previous comment, allowing spaces at the cost of adding delimiters peels off a layer of complexity at the cost of adding another one. It is not clear to me how that is a win. The only reason I can think of is that “regular” people more easily understand the rule Always Use Delimiters than they understand Never Use Spaces. The reason for that would be that most people use MS Windows, which allows (but does not require) delimiters, and allows spaces.

    Unfortunately, delimiters in Windows are only really useful when you are working with the command line interface, and I think that a) only a small subset of Windows users use the CLI, and b) the sort of people who use the CLI are typically computer-savvy people who are capable of remembering a rule like Never Use Spaces.

    Perhaps what you really meant was: why can’t I type “coca cola” into my address bar and go to the Coca Cola website? The answer is that it depends on your web browser.

    BTW, as Glenn Fleishman notes, ‘they’ are working on domain names that allow international characters, and the big challenge for browser makers and browser users alike is that this opens the door for spoofs. This goes for spaces too: in some contexts, they differ little from underscores. If both are allowed in domain names, this makes it harder to detect spoofs. That’s not the fault of spaces, but it shows that using a small set of allowed characters makes it easier to detect spoofs.

    (My expectation: a huge number of characters will be added to the set that is currently allowed for domain names, and browser makers and users will have to get used to a world where phishing is easier.)
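The two encodings that come up in the comments above (percent-encoding for the path, Punycode for the host) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not a description of what any particular browser does: the path reuses the example from the post with straight quotation marks, and "bücher.example" is an illustrative host name that does not appear in the discussion. Python's built-in "idna" codec follows the older IDNA 2003 rules.

```python
from urllib.parse import quote, unquote

# Path part: characters that cannot appear literally are percent-encoded.
path = '/call me "Carla".html'
encoded_path = quote(path)              # "/" is left alone by default
print(encoded_path)                     # /call%20me%20%22Carla%22.html
print(unquote(encoded_path) == path)    # True: the server can recover the original name

# Host part: non-ASCII labels are rewritten into the letters-digits-hyphen form.
# "bücher.example" is an illustrative name, not one from the discussion.
ascii_host = "bücher.example".encode("idna")
print(ascii_host)                       # b'xn--bcher-kva.example'
```

Note the asymmetry: the path encoding is reversible by any server that decodes `%XX` sequences, while the host encoding only works because DNS and registrars agree on the `xn--` convention.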
