Hawaiian diacriticals

For those of you who haven’t had the pleasure of encountering the Hawaiian language, it’s a very simple but elegant language.  The written form is largely phonetic (cooked up by American missionaries in the 1800s) and makes use of two diacritical marks: the ‘okina, and the macron (also known as the kahako).

An ‘okina usually indicates a glottal stop, which is very important in the pronunciation of Hawaiian words.  The name Hawai‘i is a great example: the ‘okina indicates the name is pronounced hahwhy-ee instead of hahwhy. When you hear a native pronounce the name, there’s usually a very short hard pause between the why and ee syllables.

Unfortunately, the two Hawaiian diacriticals are not used by European languages, which means they’re difficult to accurately represent on a standard US qwerty keyboard. In most printed publications, the authors simply omit the diacriticals altogether — the very reason you usually see the name Hawaii, and not Hawai‘i.

Over the last decade, there has been an attempt by many well-meaning locals (Hawaiian and non-Hawaiian) to use substitute characters when true diacriticals aren’t available. While macrons are usually omitted (they don’t exist in most font sets), the ‘okina is often represented by a foot mark ('), sometimes (mistakenly) referred to as a straight or neutral single quote mark.

An ‘okina. Credit: Wikipedia (http://en.wikipedia.org/wiki/File:Hawaiian_okipona.png)

This brings me to one of my pet peeves and the purpose of this post: misuse of the backtick (`) character. Many of the previously mentioned well-intentioned folks mistakenly use a backtick to represent an ‘okina, and it drives me absolutely bonkers.

As I mentioned to a friend of mine recently, a proper ‘okina is usually the same as a left single quotation mark (‘), depending on the font. The shape of the ‘okina should loosely resemble the number 6. In HTML you can get this character by typing the entity &lsquo;.

Granted, using entities is a pain for most people, and practically impossible in email and other electronic documents.  Substitutions will continue to be made.  I believe a foot mark (') is a more accurate depiction of an ‘okina than the backtick (`). It’s also easier to type and looks nicer.
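If you already have text full of backticks, a small script can swap them out. This is only a rough sketch: the function names are hypothetical, it assumes every backtick in the text is standing in for an ‘okina, and it follows this post in treating the left single quotation mark (U+2018, &lsquo; in HTML) as the ‘okina.

```javascript
// Assumption: U+2018 (left single quotation mark) is used as the ‘okina
var OKINA = "\u2018";

// Replace every backtick with the ‘okina character
function replaceBackticks(text) {
   return text.replace(/`/g, OKINA);
}

// Write the ‘okina as an HTML entity for pages that can't use the raw character
function okinaToEntity(text) {
   return text.split(OKINA).join("&lsquo;");
}

var fixed = replaceBackticks("Hawai`i");   // "Hawai‘i"
var forHtml = okinaToEntity(fixed);        // "Hawai&lsquo;i"
```

Don’t run this over text that uses backticks for anything else (code samples, for instance).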

Obfuscating email addresses, revisited

A while back, I posted my method for defeating spambots that harvest email addresses. This post is an update to that original method. It explores cleaner, less obtrusive code approaches and more accessible/usable HTML markup.

The other “solutions”

So how do you prevent spambots from harvesting your email address? Well, there are a gazillion suggestions out on the interwebs, and unfortunately most of them stink because they require JavaScript, and because they often use illegible or invalid markup. For instance, this example — which was created by an email address obfuscator ranked high in Google searches — uses character entities to render the text completely illegible:

<a href="&#x6d;&#97;&#000105;&#108;&#116;&#000111;&#58;&#000116;&#x68;&#x69;&#000115;&#64;&#x73;&#x74;&#000105;&#x6e;&#x6b;&#x73;&#x2e;&#00099;&#x6f;&#109;"

This method has been popular for a number of years, but has some serious flaws. First of all, how do you know if you even have the right address in there? Secondly, what’s to stop a spambot from reading character entities? I imagine it would be as easy as reading ASCII or UTF. GONG!
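To show how little effort that decoding takes, here is a minimal sketch. The function name decodeNumericEntities is hypothetical, and it only handles numeric character references (decimal and hexadecimal) like the ones in the example above, not named entities.

```javascript
// Turn numeric character references such as &#97; or &#x6d; back into plain text
function decodeNumericEntities(text) {
   return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, function (match, code) {
      var isHex = code.charAt(0).toLowerCase() === "x";
      var codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
      return String.fromCharCode(codePoint);
   });
}

var decoded = decodeNumericEntities("&#x6d;&#97;&#000105;&#108;&#116;&#000111;&#58;");
// decoded is "mailto:"
```

Run over the full href from the example above, the same function returns the complete mailto: link in plain text.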

Here’s another popular approach, premised on the notion that spambots look for any links using a mailto: protocol:

<script type="text/javascript">
   function emailme(user, domain, suffix){
      var str = 'mai' + 'lto:' + user + '@' + domain + '.' + suffix;
      window.location.replace(str);
   }
</script>
<a href="javascript:emailme('this','stinks','com')">this@stinks.com</a>

There are multiple problems with this approach. The first problem is that it doesn’t use mailto: in the markup. This means if JavaScript is disabled, the link is completely useless. It also breaks the semantics of the link.

The second problem is that the JavaScript is inline and therefore obtrusive. JavaScript should not be mingling with your markup… it’s bad form! Any link that starts with javascript: is troublesome in my book.

Lastly, the whole address is still contained in the text of the page. If a spambot is sophisticated enough to look for mailto: protocols, it’s probably sophisticated enough to use RegEx to search for text that uses both @ and a period (.) without spaces.
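As a rough illustration of that kind of search (this is a guess at what a harvester might do, not code from any real spambot), a single regular expression is enough to pull address-like text out of the markup above. The function name and the pattern are assumptions made for this sketch.

```javascript
// Find runs of text that contain an @ followed later by a period, with no spaces
function findAddressLikeText(text) {
   var pattern = /[^\s@<>"']+@[^\s@<>"']+\.[^\s@<>"']+/g;
   return text.match(pattern) || [];
}

var markup = "<a href=\"javascript:emailme('this','stinks','com')\">this@stinks.com</a>";
var found = findAddressLikeText(markup);
// found is ["this@stinks.com"]
```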

There are other solutions out there, too, but they all require invalid markup, semantically incorrect markup, or flat-out removal of the email hyperlink. I want a solution that remains clickable when JavaScript is disabled, and doesn’t get all screwy with the markup. These don’t fit the bill. There’s another way.

A cleaner solution

My solution is simple: use an invalid email address. No, really! An invalid address with some extra touches and some unobtrusive JavaScript will work wonders. Here’s how to use it, step by step:

Step one: Create your markup using a slightly altered address

Begin with a real address, then modify it to include some dummy text. For instance, the address sales@visitwaikiki.com would be rewritten salesnotspam@visitwaikiki.com. The spambot will harvest the address salesnotspam@visitwaikiki.com, which won’t work when the spammers try to use it.

The markup should look like this:


<a href="mailto:salesnotspam@visitwaikiki.com">sales@visitwaikiki.com</a>

There’s an obvious flaw here: The email address is still written in plain text between the ‘a’ tags. We’ll need to use alternate text — if you want to avoid spambots, NEVER use the real address as the visible text in an email hyperlink.

Using something such as sales AT visitwaikiki DOT com is also probably a bad idea, simply because zealous spambot authors can look for that very common pattern and manage to parse the email address. You’re best off using a different phrase, such as:

<a href="mailto:janenotspam@visitwaikiki.com">Contact Jane.</a>
<a href="mailto:salesnotspam@visitwaikiki.com">Email our sales department.</a>

We still have another problem to address: The link works, but it’s using the wrong address! The next step will help with that.

Step two: Improve the markup to make the link more usable when JavaScript is disabled

It’s always a good idea to ensure your visitor can use the email hyperlink when JavaScript is disabled. As it stands, when the visitor clicks the link, their operating system will create an email addressed to the invalid address salesnotspam@visitwaikiki.com. Without JavaScript, we can’t correct the address, but we can let the user know that the address needs to be edited.

<a href="mailto:salesnotspam@visitwaikiki.com?subject=EMAIL ADDRESS NEEDS EDITING&body=Please remove the text 'notspam' from the address before sending your email.">
   Email our sales department.
</a>

The mailto: protocol allows users to tack on additional information using the subject and body options. Whatever is listed after subject will appear in the email’s subject line. Whatever is listed after body will appear in the message’s body. By creatively using these options in the email address, we can clearly instruct the visitor to edit the address as needed. The code above this paragraph produces the following email when clicked:

To: salesnotspam@visitwaikiki.com
Subject: EMAIL ADDRESS NEEDS EDITING
Message: Please remove the text ‘notspam’ from the address before sending your email.
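The href above contains raw spaces, which browsers generally tolerate but which are not strictly valid in a URL. If you would rather generate the href than type it by hand, something like the following sketch will percent-encode the subject and body for you. The helper name buildMailtoHref is hypothetical.

```javascript
// Build a mailto: href with a percent-encoded subject and body
function buildMailtoHref(address, subject, body) {
   return "mailto:" + address +
      "?subject=" + encodeURIComponent(subject) +
      "&body=" + encodeURIComponent(body);
}

var href = buildMailtoHref(
   "salesnotspam@visitwaikiki.com",
   "EMAIL ADDRESS NEEDS EDITING",
   "Please remove the text 'notspam' from the address before sending your email."
);
```

When pasting the result into an HTML attribute by hand, remember that the & between the options should be written as &amp; in the markup.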

Is it a pain to have to include the subject and/or body options each time you write an address? Yes. But is it more of a pain than the hundreds of spam emails you might get each week? I doubt it.

We now have a fully-functioning standards-friendly markup-only spam-resistant link. (Yes, I love hyphens. Don’t you?) Next, we’ll improve the experience for the 95% or so of your visitors who have JavaScript enabled.

Step three: Use JavaScript to make the link behave normally for most visitors

Most of your visitors will have JavaScript enabled; let’s take advantage of this and improve their experience. Our primary goal with our script will be to correct the invalid address by removing the dummy text “notspam”. However, since we’re removing the dummy text, we’ll also need to remove the instructions contained in the subject and body options so we don’t confuse the visitor.

Here’s a simple function that scans the page for all email links, then removes the dummy text (assuming all links use the same dummy text), the subject option, and the body option:

onload approach

window.onload = function (){
   var links = document.getElementsByTagName("a");
   for (var i=0; i < links.length; i++){
      if(links[i].href.indexOf("mailto:") !== -1){
         //Drop the subject/body options, then strip the dummy text
         links[i].href = links[i].href.split("?")[0].replace("notspam", "");
      }
   }
};

Live demo

This teeny bit of JavaScript executes when the page loads and makes all email links behave as expected. Now we have a fully-functioning standards-friendly spam-resistant email link that also degrades nicely for visitors without JavaScript, and looks/feels completely normal to everyone else.
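If the split/replace line looks cryptic, here it is pulled out into a standalone function so you can see what it does to a single href. The function name cleanMailtoHref and its dummyText parameter are hypothetical; the logic is the same as in the script above.

```javascript
// Strip the subject/body options and the dummy text from one mailto: href
function cleanMailtoHref(href, dummyText) {
   //Everything after the "?" is the subject/body instructions, so keep only the first piece
   var addressOnly = href.split("?")[0];
   //replace() with a string argument removes only the first occurrence
   return addressOnly.replace(dummyText, "");
}

var cleaned = cleanMailtoHref(
   "mailto:salesnotspam@visitwaikiki.com?subject=EMAIL ADDRESS NEEDS EDITING&body=Please remove the text 'notspam'",
   "notspam"
);
// cleaned is "mailto:sales@visitwaikiki.com"
```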

However, if you’re paranoid like me, you’ll wonder: What if the spambot supports JavaScript and looks for email addresses after the page has loaded? Your email address would be just as vulnerable as it was before.

A quick tweak to the script can help: instead of cleaning the addresses when the page loads, we can choose to only clean an address when the link is clicked.

onclick approach

window.onload = function (){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.onclick = function (){};
      this.oncontextmenu = function (){};
   };
   var links = document.getElementsByTagName("a");
   for (var i=0; i < links.length; i++){
      if(links[i].href.indexOf("mailto:") !== -1){
         links[i].onclick = addressCleaner;
         links[i].oncontextmenu = addressCleaner;
      }
   }
};

Live demo

Note: all modern browsers treat a link as ‘clicked’ if you tab to it and hit enter on your keyboard, which means the link remains accessible to those using keyboard navigation and/or screen readers.

Also, notice the oncontextmenu code; when a link is right-clicked, the onclick event isn’t triggered. If a person right-clicks the email address to copy it, they would be copying the invalid version of the address. Using the oncontextmenu event fixes this problem.
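To see why the handler replaces itself with empty functions, you can run it against a plain object standing in for a link. This is just a sketch for experimenting outside the browser; fakeLink is a hypothetical stand-in, not a real DOM element.

```javascript
var addressCleaner = function (){
   this.href = this.href.split("?")[0].replace("notspam", "");
   //Swap in empty handlers so the cleanup only ever runs once
   this.onclick = function (){};
   this.oncontextmenu = function (){};
};

//Hypothetical stand-in for an <a> element
var fakeLink = {
   href: "mailto:salesnotspam@visitwaikiki.com?subject=EMAIL ADDRESS NEEDS EDITING",
   onclick: addressCleaner,
   oncontextmenu: addressCleaner
};

fakeLink.oncontextmenu();   //a right-click arrives first and cleans the address
fakeLink.onclick();         //a later click finds an empty handler and changes nothing
```

Whichever event fires first does the cleanup, and both handlers are disarmed afterwards.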

You’re done!

You now have a spam-resistant email hyperlink that works whether JavaScript is enabled or not. It adheres to standards (no invalid markup), is semantically correct, and is unobtrusive.

Having said that, you should be aware that this system is not perfect; spammers are very clever, and will always catch up to us. This method is a form of spam resistance, not a foolproof way to defeat all spambots from now until eternity.

While the code you’ve just seen will work fine for most people, there are a few improvements that can be made with the use of a JavaScript framework. If you don’t use a JavaScript framework such as MooTools or jQuery, your journey has ended. If you do use a framework, let’s explore some potential improvements to the system.

Improvements via frameworks

JavaScript frameworks add some impressive tools to our toolbox and provide many conveniences. For this example, I’m going to use MooTools 1.2, but most other frameworks will have similar code that you can adapt for your own needs. Here are some improvements we can make:

  1. Use event handlers instead of direct assignment.
  2. Use a domready event instead of window.onload.
  3. Use CSS selectors and the array:each method.

Here’s the improved code, modified to use MooTools 1.2:

window.addEvent("domready", function(){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.removeEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   };
   $$("a[href^=mailto:]").each(function (a){
      a.addEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   });
});

Live demo

Explanation of the MooTools framework version

Since some of you may not be familiar with frameworks, I’ll try to explain the changes I’ve made.

Event handlers

Most JavaScript gurus will tell you that using event handlers is a much more robust approach than using a direct onclick assignment. For starters, adding an onclick event using direct assignment will overwrite any existing onclick event. Using an event handler will ensure the new event will not destroy any existing events, and will simply add the new event to a queue of events.

//Direct assignment
a.onclick = function (){
  //Do something
};

//MooTools event
a.addEvent("click", function (){
   //Do something
});

As you can imagine, if you don’t use a framework, browser support and cross-browser incompatibility issues make event handlers a bit of a pain. This is one of the primary reasons frameworks have become so popular: they take the pain out of cross-browser compatibility.
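For the curious, here is roughly what that pain looks like without a framework. This is a simplified sketch with a hypothetical helper name (addListener); it assumes the only two models you need to cover are the standard addEventListener and older Internet Explorer’s attachEvent.

```javascript
// Attach a handler without overwriting any handlers that are already there
function addListener(element, type, handler) {
   if (element.addEventListener) {
      //Standards-based browsers
      element.addEventListener(type, handler, false);
   } else if (element.attachEvent) {
      //Older Internet Explorer; note that "this" inside the handler is not the element
      element.attachEvent("on" + type, handler);
   }
}
```

The difference in how "this" behaves is exactly the sort of quirk a framework smooths over, and it matters here because addressCleaner relies on "this" being the link.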

Change window.onload to a domready event

The domready event is executed earlier than an onload event. domready basically means that all markup has loaded into the browser DOM, even if images and other media haven’t finished downloading yet. onload, by comparison, only fires after everything has finished loading. A MooTools domready event looks like this:

window.addEvent("domready", function (){
   //do something
});
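If you want to experiment with the same idea without MooTools, the sketch below approximates it using the standard DOMContentLoaded event. It assumes a browser that supports document.readyState and addEventListener (so not older versions of Internet Explorer), and the helper name onDomReady is hypothetical.

```javascript
// Run the callback once the markup has been parsed
function onDomReady(doc, callback) {
   if (doc.readyState === "loading") {
      //Markup is still being parsed, so wait for the event
      doc.addEventListener("DOMContentLoaded", callback, false);
   } else {
      //Markup is already available, so run right away
      callback();
   }
}
```

In a page you would call it as onDomReady(document, function (){ ... }). A framework’s domready covers far more browser quirks than this does.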

Use CSS selectors and the array:each method

MooTools allows us to replace document.getElementsByTagName with a much more targeted CSS-based selector: $$("a[href^=mailto:]"). This selector finds all links on the page whose href attribute begins with mailto:, then places the results in a new array. This means we can ditch two elements of our original script: the call to

document.getElementsByTagName("a")

and the if syntax inside the loop:

if(links[i].href.indexOf("mailto:") !== -1)

Next, we can replace the for loop with an each method, which performs whatever action is specified to each of the items in the array.

myArray.each(function (arrayitem){
   //do something with arrayitem
});

The each method mirrors the native array forEach method, which is built into browsers not named Internet Explorer. Frameworks like MooTools and jQuery bring this kind of iteration to browsers that don’t natively support it.
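Conceptually, what the framework provides boils down to something like the sketch below. This is not the MooTools source, just an illustration with a hypothetical standalone function: use the native forEach when it exists, and fall back to a plain loop when it doesn’t.

```javascript
// Call the callback for each item, using native forEach where available
function each(list, callback) {
   if (typeof list.forEach === "function") {
      list.forEach(callback);
      return;
   }
   //Fallback for environments or array-like objects without forEach
   for (var i = 0; i < list.length; i++) {
      callback(list[i], i, list);
   }
}

var seen = [];
each(["a", "b", "c"], function (item) {
   seen.push(item);
});
// seen is ["a", "b", "c"]
```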

Now that we’ve got our CSS-based selector working with the each method, we can greatly simplify our code:

window.addEvent("domready", function(){
   var addressCleaner = function (){
      this.href = this.href.split("?")[0].replace("notspam", "");
      this.removeEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   };
   $$("a[href^=mailto:]").each(function (a){
      a.addEvents({
         "click": addressCleaner,
         "contextmenu": addressCleaner
      });
   });
});

Tips

  • You can place the dummy text in any part of your email address, not just the username portion. For instance, you could do sales@visitSOMEWHEREOTHERTHANwaikiki.com, sales@visitwaikiki.commie, etc.
  • It’s probably a good idea to use dummy text other than the common phrase “nospam”; authors of spambot software can easily look for these phrases as keywords and use them to target your address. Get creative with your dummy text, just be sure it’s obvious to a human reader that the text needs to be removed.
  • If you have multiple email addresses on the page, this method requires that you use the same dummy text in all email addresses.
  • Be sure you change the dummy text in the JavaScript function to match whatever text you decide to use!
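If the same-dummy-text restriction bothers you, one possible workaround is to keep a short list of dummy strings and strip whichever ones appear. This is an untested-in-the-wild sketch, and the function name cleanWithDummyList is hypothetical.

```javascript
// Strip the subject/body options, then remove any of the listed dummy strings
function cleanWithDummyList(href, dummyTexts) {
   var address = href.split("?")[0];
   for (var i = 0; i < dummyTexts.length; i++) {
      address = address.replace(dummyTexts[i], "");
   }
   return address;
}

var dummyTexts = ["notspam", "SOMEWHEREOTHERTHAN"];
var first = cleanWithDummyList("mailto:salesnotspam@visitwaikiki.com?subject=EDIT ME", dummyTexts);
var second = cleanWithDummyList("mailto:sales@visitSOMEWHEREOTHERTHANwaikiki.com", dummyTexts);
// first and second are both "mailto:sales@visitwaikiki.com"
```

Be careful that none of the dummy strings also appears inside a real address, or it will be stripped from there as well.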

Known Issues

When JavaScript is disabled and someone copies/pastes the email address instead of clicking it, they will be copying the invalid version of the address. To minimize problems, you can write the address in a hard-to-miss way, such as using all caps for the dummy text (salesNOTSPAM@visitwaikiki.com). This will be an extremely small percentage of users, so I wouldn’t worry too much; if they’re savvy enough to disable JavaScript and use copy/paste for email addresses, they’ll probably read the address, too.

This email address obfuscation method has been successfully tested in the following browser/OS combinations:

  • Firefox 3.0 (Mac OS X, Windows Vista)
  • Safari 3.2.1 (Mac OS X, Windows Vista)
  • Internet Explorer 6 (Windows XP)
  • Internet Explorer 7 (Windows Vista)
  • Internet Explorer 8b1 (Windows 7 beta)
  • Opera 9.6 (Mac OS X, Windows Vista)
    • One issue in Opera: The contextmenu event doesn’t trigger correctly when right-clicking

Fixed-width layouts

While working on a recent web project at work, I wondered if I should go for a fixed-width layout or stick with my preference for fluid layouts. Fixed-width layouts are certainly easier to manage, but they just feel so… rigid. With the boom in larger monitors, I also wondered if fluid sites start presenting a problem due to being too wide. I decided to check around the web to see what others are doing.

The search

In my (very quick and unscientific) research, I visited 150 popular sites, including news sites, shopping sites, software company sites, and the personal sites of well-regarded web designers and developers. I purposely avoided blogs that use canned themes.

When visiting each site, I determined the page’s width by reducing the browser window’s width until the horizontal scrollbar appeared. In some cases, I didn’t use the front page of the site, as these are sometimes standalone pages that use a completely different layout than the site’s actual content.

The results

Of the 150 sites I visited, only thirteen used fluid layouts, a whopping 8.7%.

A number of those thirteen sites weren’t completely fluid, as they either break at smaller sizes or have fixed-width sidebars, but I included them in the list since they aren’t using a fixed-width wrapper.

I was genuinely surprised about the number of fixed-width sites; a sizable chunk of these sites belong to (or were designed by) well-known web design gurus. I had assumed a number of them would use some sort of min-width/max-width CSS trickery, but I was wrong — only a very few sites (less than 3.5%) used this approach.

I was also surprised to see that the vast majority of the fixed-width sites were between 900 and 1000 pixels wide, a size that was unthinkable a mere 5 years ago.

Observations

  • Most of the fixed-width sites fell between 900 and 1000 pixels wide, and were usually centered in the browser
  • 20 sites were 800 pixels wide or less
  • 14 sites were between 800 and 900 pixels wide
  • 100 sites were between 900 and 1050 pixels wide
  • 3 sites were over 1050 pixels wide
  • the widest site was time.com at over 1100 pixels, although the layout uses a fixed-width wrapper wider than the content itself (the content was closer to 1000 pixels).
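For anyone who wants to check the arithmetic, the tallies above can be turned into percentages with a few lines of script. The counts come straight from this post; the variable and function names are made up for the sketch.

```javascript
var widthCounts = [
   { label: "fluid", sites: 13 },
   { label: "fixed, 800 pixels or less", sites: 20 },
   { label: "fixed, 800 to 900 pixels", sites: 14 },
   { label: "fixed, 900 to 1050 pixels", sites: 100 },
   { label: "fixed, over 1050 pixels", sites: 3 }
];

// Add up the counts and express each group as a share of the total
function summarize(rows) {
   var total = 0;
   for (var i = 0; i < rows.length; i++) {
      total += rows[i].sites;
   }
   var shares = [];
   for (var j = 0; j < rows.length; j++) {
      shares.push({
         label: rows[j].label,
         percent: (rows[j].sites / total * 100).toFixed(1)
      });
   }
   return { total: total, shares: shares };
}

var summary = summarize(widthCounts);
// summary.total is 150; the fluid share comes out to "8.7" percent
```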

Sites visited

I picked sites somewhat randomly, using a combination of Alexa.com rankings, sites I frequent, sites my co-workers like to visit (hello, www.bebe.com), sites belonging to famous companies (McDonalds, Target, etc.) and sites belonging to well-known web designers and developers. I know I left out a number of good sites, but this was a very quick-and-dirty project.

Fluid layouts

  1. allinthehead.com (Drew McClellan)
  2. clearleft.com
  3. craigslist.org
  4. danwebb.net
  5. dean.edwards.name
  6. drupal.org (partially broken due to floating elements overlaying other elements)
  7. gmail.com
  8. htmldog.com (Patrick Griffiths)
  9. joeclark.org
  10. meyerweb.com (Eric Meyer)
  11. molly.com (Molly Holzschlag)
  12. people.opera.com/howcome (Håkon Wium Lie)
  13. wikipedia.org

Fixed-width layouts

  1. 24ways.org
  2. 37signals.com
  3. 456bereastreet.com (Roger Johansson)
  4. about.com
  5. adactio.com (Jeremy Keith)
  6. adobe.com
  7. alexa.com
  8. alistapart.com
  9. amazon.com
  10. americanexpress.com
  11. andybudd.com
  12. anthropologie.com
  13. aol.com
  14. apartmenttherapy.com
  15. apple.com
  16. authenticjobs.com
  17. bankofamerica.com
  18. barackobama.com
  19. barnesandnoble.com
  20. bbc.co.uk
  21. bebe.com
  22. bestbuy.com
  23. blogger.com
  24. borders.com
  25. boxofchocolates.ca (Derek Featherstone)
  26. burgerking.com
  27. cameronmoll.com
  28. cartoonnetwork.com
  29. cbs.com
  30. chase.com
  31. clagnut.com (Richard Rutter)
  32. cnet.com
  33. cnn.com
  34. comcast.com
  35. comcast.net
  36. crateandbarrel.com
  37. danbenjamin.com
  38. dean.edwards.name
  39. dell.com
  40. deviantart.com
  41. dictionary.com
  42. digg.com
  43. directv.com
  44. dojotoolkit.com
  45. dustindiaz.com
  46. ebay.com
  47. espn.com (MLB page)
  48. facebook.com
  49. fastcompany.com
  50. fedex.com
  51. flickr.com
  52. fox.com
  53. friendster.com
  54. gamespot.com
  55. go.com
  56. guardian.co.uk
  57. guitarhero.com
  58. happycog.com
  59. haveamint.com
  60. hicksdesign.co.uk
  61. home.live.com
  62. hulu.com
  63. iht.com
  64. ikea.com
  65. imdb.org
  66. jasonsantamaria.com
  67. jeffcroft.com
  68. jetblue.com
  69. jquery.com
  70. kaiserpermanente.com
  71. latimes.com
  72. linkedin.com
  73. livejournal.com
  74. m-w.com
  75. macys.com
  76. markboulton.com
  77. mcdonalds.com
  78. mediatemple.net
  79. mezzoblue.com (Dave Shea)
  80. microsoft.com
  81. mootools.net
  82. mozilla.com
  83. msn.com
  84. myspace.com
  85. nbc.com
  86. neopets.com
  87. netflix.com
  88. newegg.com
  89. nfl.com
  90. ning.com
  91. nintendo.com
  92. npr.org
  93. nytimes.com
  94. opera.com
  95. oreilly.com
  96. paypal.com
  97. pbs.org
  98. quirksmode.com (Peter-Paul Koch)
  99. reuters.com
  100. rockband.com
  101. rollingstone.com
  102. secondlife.com
  103. sfgate.com
  104. shauninman.com
  105. si.com (MLB page)
  106. simonwillison.net
  107. simplebits.com (Dan Cederholm)
  108. sitepoint.com
  109. skype.com
  110. snook.ca (Jonathan Snook)
  111. sony.com
  112. stuffandnonsense.co.uk (Andy Clarke)
  113. target.com
  114. techcrunch.com
  115. theonion.com
  116. time.com
  117. tivo.com
  118. twitter.com
  119. typepad.com
  120. ups.com
  121. urbanoutfitters.com
  122. usaa.com
  123. usps.com
  124. veerle.duoh.com (Veerle Pieters)
  125. virgin.com
  126. wait-til-i.com (Christian Heilmann)
  127. wamu.com
  128. washingtonpost.com
  129. weather.com
  130. wellsfargo.com
  131. whitehouse.gov
  132. williamssonoma.com
  133. wordpress.com
  134. yahoo.com
  135. yelp.com
  136. youtube.com
  137. zeldman.com (Jeffrey Zeldman)

Link: Web Accessibility Checklist

The talented Cameron Moll has posted a link to a Web Accessibility Checklist prepared by Aaron Cannon, a (blind) member of his web development team.

Aaron’s checklist is an easy-to-understand list of accessibility dos and don’ts. Most of these are so simple and easy to implement that there’s really no excuse to NOT use them in your work!

Kudos to Aaron and Cameron for sharing this with the community!

Link: Hardware tips for screencasting

Ran across this short but useful blog entry from Layers Magazine.

I know many people who use assorted ‘screencasting’ tools (Captivate, Camtasia, Firefly, etc.), and my guess is that very few of these people give much thought to the hardware they use for their projects. Hardware has a huge impact, and can be the difference between a successful screencast session and a computer that keeps crashing.

This author gives a nice simple overview of the topic, and also gives some practical tips about creating screencasts. Read “Screencast Success

SWFObject is officially at 2.0

Geoff Stearns and Bobby van der Sluis have finalized SWFObject 2.0. It is no longer beta, and SWFObject 1.5 is now considered deprecated.

SWFObject 2.0’s home is located at Google Code, which includes full documentation and downloads. I believe support will be handled by the SWFObject 2.0 Google Group. I’m not sure what will be happening with the SWFObject Support Forum, which has focused on the pre-2.0 versions of SWFObject.

Thank you to Geoff and Bobby (and the late Michael Williams) for all their hard work… SWFObject 2.0 will certainly have a major impact on the Web.

WCAG Samurai Errata for Web Content Accessibility Guidelines (WCAG) 1.0 released

The WCAG Samurai Errata for Web Content Accessibility Guidelines (WCAG) 1.0 [link no longer available] were published this week by the WCAG Samurai group. They don’t contain anything I’d consider earth-shattering, but there are some very solid guidelines that bring the 1999 WCAG 1.0 specs a little more in line with our current state of the browser (no revelations on how to make ajax more accessible, though!).

Check out the summary on the introduction page.

I found their information about the Brewer Palette [link no longer available] particularly interesting; Cynthia Brewer “conducted research into creating maps that people with colour deficiency (colourblindness) can read and understand.”

Roger Johansson has a nice explanation of why the Samurai group formed and published this Errata documentation.

Accessibility development tools

There is a great set of links to free development tools (validation services, browser toolbars and plugins) posted on the Web Access Centre Blog today:

Looking for alternatives to Bobby and WebXact? Try these!

Anyone familiar with accessibility should already know about Cynthia Says and a few of the web-based validation services… what I was impressed with were the links for the browser add-ons, specifically the Web Accessibility Toolbar for IE. It’s very similar to Chris Pederick’s popular Web Developer Toolbar extension for Firefox (which I use religiously), and is a nice upgrade from Microsoft’s ho-hum IE Developer Toolbar. Lastly, Jon Gunderson’s Firefox Accessibility Extension is another great Firefox add-on.

Check out the other links mentioned in the blog post, and the Web Access Centre’s site when you have time.