Archive for the ‘Development’ Category

Semantic uses of <img> HTML tag

Saturday, November 28th, 2009

What is it for?

According to W3C:

“The IMG element embeds an image in the current document at the location of the element’s definition.”

It is pretty clear therefore that we can use <img> tag in order to put images into our web pages.

This tag at one point revolutionised the web, as it brought images to it.

As time has moved on, images and video have come to dominate the web, and the simple guideline from W3C above is nowhere near enough to tell us how we should be using this tag.

Appropriate uses

For content, not design

<img> tag should be used only for images which are directly related to the content of the given web page.

Any design related imagery, such as background tiling or component wrappers should be coded as part of an external CSS file and included via background CSS property.
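
As a rough sketch of this split, the markup below keeps a content image in an <img> tag and moves the decoration into CSS. The file names and the class name are made up for illustration, and the <style> block is inline only to keep the example in one piece; in practice those rules would live in an external CSS file.

```html
<!-- Hypothetical example: file names and class names are illustrative only -->
<style>
  /* Decorative tiling is design, so it lives in CSS rather than in the markup */
  .article {
    background: url("images/paper-texture.png") repeat;
  }
</style>

<div class="article">
  <h1>Red squirrels return to the park</h1>
  <!-- The photograph is part of the content, so it is marked up as an <img> -->
  <img src="images/red-squirrel.jpg" alt="Red squirrel eating a hazelnut on a park bench" />
  <p>Red squirrels were spotted in the park this week.</p>
</div>
```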

Image source locations

Images are files which reside on a server of some sort in order to be accessible by the end user.

In recent years it has become commonplace to use CDNs such as Flickr to serve all images within web pages.

There are known advantages and disadvantages to this approach.

The main advantage is that serving images can be much faster, as the CDN will usually have servers in or near the country where the end user is based.

The main disadvantage is that the images sit outside the domain within which the given web site resides, so developers have to rely on the CDN being up and running for the images to work within web pages.

Another disadvantage is that businesses are often concerned with giving their content away to third parties.

The main concern is that if the given CDN increases the price of hosting, it may become more expensive to host images externally than internally.

There is also a big element of faith in hoping that the given CDN will be up and running forever (i.e. what happens if Yahoo goes bust and Flickr stops existing as a service?).

Content manageability

On large scale web sites which use teams of content editors, it is very important to consider those people when designing the workflow for maintaining and updating the web site.

Content editors are usually familiar with HTML code, but not with CSS, hence coding images as CSS backgrounds can sometimes create a maintenance problem for non-technical staff.

Sometimes it is better to use <img> tag instead of CSS background in order to give content editors greater control over the content on the web site.

However, these decisions should be made on a case-by-case basis through constructive communication, and business interests should be taken into account (this usually means sacrificing some of the semantics of the page in favour of maintainability by non-technical staff).

Inappropriate uses

For layout

It was common in the past to use various ‘spacer’ images in order to control layout of web pages.

These spacers usually came in the form of a 1×1 pixel GIF image, which was stretched by developers using width and height attributes within the <img> tag.

This practice should be avoided at all costs, as it unnecessarily bloats the web page and makes pages very hard to maintain.

This practice is also bad because it misuses <img> tag for layout purposes rather than to bring good quality content into a web page.
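
A minimal sketch of the alternative, assuming the spacer was only there to push two blocks of text apart. The class name and the 20px value are illustrative, and the rule would normally sit in an external CSS file.

```html
<!-- Avoid: a stretched 1×1 GIF used purely to create a gap -->
<!-- <img src="spacer.gif" width="1" height="20" alt="" /> -->

<style>
  /* The same gap expressed as presentation, not as content */
  .intro {
    margin-bottom: 20px;
  }
</style>

<p class="intro">Welcome to our web site.</p>
<p>The gap above this paragraph comes from the CSS margin, not from an image.</p>
```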

Not using alt attribute

Images have accessibility issues associated with them – blind people and search indexers cannot see them.

This is the reason why there is an alt attribute present within <img> tag.

Sometimes, however, in a rush or through simple negligence, developers do not use the alt attribute appropriately, or at all.

The alt attribute should always be used and should describe the image in the best way possible.

You should not be using an empty alt attribute simply to satisfy W3C code validator and make your code validate against the guidelines.

Using appropriate alt text is also important from a Search Engine Optimisation perspective, as it usually matters that your web site’s images show up high in search engine results pages.
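
The difference can be illustrated with a hypothetical photograph; the file name and the wording of the description are made up for the example.

```html
<!-- Avoid: no alt attribute at all -->
<img src="images/head-office.jpg" />

<!-- Avoid: alt text that says nothing about what the image shows -->
<img src="images/head-office.jpg" alt="image" />

<!-- Better: alt text that describes the image for people and indexers who cannot see it -->
<img src="images/head-office.jpg" alt="Our head office, a red brick building seen from the street" />
```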

Semantic uses of <label> HTML tag

Monday, June 15th, 2009

What is it for?

W3C states quite clearly that:
The <label> element may be used to attach information to controls. Each <label> element is associated with exactly one form control.

The for attribute associates a label with another control explicitly: the value of the for attribute must be the same as the value of the id attribute of the associated control element. More than one <label> may be associated with the same control by creating multiple references via the for attribute.

A few interesting points from the above quote for developers to understand are:

  • <label> may be used which implies that strictly speaking it is not needed, although I will strongly argue that not using <label> with form elements would be very bad practice
  • One <label> can only label one form control, so it is impossible to use one <label> to label many form controls
  • Developers can use more than one <label> in order to label any given form control, which is potentially a very powerful construct
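
The last point can be sketched as follows: two <label> elements reference the same control through the same for value. The field name, the form action and the wording are assumptions made for illustration.

```html
<form action="/signup" method="post">
  <!-- First label: the name of the field -->
  <label for="email">Email address</label>
  <input type="text" name="email" id="email" />
  <!-- Second label: extra information attached to the same control -->
  <label for="email">We will only use this to confirm your account.</label>
</form>
```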

Appropriate uses

Generally speaking the most common scenario when using <label> element is the following type of association:

<label for="username">Username or email</label>
<input type="text" name="username" id="username" />

Each form field should have a label associated with it in order to achieve proper accessibility and usability.

Inappropriate use

Not using labels with form fields

From Google

<div class="section_error">
<input type="checkbox" onclick="GWO_updateContinueButton(this)" value="yes" name="ubac_accept"/>
<strong>Yes,</strong> I accept the above Terms and Conditions.
</div>

In the above code snippet Google exemplifies some of the worst practices in User Interface development.

Apart from using non-semantic class name of ‘section_error’ for a non-error content snippet, they are writing in-line JavaScript, which is harder to maintain and does not achieve separation of concerns.

Most importantly Google are not using any labels for the check box, hence they are damaging accessibility and usability of the code snippet as there is no explicit label associated with the check box to denote what the check box is for.

Considering that this code snippet sits in the context of a very simple documentation page, there is no excuse for Google’s engineers producing User Interface code of such poor quality.

A re-worked, appropriate, semantic solution for this code snippet would read something along these lines:

<div id="termsAndConditions">
<input type="checkbox" value="y" name="termsOptIn" id="termsOptIn" />
<label for="termsOptIn"><strong>Yes,</strong> I accept the above Terms and Conditions.</label>
</div>

If really required, the JavaScript functionality associated with this snippet should be situated in an external JavaScript file which attaches the behaviour by hooking onto the termsAndConditions ID attribute.
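
A minimal sketch of such a hook is shown below. It assumes that the original function was meant to enable a ‘Continue’ button once the box is ticked; the continueButton ID and the terms.js file name are hypothetical, and the script is shown inline only to keep the example in one piece.

```html
<div id="termsAndConditions">
<input type="checkbox" value="y" name="termsOptIn" id="termsOptIn" />
<label for="termsOptIn"><strong>Yes,</strong> I accept the above Terms and Conditions.</label>
</div>
<input type="submit" id="continueButton" value="Continue" />

<!-- In practice: <script src="terms.js"></script> (hypothetical file name) -->
<script>
  (function () {
    // Find the component by its ID instead of using an inline onclick attribute
    var wrapper = document.getElementById('termsAndConditions');
    var button = document.getElementById('continueButton');
    if (!wrapper || !button) { return; }

    var checkbox = wrapper.getElementsByTagName('input')[0];

    // Assumed behaviour: the button is only usable while the box is ticked
    function update() {
      button.disabled = !checkbox.checked;
    }

    checkbox.onclick = update;
    update();
  })();
</script>
```

If the script never runs, the button simply stays enabled and the form can still be submitted.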

Progressive enhancement

Thursday, June 11th, 2009

So far I have touched on the topic of emerging semantic web structure, outlining the appropriate way in which we should layer web technologies.

This approach is often referred to as ‘progressive enhancement’.

It is the development approach designed to make sure that a web page works no matter which end user client it is accessed by.

When using progressive enhancement, you should make sure you code up your (X)HTML as semantically as possible first, then style things up with CSS and at the very end add JavaScript behaviours or AJAX enhancements.

This approach ensures that a web page is fully accessible even when JavaScript is not present or not fully supported by the client accessing the web page.
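
The layering can be sketched with a small search form. The IDs and the /search URL are assumptions, and the CSS and JavaScript are shown inline only for brevity; they would normally live in external files.

```html
<!-- Layer 1: semantic (X)HTML that works on its own -->
<form id="search" action="/search" method="get">
  <label for="q">Search</label>
  <input type="text" name="q" id="q" />
  <input type="submit" value="Go" />
</form>

<!-- Layer 2: presentation -->
<style>
  #search label {
    font-weight: bold;
  }
</style>

<!-- Layer 3: behaviour; if this never runs, the form still submits normally -->
<script>
  var form = document.getElementById('search');
  form.onsubmit = function () {
    var field = document.getElementById('q');
    if (field.value === '') {
      field.focus();   // nudge the user instead of sending an empty query
      return false;    // cancel this submission only
    }
    return true;
  };
</script>
```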

Progressive enhancement is more work?

Following progressive enhancement approach, however, can prove to be much more cumbersome and time consuming, compared to not following it.

This is especially the case when building web applications, such as GMail or Google Calendar.

Web applications can often require two parallel solutions to be engineered for circumstances where JavaScript is available and for circumstances where it is not.

This is one of the reasons why GMail, for example, has an ‘HTML only’ version available as a separate link, while the main version of GMail does not work with JavaScript switched off.

Another likely reason why GMail does not work without JavaScript is that Google engineers use GWT (Google Web Toolkit, written in Java) to create their interfaces.

GWT spews out user interface code which completely breaks the above outlined good practices, yet creates code which still works cross browser.

Since Google engineers tend to think about back end logic of applications as a matter of priority, their outlook on web application development can be described as graceful degradation.

Graceful degradation

Unlike progressive enhancement, the graceful degradation approach assumes all technologies are supported and ensures that users have the best possible experience when they are enabled.

Once this is achieved, consideration is given to circumstances where certain technologies are not supported and how to deal with them.

Graceful degradation is arguably the more appropriate approach in the short term, as the main business objective is to create a working interface for the main user group as soon and as cheaply as possible, while serving the minority user group if and when possible.

This is how most of the very profitable web sites have been developed.

Hard to maintain software is dead software

In the long term, however, graceful degradation approach will more than likely mean that a given application is much harder to maintain and update, as the code is not developed according to semantic standards.

This can mean that future development and progress can only be achieved by completely re-writing a solution from scratch, which can lead to long down times, difficult migration processes, massive costs of re-development, stifled innovation and many other unwanted side-effects.

An extremely good example of this is MySpace, one of the worst applications ever developed, which has seen little to no improvement in the last few years. This is more than likely due to the poor quality of the initial development and the extremely poor quality of its User Interface code.

MySpace’s poor quality user interface code also means it is very tricky to create nice skins for MySpace profile pages using CSS, as the underpinning HTML code is very poor.

This means that MySpace’s very sales pitch of enabling users to customise their profiles however they like is flawed due to the inflexibility in the coding deployed by MySpace developers.

Graceful degradation also makes it harder for developers with different skill sets to work together: within the Google setup, any developer working on GMail has to be a wizard in GWT.

In semantically developed interfaces (i.e. progressively enhanced) a company can employ a relatively cheap CSS developer to develop themes for their CMS or web application without ever needing to touch HTML or JavaScript.

In this environment the company developing the solution has the ultimate flexibility and control over their toolsets.

In progressively enhanced interfaces it is also much easier to delegate work amongst developers, as one person can work on CSS, another on HTML and the third on JavaScript, without developers stepping on each other’s toes all the time.

Semantic uses of <h1>, <h2> ... <h6> HTML tags

Friday, April 24th, 2009

What is it for?

<h1> to <h6> tags are intended to mark up various levels of headings and subheadings.

<h1> carries the biggest importance, with each heading below carrying lesser importance.

Appropriate uses

Use only one <h1> per page

<h1> tag is intended to denote the highest level heading on a given web page.

It explains what the web page is about, and since a good web page ought to have one purpose, <h1> tag should therefore be used only once on each web page to avoid confusion.

This guideline is not in the HTML specification, but has been adopted by all good web developers I have met and worked with so far.

It is also a guideline accepted by SEO experts since Google and other search engines treat the <h1> tag with high level of importance.

Use the <h1> tag only once on any given page and make sure it uniquely describes what the page is about.

Use <h2> as component level headings

Take a look at an example of how to use <h2> tags as component level headings.

<div id="news">
   <h2>Latest News</h2>
   <ul>
      <li><a href="http://www.example.com">News item 1</a></li>
      <li><a href="http://www.example.com">News item 2</a></li>
   </ul>
</div>

If you needed subheadings within the scope of this component you would follow up with <h3> subheadings. For example:

<div id="news">
   <h2>Latest News</h2>
   <h3>Business</h3>
   <ul>
      <li><a href="http://www.example.com">News item 1</a></li>
      <li><a href="http://www.example.com">News item 2</a></li>
   </ul>
   <h3>Science</h3>
   <ul>
      <li><a href="http://www.example.com">News item 1</a></li>
      <li><a href="http://www.example.com">News item 2</a></li>
   </ul>
   <h3>Education</h3>
   <ul>
      <li><a href="http://www.example.com">News item 1</a></li>
      <li><a href="http://www.example.com">News item 2</a></li>
   </ul>
</div>

Inappropriate uses

Heading levels are seemingly simple tags to utilise within web pages, but even they have been somewhat abused by developers and tools which generate them.

Thinking about them in context of many different systems also makes the issue somewhat more complex and some lateral thinking is required to code them up most appropriately.

<h1> around the logo or company name

<h1> should not be used around the logo or the company name.

The simplest way to explain the reasons behind this is to take the example of the BBC web site, which, if it had <h1> wrapped around its logo, would have a couple of million pages with ‘BBC’ as the main title on each page.

It is obvious from this example that it is not semantic to wrap an <h1> around the logo or company name on any web site, small or big.

Doing so decreases semantics of such web pages and is therefore bad practice.

It is also arguable that if there is no ‘candidate content’ for an <h1> on a given page, then the Information Architecture of the given page is wrong and needs to be re-thought.

Essentially, good design should include a clean title and purpose for existence of each page and a developer should wrap that main title into <h1> level heading.

The only place where it makes sense to use <h1> around a logo is on the homepage itself, as is the case with the BBC web site.

The homepage represents an overview of what the organisation offers, hence on the homepage the main title of ‘British Broadcasting Corporation’ makes sense for the BBC web site.

<h1> as heading for components

Once again, there should be only one <h1> heading on each web page.

Some developers (usually the ones from design backgrounds) have misinterpreted the purpose of <h1> and have decided to go along with the development practice of using an <h1> heading within each component.

This usually results in several <h1> tags being used across any given page, with <h2> tags and lower level tags getting either very little or no prominence.

<h2> tags are much more appropriate for component level headings and should be used for that purpose.

If a component is taken out of context from one page and placed into another page, the <h2> heading of that component will not conflict with the semantics of the <h1> tag on the new page.
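
As a sketch, a page built this way has a single <h1> describing the page and an <h2> at the top of each component. The page title, the IDs and the component content are invented for the example.

```html
<!-- One page level heading -->
<h1>Technology news for small businesses</h1>

<!-- Each component starts at <h2>, so it can be moved to another page unchanged -->
<div id="news">
   <h2>Latest News</h2>
   <ul>
      <li><a href="http://www.example.com">News item 1</a></li>
   </ul>
</div>

<div id="events">
   <h2>Upcoming Events</h2>
   <ul>
      <li><a href="http://www.example.com">Event 1</a></li>
   </ul>
</div>
```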

Repeating <h1> many times on one page

This should never be done, as it dilutes the overall meaning (semantics) of the page and confuses screen reader users.

Misusing headings for visual display purposes

This misuse of heading tags is usually observed with amateur developers who do not understand semantics.

Under no circumstances should you use a given HTML tag just because it renders in a certain way in the browser.

You should always use CSS for look and feel and disregard the default rendering within the browser to achieve desired design effects.

Only pay attention to what the meaning (i.e. semantics) of the tags actually is.
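
For instance, a small bold note is sometimes marked up as a low level heading simply because of how browsers render it by default. The sketch below shows the misuse and a CSS based alternative; the class name and the sizes are illustrative.

```html
<!-- Avoid: <h6> chosen only because it renders small and bold by default -->
<h6>Prices exclude delivery.</h6>

<!-- Better: the text is a paragraph, and CSS provides the look and feel -->
<style>
  .note {
    font-size: 0.8em;
    font-weight: bold;
  }
</style>
<p class="note">Prices exclude delivery.</p>
```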

Muddy areas

Should <h1> contain the same content as <title> tag?

Most SEO experts think so, and it seems to make sense, but it also raises the question of why we have both <title> and <h1> tags if they are going to contain exactly the same content.

This is the reason why proposed <h> tag in XHTML2.0 makes much sense, as it makes the ‘level’ of a given heading less contextually sensitive.

It also allows page level as well as component level headings to be more ubiquitous and reusable from system to system, without developers having to change the code around to make every interface as semantic as possible when reusing the code.

What’s the purpose of low level headings?

It is also the case that the <h4>, <h5> and <h6> tags are overkill for most everyday web sites, yet six levels can be too limiting for long, detailed, PhD-type dissertations which may require many more heading levels.

This is where it could be argued that HTML specification falls short, making the XHTML2.0 <h> proposal very viable in this respect.

Once again, using <h> would solve this problem enabling any number of nested headings, but I wonder how we would be able to style these in a meaningful manner using current CSS implementation and specification.

Semantic uses of <a> HTML tag

Wednesday, April 15th, 2009

What is it for?

<a> tag in HTML stands for ‘anchor’ and is used for marking up hyper links to other web resources.

Links can reference another HTML page or point to a particular section of the current web page through use of id or name attributes.

Muddy areas

The href attribute within the <a> tag specifies the location of a web resource and (according to the HTML 4.01 specification) can be omitted, so that a script of some sort can add this value later.

The exact quote is:

Authors may also create an <a> element that specifies no anchors, i.e., that doesn’t specify href, name, or id. Values for these attributes may be set at a later time through scripts.

This is an interesting aspect of the specification as it leaves room for many different functional requirements which are often present in Web2.0 type interfaces which rely heavily on JavaScript.

For accessibility purposes though, this aspect of the specification does not make much sense, as user interfaces ultimately ought to work without presence of any scripting whatsoever and this would leave anchors without an href useless and inaccessible.

I would therefore discourage developers from using empty anchors in their web sites.
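
A rough sketch of the difference follows. The IDs, the /help.html URL and the pop-up behaviour are assumptions made for illustration; the point is only that the second link still leads somewhere when no script runs.

```html
<!-- Avoid: there is nothing to follow if the script never runs -->
<a id="helpLink">Help</a>

<!-- Better: a real destination that a script may enhance -->
<a id="helpLinkEnhanced" href="/help.html">Help</a>

<script>
  var link = document.getElementById('helpLinkEnhanced');
  link.onclick = function () {
    // Try to open the help page in a small window
    var popup = window.open(this.href, 'help', 'width=400,height=500');
    // If the window was blocked, fall back to following the link normally
    return !popup;
  };
</script>
```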

Content as code

There have been a number of discussions on the web about the most appropriate ways of linking to web pages.

There are numerous usability guidelines on how to best write content to be displayed within links.

There are also developers who believe that every link should have a title attribute. This is wrong, as the title attribute ought to be used only on links whose content is not descriptive enough.

However, if you follow usability guidelines and only use link text with descriptive content, then you should not need to use a title attribute along with your <a> tags.

Having written this, within Web2.0 context, it has become a de facto standard to use title attribute within a list of links containing a gallery of images, where title contains a caption for each main image to be displayed.

I don’t see anything particularly wrong with this approach and would therefore encourage it.
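
A sketch of that gallery pattern, with made-up file names and captions: each link points at the full size image and carries the caption in its title attribute.

```html
<ul id="gallery">
   <li>
      <a href="images/large/harbour.jpg" title="Fishing boats in the harbour at dawn">
         <img src="images/thumbs/harbour.jpg" alt="Harbour at dawn" />
      </a>
   </li>
   <li>
      <a href="images/large/market.jpg" title="Saturday market in the old town square">
         <img src="images/thumbs/market.jpg" alt="Market stalls in the old town square" />
      </a>
   </li>
</ul>
```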

Link content is almost code within the realms of the semantic web, as search engines like Google have been built upon following links and using their content in order to rank various web sites (of course this is only one of many aspects which Google takes into account, and search engine optimisation deserves another whole book’s worth of explanations).

Jakob Nielsen has also come up with interesting research from usability perspective on how the first 11 characters of the anchor text are enough for users to work out where the given link is going to lead to (see: First 2 Words: A Signal for the Scanning Eye).

All of this is part of good, semantic coding practices.

Further considerations

With emergence of micro blogging, services such as TinyURL have become incredibly popular, creating shortened versions of a link to longer versions of the (original) link.

Best practice for dealing with these types of links is to use the rel attribute and give it a canonical value.

For example:

<a href="http://tinyurl.com/d9oxbs" rel="canonical">http://tinyurl.com/d9oxbs</a>

It is also interesting to observe that the above outlined practices are not usually followed in these circumstances due to various system restrictions, hence canonical is required in order to signify that the shortened link’s only purpose is to save some valuable space in a typical Twitter message, for example.

This attribute value can also be used to denote web resources which have different URLs but are actually the same resource underneath. This is the usage Google documents, where it is placed on a <link> element in the <head> of the page rather than on an <a> tag.
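
As an illustration of the ‘same resource, different URL’ case, the sketch below follows the <link> form described in the Google post listed under further reading. The URLs are placeholders: the page is assumed to be reachable with an extra sort parameter and points back at its preferred address.

```html
<!-- Served at http://www.example.com/product.php?item=42&sort=price -->
<head>
   <title>Product 42</title>
   <!-- Tells search engines which URL is the preferred one for this content -->
   <link rel="canonical" href="http://www.example.com/product.php?item=42" />
</head>
```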

Further reading

Official Google Webmaster Central Blog: Specify your canonical

Semantic uses of <p> HTML tag

Monday, April 13th, 2009

What is it for?

<p> tag is a very nice, semantic tag simply intended for marking up paragraphs of text and that is its only purpose.

Appropriate uses

In a code block like this:

<p>This is an article about paragraph tag in HTML.</p>
<p>When we are marking up paragraphs in HTML we should use it for actual paragraphs of text and not for anything else. Paragraphs are only intended to be used for this purpose.</p>

we currently have two equally weighing paragraphs of text in terms of semantics.

However, as the author of this text, I actually intended the first paragraph to be a description of the second paragraph.

Semantically speaking, changing the above code to:

<p class="description">This is an article about paragraph tag in HTML.</p>
<p>When we are marking up paragraphs in HTML we should use it for actual paragraphs of text and not for anything else. Paragraphs are only intended to be used for this purpose.</p>

adds a layer of additional semantics to the code giving the first paragraph a little more meaning and implying that it is a description of some sort.

The reason why I used a class instead of an id is because I could have many different descriptions on one page in any given system.

Inappropriate uses

Empty line break

Wrapping text into a <p> tag creates a natural line break after the text and some developers are often tempted to use a <p> tag to wrap a non-paragraph text into it or even use a blank <p> for the purposes of creating a line break.

Using paragraphs instead of a list

Sometimes you will come across a list of some sort being implemented as a series of paragraphs.

Consider the following code block:

<p>Semantics are good.</p>
<p>Semantics help the world.</p>
<p>Semantics are good for Internet.</p>
<p>Semantics create better solutions.</p>

Each one of the points made in this list is not really a paragraph.

A paragraph is usually a longer piece of text, while here we are looking at a list of short points about semantics, which belong to the same ‘group’ of thoughts or points.

The above list of points would therefore much more appropriately be marked up as a <ul> block, an unordered list of points about semantics.
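
Marked up that way, the same four points would read:

```html
<ul>
   <li>Semantics are good.</li>
   <li>Semantics help the world.</li>
   <li>Semantics are good for Internet.</li>
   <li>Semantics create better solutions.</li>
</ul>
```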