Archive for the ‘Intermediate’ Category

Why are most web sites so bad?

Monday, July 6th, 2009

Overview

In the past you might have heard something like this being uttered: ‘Web is a very young industry compared to many others, so it’s only natural to expect many things not to work very well or at all!’

There is much sense in this statement, but many aspects of it are incredibly flawed too!

Software Engineering as a practice has been around for a relatively long period of time (at least 20 years) and many books have been written on the subject matter.

I studied software engineering formally some 10 years ago and learned common-sense principles, such as that software should be developed according to user needs from a very early stage.

You will often get the impression that ‘User Centered Design’ is somehow a ‘new approach’, which has not yet been well formulated and on which not enough research has been done to formalise it.

This could not be further from the truth.

I will now try and explain some of the reasons why I think so many web sites are so bad and regularly fail to work properly.

How most industries work

Think about the medical profession.

A typical doctor will spend approximately 7 years in formal University education in order to qualify as a doctor.

Before then, in order to even start studying medicine, students will need to prove that they are smart enough and capable enough to study the subject by achieving high level grades in their previous education.

The University Degree will teach medical students various theoretical aspects of medicine, biology and chemistry, and much time will be spent doing practical, lab-based work, while everything is likely to be discussed to the Nth degree.

After 7 years of intense formal studying and evaluation, medical students will be placed in a hospital where they will work-shadow various senior staff in order to understand the context of working in a hospital as well as see real life medical issues being dealt with.

Similar is the case if you would like to become a Gas Certified Engineer.

You will spend at least 4 years studying in order to obtain your formal qualification so that you can tell someone whether their Gas Central Heating system is at the right pressure level.

An important aspect of the above two examples is that Gas Certification Standards and Medical Books are developed and written by high-level authorities which set those standards.

How the IT industry works

Unlike the two example professions above, IT related industries for the most part require no official qualifications from someone before they can work as a developer.

In fact, many people will potentially frown upon those who come from Computer Science or other technical backgrounds.

Developers are asked to ‘prove themselves’ by showing a portfolio or a list of links to sites which they worked on, and their work is often judged as good or bad based only on the looks of those web sites.

Many times those interviewing IT people do not know much about IT themselves.

Many contributors to the IT industry are totally non-technical and never want to be technical at all.

In some cases people working in the IT industry quite openly hate IT itself too, while I find it difficult to imagine a brain surgeon hating the concept of operating.

The problem of people and process

Software Engineering is an incredibly complex undertaking.

Building a blogging system (which is a relatively simple matter) can be done in a very inappropriate manner, but it can also be done in such a way as to serve many other purposes.

Every piece of software, however simple or complex, is subject to this fact.

Building software is somewhat of a conveyor belt task, much like putting cars together in a factory.

In order for software to be built well, it requires each person in the building process to follow the process, as well as know what their tasks and responsibilities are.

In most cases, software development is not approached in this way, and this is one of the main reasons why most IT projects fail miserably.

Those who don’t know cannot set up a process

We are back to the initial issue – if someone is not trained and clued up on what they are doing, they cannot organise their work properly in order to do it properly on an ongoing basis.

It takes time, effort and experience to understand what makes a good process and a proper approach to software development.

This is also heavily related to the type of software a team is building.

Another important matter to consider is that untrained people are often unable to improve the processes they are working on, constantly being locked down in a vicious circle of struggling to make a bad approach work.

Even if they manage to deliver something using a bad approach/process, usually the delivered product does not meet even the basic quality standards.

In fact, a bad approach/process usually generates software which does not meet the actual business requirements, and those who worked on the product are often forced to explain why something ‘could not be done’ for whatever reason.

This ‘cannot be done’ attitude eventually seeps into the organisational culture and becomes the de facto standard way of (not) building software.

A moving target

One more major reason why most web sites are so bad is that the IT industry is a constantly moving and evolving target.

By the time a web site has been published on-line, in most cases it is already somewhat out of date, needing improvements, maintenance and taking care of.

This is the case with all software.

In order for web sites to be kept ‘fresh’, the people building them are required to keep their skills and knowledge fresh.

This requires passion and dedication, which most people do not have (enough of), producing average, slightly out of date products all of the time.

Semantic uses of <label> HTML tag

Monday, June 15th, 2009

What is it for?

W3C states quite clearly that:
The <label> element may be used to attach information to controls. Each <label> element is associated with exactly one form control.

The for attribute associates a label with another control explicitly: the value of the for attribute must be the same as the value of the id attribute of the associated control element. More than one <label> may be associated with the same control by creating multiple references via the for attribute.

A few interesting points from the above quote for developers to understand are:

  • <label> may be used which implies that strictly speaking it is not needed, although I will strongly argue that not using <label> with form elements would be very bad practice
  • One <label> can only label one form control, so it is impossible to use one <label> to label many form controls
  • Developers can use more than one <label> in order to label any given form control, which is potentially a very powerful construct

Appropriate uses

Generally speaking, the most common scenario when using the <label> element is the following type of association:

<label for="username">Username or email</label>
<input type="text" name="username" id="username" />

Each form field should have a label associated with it in order to achieve proper accessibility and usability.
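As a rough illustration of this rule, the JavaScript sketch below lists form controls which no <label> points at via the for attribute. The function name findUnlabelledControls is my own assumption for this example, and the check is deliberately simple: it only looks at explicit for/id associations and does not exclude hidden inputs or buttons.

```javascript
// Hypothetical helper (the name is an assumption, not a standard API).
// `controls` is an array or NodeList of objects with an `id` property.
// `labels` is an array or NodeList of objects with an `htmlFor` property,
// which is how the DOM exposes the for attribute of a <label>.
function findUnlabelledControls(controls, labels) {
  const labelledIds = new Set();
  for (const label of Array.from(labels)) {
    if (label.htmlFor) {
      labelledIds.add(label.htmlFor);
    }
  }

  const missing = [];
  for (const control of Array.from(controls)) {
    if (!control.id) {
      // Without an id a control can never be the target of a for attribute.
      missing.push('(no id)');
    } else if (!labelledIds.has(control.id)) {
      missing.push(control.id);
    }
  }
  return missing;
}
```

In a browser the two arguments could be collected with document.querySelectorAll('input, select, textarea') and document.querySelectorAll('label').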

Inappropriate use

Not using labels with form fields

From Google

<div class="section_error">
<input type="checkbox" onclick="GWO_updateContinueButton(this)" value="yes" name="ubac_accept"/>
<strong>Yes,</strong> I accept the above Terms and Conditions.
</div>

In the above code snippet Google exemplifies some of the worst practices in User Interface development.

Apart from using a non-semantic class name of ‘section_error’ for a non-error content snippet, they are writing in-line JavaScript, which is harder to maintain and does not achieve separation of concerns.

Most importantly, Google are not using any labels for the check box, hence they are damaging the accessibility and usability of the code snippet, as there is no explicit label associated with the check box to denote what the check box is for.

Considering that this code snippet appears in the context of a very simple documentation page, there is no excuse for Google’s engineers to produce User Interface code of such bad quality.

A re-worked, appropriate, semantic solution for this code snippet would read something along these lines:

<div id="termsAndConditions">
<input type="checkbox" value="y" name="termsOptIn" id="termsOptIn" />
<label for="termsOptIn"><strong>Yes,</strong> I accept the above Terms and Conditions.</label>
</div>

If really required, JavaScript functionality associated with this snippet should be situated in an external JavaScript file which associates the functionality by hooking onto the termsAndConditions ID attribute.
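A minimal sketch of what such an external file could contain is shown below. The function name bindTermsAndConditions and the onToggle callback are assumptions of mine; the original GWO_updateContinueButton logic is not shown in Google’s snippet, so it is represented here only by that callback.

```javascript
// Hypothetical external script (names are assumptions for illustration).
// Looks up the container by its termsAndConditions ID, finds the check box
// inside it and calls `onToggle(checked)` whenever the check box changes.
// Returns false when the markup is not present, so pages without the
// snippet are left untouched.
function bindTermsAndConditions(doc, onToggle) {
  const container = doc.getElementById('termsAndConditions');
  if (!container) {
    return false;
  }
  const checkbox = container.querySelector('input[type="checkbox"]');
  if (!checkbox) {
    return false;
  }
  checkbox.addEventListener('change', function () {
    onToggle(checkbox.checked);
  });
  return true;
}

// Possible usage in a browser, once the page has loaded:
//   bindTermsAndConditions(document, function (accepted) {
//     /* enable or disable the continue button here */
//   });
```

Passing the document in as a parameter is only a design choice to keep the function easy to test; the markup itself stays free of any onclick attributes.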

Progressive enhancement

Thursday, June 11th, 2009

So far I have touched on the topic of emerging semantic web structure, outlining the appropriate way in which we should layer web technologies.

This approach is often referred to as ‘progressive enhancement’.

It is the development approach designed to make sure that a web page works no matter which end user client it is accessed by.

When using progressive enhancement, you should make sure you code up your (X)HTML as semantically as possible first, then style things up with CSS and at the very end add JavaScript behaviours or AJAX enhancements.

This approach ensures that a web page is fully accessible even when JavaScript is not present or not fully supported by the client accessing the web page.
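The JavaScript layer of this approach can be sketched roughly as follows: the script first checks whether the client supports what it needs, and only then adds behaviour on top of the working (X)HTML. The function names and the particular features tested are assumptions chosen for illustration.

```javascript
// Hypothetical feature check: enhance only when the client offers the
// features the enhancement relies on (the exact list is an assumption).
function canEnhance(win) {
  return Boolean(
    win &&
    win.document &&
    typeof win.document.querySelector === 'function' &&
    'XMLHttpRequest' in win
  );
}

// Hypothetical entry point: `enhance` is called only when the features
// exist; otherwise nothing happens and the plain (X)HTML page keeps
// working through ordinary links and form submissions.
function progressivelyEnhance(win, enhance) {
  if (!canEnhance(win)) {
    return 'baseline';
  }
  enhance(win.document);
  return 'enhanced';
}
```

In a browser this would be called as progressivelyEnhance(window, someEnhancementFunction), where someEnhancementFunction is whatever behaviour the page adds.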

Progressive enhancement is more work?

Following the progressive enhancement approach, however, can prove to be much more cumbersome and time consuming than not following it.

This is especially the case when building web applications, such as GMail or Google Calendar.

Web applications can often require two parallel solutions to be engineered for circumstances where JavaScript is available and for circumstances where it is not.

This is one of the reasons why GMail, for example, has an ‘HTML only’ version available as a separate link, while the main version of GMail does not work with JavaScript switched off.

Another likely reason why GMail does not work without JavaScript is that Google engineers use GWT (Google Web Toolkit, written in Java) to create all their interfaces.

GWT spews out user interface code which completely breaks the above outlined good practices, yet creates code which still works cross browser.

Since Google engineers tend to think about back end logic of applications as a matter of priority, their outlook on web application development can be described as graceful degradation.

Graceful degradation

Unlike progressive enhancement, the graceful degradation approach assumes all technologies are supported and ensures that users have the best possible experience when they are enabled.

Once this is achieved, consideration is given to circumstances where certain technologies are not supported and how to deal with them.

Graceful degradation is arguably the more appropriate approach in the short term, as the main business objective is to create a working interface for the main user group as soon and as cheaply as possible, while serving the minority user group if and when possible.

This is how most of the very profitable web sites have been developed.

Hard to maintain software is dead software

In the long term, however, the graceful degradation approach will more than likely mean that a given application is much harder to maintain and update, as the code is not developed according to semantic standards.

This can mean that future development and progress can only be achieved by completely re-writing a solution from scratch, which can lead to long down times, difficult migration processes, massive costs of re-development, stifled innovation and many other unwanted side-effects.

An extremely good example of this is MySpace, one of the worst applications ever developed, which has seen very little to no improvement in the last few years. This is more than likely due to the poor quality of the initial development and extremely bad User Interface code quality.

MySpace’s poor quality user interface code also means it is very tricky to create nice skins for MySpace profile pages using CSS, as the underpinning HTML code is very poor.

This means that MySpace’s main sales pitch of enabling users to customise their profiles however they like is flawed due to the inflexibility in the coding deployed by MySpace developers.

Using graceful degradation also creates a harder environment for developers of various skill sets to work within; in Google’s setup, for example, any developer working on GMail will have to be a wizard in GWT.

In semantically developed interfaces (i.e. progressively enhanced) a company can employ a relatively cheap CSS developer to develop themes for their CMS or web application without ever needing to touch HTML or JavaScript.

In this environment the company developing the solution has the ultimate flexibility and control over their toolsets.

In progressively enhanced interfaces it is also much easier to delegate work amongst developers, as one person can work on CSS, another on HTML and the third on JavaScript, without developers stepping on each other’s toes all the time.

Biggest challenges for creating semantic web

Monday, April 13th, 2009

(Non-)adoption of one universal standard

One of the main problems with the semantic web is the fact that standards are not developed enough.

We currently have the problem of next generation HTML being either XHTML2.0 or HTML5 (or both!), which would lead to more technologically disparate user interface implementations across the web.

In a very short period of time from now we may have interfaces written in HTML4.01, HTML5, XHTML1.0 (Strict, Transitional and Frameset) and XHTML2.0.

This is already quite a ‘stew’ of code, which is going to be very difficult for browsers to render.

It’s a common case today to be coding up a contact page, written in standards compliant XHTML1.0 Strict, which needs to include a Google Map, which is written in non-standards compliant HTML4.01.

The end result is a mish-mash solution which really makes no sense in terms of standards and long term validity.

Machine readability

The aim of systems is to automate mundane tasks.

This can only be achieved by using data structures which are easily machine readable and which developers can easily work with.

The challenge of automating mundane tasks is no longer about generating reports from a database once a month, but about working out complex connections between people, phrases, terms, web pages and so on.

The systems which automate these tasks successfully (like Google and FaceBook) usually end up being very successful, highly used and profitable.

They are also incredibly scalable and very useful for the end user – imagine for example what the whole web experience would be like if Google did not exist.

Human readability

Human readability is equally important, as standards really ought to be usable by everyday, non-technical individuals.

This is one of the main reasons why HTML has become so incredibly popular and therefore why the whole concept of the Internet has taken off to such a great extent.

With the emergence of many different standards, the readability and learnability of implemented user interface solutions is reduced, which essentially stifles further development of the Internet.

Emerging versions of technologies and standards are increasingly running into problems of human non-readability and lower usability with each iteration.

The more standards exist, the less integrated the semantic web becomes and the more there is for developers to learn.

If an average developer needs to know, for example, 3 CSSs, PHP, XHTML, HTML5, MySQL, CSS and JavaScript with JQuery AJAX in order to implement a modern User Interface, it is clear that there are going to be fewer people developing innovative solutions compared to when there was only HTML and CSS to worry about.

Standards authorities

There are a number of problems regarding the standards authorities on the web.

Generally speaking, standards authorities fall into the following groups:

  1. Socialists: Human usability proponents who are usually not highly technical or not technical at all
  2. Developers: Machine readability proponents who are usually very technical, but have little or no consideration for usability and user interaction needs and requirements
  3. Corporations: Large corporations, who inevitably want to create ‘standards’ which only benefit their bottom lines (e.g. Apple, Google, Yahoo, Microsoft, IBM, etc.)

Each one of these groups has an important role to play in the development of future and current standards, but in practice they do not collaborate enough, hence the standards tend to edge towards the benefit of only one of those groups and not everyone.

The above groups have the following impact on standards development:

  1. Socialists: Promote best end user experience and most collaborative and easy to use solutions
  2. Developers: Promote easiest and most flexible technologies to work with which offer viable solutions to solving complex problems of all sorts
  3. Corporations: Impose standards by deploying them across their massive platforms (e.g. FaceBook’s 200M users is a good start towards deploying any standard on the web)

The only way proper standards can be achieved is by all groups collaborating constantly, while in practice this happens nowhere near as often as it should.

The end result is a web which is more ‘broken’ than ‘fixed’, with groups of people developing for different platforms, depending on what they have decided to support, and most of the time never really supporting everything that should be supported.

Pareto analysis of CSS

Friday, April 3rd, 2009

Highlighted below are the most widely used CSS declarations.

These are used mostly because most of them are supported in all browsers, they are quickest and shortest to write and make most sense in terms of design of usable and accessible interfaces.

Once again, Pareto analysis shows itself in a similar way to how it did with HTML (see Pareto analysis of HTML), which implies that the CSS implementation is possibly more extensive than it needs to be for most web site types.

  1. :active
  2. :after
  3. :before
  4. :first-child
  5. :first-letter
  6. :first-line
  7. :focus
  8. :hover
  9. :lang
  10. :link
  11. :visited
  12. background
  13. background-attachment
  14. background-color
  15. background-image
  16. background-position
  17. background-repeat
  18. border
  19. border-bottom
  20. border-bottom-color
  21. border-bottom-style
  22. border-bottom-width
  23. border-collapse
  24. border-color
  25. border-left
  26. border-left-color
  27. border-left-style
  28. border-left-width
  29. border-right
  30. border-right-color
  31. border-right-style
  32. border-right-width
  33. border-spacing
  34. border-style
  35. border-top
  36. border-top-color
  37. border-top-style
  38. border-top-width
  39. border-width
  40. bottom
  41. caption-side
  42. clip
  43. color
  44. counter-increment
  45. counter-reset
  46. cursor
  47. direction
  48. display
  49. empty-cells
  50. float
  51. font
  52. font-family
  53. font-size
  54. font-size-adjust
  55. font-stretch
  56. font-style
  57. font-variant
  58. font-weight
  59. left
  60. letter-spacing
  61. line-height
  62. list-style
  63. list-style-image
  64. list-style-position
  65. list-style-type
  66. margin
  67. margin-bottom
  68. margin-left
  69. margin-right
  70. margin-top
  71. max-height
  72. max-width
  73. min-height
  74. min-width
  75. outline
  76. outline-color
  77. outline-style
  78. outline-width
  79. overflow
  80. padding
  81. padding-bottom
  82. padding-left
  83. padding-right
  84. padding-top
  85. position
  86. right
  87. table-layout
  88. text-align
  89. text-decoration
  90. text-indent
  91. text-transform
  92. top
  93. visibility
  94. white-space
  95. word-spacing
  96. z-index

The use of the above highlighted declarations is most common based on the projects I have worked on so far (small and large) and various analyses I have done on other web sites in the wild.
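One rough way to run this kind of analysis on a stylesheet is to count how often each property name occurs and then see how few properties account for most of the declarations. The JavaScript below is only a sketch under stated assumptions: the function names are my own, and the regular expression is a simplification which strips comments but does not understand at-rules, strings or nested blocks.

```javascript
// Hypothetical helper: counts property names in a CSS string.
// A property is taken to be any identifier that follows '{' or ';'
// and is itself followed by ':'. Returns a Map of name -> count.
function countCssProperties(cssText) {
  const withoutComments = cssText.replace(/\/\*[\s\S]*?\*\//g, '');
  const counts = new Map();
  const pattern = /[{;]\s*([a-zA-Z-]+)\s*:/g;
  let match;
  while ((match = pattern.exec(withoutComments)) !== null) {
    const name = match[1].toLowerCase();
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return counts;
}

// Hypothetical helper: returns the most used properties, in descending
// order of use, until they cover at least `share` (0..1) of all
// counted declarations.
function propertiesCovering(counts, share) {
  const sorted = Array.from(counts.entries()).sort(function (a, b) {
    return b[1] - a[1];
  });
  const total = sorted.reduce(function (sum, entry) {
    return sum + entry[1];
  }, 0);
  const selected = [];
  let covered = 0;
  for (const [name, count] of sorted) {
    if (total === 0 || covered / total >= share) {
      break;
    }
    selected.push(name);
    covered += count;
  }
  return selected;
}
```

Calling propertiesCovering(countCssProperties(css), 0.8) on a real stylesheet gives a quick, if crude, picture of which declarations do most of the work.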

It does not mean that you are likely to use all of these declarations on all web sites.

It is also the case that some declarations are used for, essentially, the wrong purposes. A great example is the use of line-height in order to position elements vertically, which, over time, can end up being a counter-productive approach on larger web sites.

Nevertheless, line-height is a worthwhile declaration to utilise, for example in order to make text easier to read, as it is generally agreed that some extra spacing between lines of text makes it easier to read off the screen.

Who builds software?

Tuesday, March 31st, 2009

This is perhaps a strange title, which could have a simple answer: developers and/or engineers.

In reality, however, modern software solutions work in much more sophisticated and complicated ways, even though they may not be apparent or present on most web solutions as yet.

The overall trend in software engineering is heading towards user generated content, user generated layouts and even user generated functionality.

It is also true that software is built by some or all of the following too:

  • Business stake holders
  • Tools used by engineers to build software (see ‘Overview of Code Editors’)
  • Software architects
  • Automated back end systems which spew out content (and code), etc.

All this is fine, but the ultimate aim of the web should be towards re-usability of everything that is created, by whatever means it was created.

All these methods of content and code generation are potential friends, but in most cases fierce enemies, of the semantic web.

Problems in user generated code and content

Here are some examples where semantic web solutions are jeopardized:

  • A back end engineer rarely produces a good piece of User Interface code
  • Microsoft Front Page will never create a good piece of User Interface code
  • A software architect may produce a prototype which is only semi-semantic, but ends up being used as-is in the final solution due to (perceived) time and money constraints on a given project
  • Automated back-end systems spew out content which, at the time of development, may have been considered great, but today is seen as semantically invalid
  • There are many examples where code and content are mixed together, virtually inseparable, and do not conform to the Emerging Semantic Web Structure

Many Java libraries for AJAX produce inline JavaScript code which works in a browser or two, but would fail every single semantic test thrown at it.

Why? Because it was produced by a Java back end developer, who has likely never heard about unobtrusive JavaScript and will likely not hear about it for a while as ‘it’s outside of his area of expertise’.
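To make the problem concrete: inline JavaScript usually shows up as event handler attributes such as onclick written directly into the markup. The sketch below is a crude way of spotting them in generated output. The function name is an assumption, and because it relies on regular expressions rather than a real HTML parser it can be fooled by attribute-like text inside scripts, comments or attribute values.

```javascript
// Hypothetical check: lists inline event handler attributes (onclick,
// onchange, ...) found inside the tags of an HTML string.
// Regex-based approximation only, not a substitute for an HTML parser.
function findInlineHandlers(html) {
  const found = [];
  const tagPattern = /<[a-zA-Z][^>]*>/g;
  const attrPattern = /\s(on[a-z]+)\s*=/gi;
  let tag;
  while ((tag = tagPattern.exec(html)) !== null) {
    attrPattern.lastIndex = 0; // restart the attribute search for each tag
    let attr;
    while ((attr = attrPattern.exec(tag[0])) !== null) {
      found.push(attr[1].toLowerCase());
    }
  }
  return found;
}
```

Markup written with unobtrusive JavaScript in mind should produce an empty list, since all behaviour is attached from an external file.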

The Internet is moving towards the age of user generated interfaces, while in many cases the basics have not yet been put into proper practice.

Oftentimes code is also generated directly by administrator users of systems, writing ‘content’ through a WYSIWYG interface.

I have witnessed those users put whole web pages into sections of another page, because that’s the only way they knew how to copy content from one web page into a CMS.

Needless to say, this practice creates a whole bunch of junk web pages which are not fit for any purpose whatsoever, let alone being semantic in nature and reusable.

Our aim as web professionals should be to create semantic solutions at all levels, no matter how content and code for those solutions is produced.

End user generated code and content will pose even bigger challenges to engineers, developers and architects who will really need to think hard how to create solutions which are truly ubiquitous.

Whatever or whoever ends up building your software, it or they need to be able to produce a reusable, semantic solution.