Scope for semantic expression

A recurring theme in the many discussions I have had with various average developers is that current technologies do not allow enough room for ‘semantic expression’.

People complain that we do not have enough HTML tags to mark up everything we need to build on the web sites we work on.

People propose additions to specifications such as HTML that are tailored too narrowly to particular web site types (see Web site types), forgetting that the web was built upon much more generic underlying standards and principles.

Previously I wrote about Pareto Analysis of HTML and Pareto Analysis of CSS, both of which illustrated the fact that most web site types only require around 20% of the available technologies in order to be successfully implemented.

If the HTML specifications were insufficient for proper semantic expression, we would be using 100% of HTML tags all the time and would be struggling to express the rest because of the shortcomings of the specification.

In reality, we often fail to use the specification and standards, or in worse cases abuse them, and that is what leaves us unable to express ourselves semantically.

Here are some of the building blocks within the mainstream web technologies which allow us to express the meaning of the information we mark up on a regular basis.

Meta data

Meta data is one of the least utilised aspects of any web page and one of the most misunderstood, by developers, users and search engines alike (the search engines have arguably undermined the purpose of meta data even further by accepting bad developer practices as a fact of life).

Meta data exists to describe the purpose and meaning of a given web page as a whole.

If search engines such as Google designed their ranking algorithms to reward systems with good, properly implemented meta data, the web would be of much higher quality.

Meta data implementation is simple and straightforward, and it is supported by good CMS solutions.

With further extensions to meta data standards (see Dublin Core) it would be possible to implement document exchange mechanisms across various web systems, which would give enormous leverage to the semantic web.
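
To make the idea concrete, here is a minimal sketch in Python of how a machine might read page-level meta data, including Dublin Core style entries. The sample page, its values and the names MetaCollector and extract_meta are assumptions made up for illustration; only the standard library html.parser module is used.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name] = content


# Hypothetical page; every value here is invented for the example.
SAMPLE_PAGE = """
<html>
  <head>
    <title>About us</title>
    <meta name="description" content="Who we are and what we do">
    <meta name="keywords" content="company, team, history">
    <meta name="DC.title" content="About us">
    <meta name="DC.creator" content="Example Author">
    <meta name="DC.language" content="en">
  </head>
  <body><p>Hello</p></body>
</html>
"""


def extract_meta(html_text):
    """Return a dict of all named meta entries found in the page."""
    collector = MetaCollector()
    collector.feed(html_text)
    return collector.meta


page_meta = extract_meta(SAMPLE_PAGE)

# Dublin Core entries are conventionally prefixed with "DC." in HTML.
dublin_core = {k: v for k, v in page_meta.items() if k.startswith("DC.")}
```

Any system that knows the same naming convention can pull the title, creator and language of a document without understanding anything else about the page.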

CSS class names

Class names are also misunderstood by developers.

One of the main cornerstones of Object Oriented development is the notion that a class name becomes a living part of the system.

A reusable piece of code named ‘str2int’, for example, is short and concise, but it is not necessarily easy to understand, remember and work with once it becomes widely accepted and reused.

Similarly, naming CSS classes ‘block’, ‘left’, ‘morePadding’ or ‘left_margin’ does not give the names long-term acceptability and does not make the overall solution easier to understand.

Class names like ‘product’, ‘hCard’, ‘organisation’ or ‘component’ are much more useful and can be made part of any system and reused over and over just as though they were part of the official HTML specification.

This is in line with the general best practices of Object Oriented software development and makes sense to both machines and humans, in the short, medium and long term.

HTML tags

HTML tags themselves are highly semantic and serve a very good purpose for web documents.

Developers often abuse the tags: using paragraphs for layout, failing to mark up lists properly, using <fieldset> where a <div> is intended, and so on.

These abuses of the specification inevitably lead to bad semantics, a lack of standards compliance and the perception of a ‘lack of tags’, even though most problems can easily be resolved by applying proper coding principles.

HTML structure

Sometimes one or two HTML tags are by no means enough to mark up a more complex piece of data.

In many instances we have a paragraph of text containing a few acronyms which need to be additionally marked up.

This is what HTML is all about: allowing us to combine atomic-level tags into complex data structures.

Creating an XHTML page is essentially no different to creating an XML data structure.

It ought to be completely reusable at all levels and make as much sense as possible in and out of context.

This is the ultimate goal of excellent information architecture at technical and non-technical levels.
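
As a small illustration of treating markup as a data structure, the following Python sketch builds a paragraph containing two marked-up abbreviations using the standard library xml.etree.ElementTree module. The sentence and the choice of the abbr element (rather than acronym) are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Build <p> as a tree: text, then child elements, each followed by "tail" text.
paragraph = ET.Element("p")
paragraph.text = "Pages written in "

xhtml = ET.SubElement(
    paragraph, "abbr", {"title": "Extensible HyperText Markup Language"}
)
xhtml.text = "XHTML"
xhtml.tail = " can be styled with "

css = ET.SubElement(paragraph, "abbr", {"title": "Cascading Style Sheets"})
css.text = "CSS"
css.tail = "."

# Serialise the tree back to markup.
markup = ET.tostring(paragraph, encoding="unicode")

# The same structure can be read back and queried out of context.
expansions = {a.text: a.get("title") for a in ET.fromstring(markup).findall("abbr")}
```

The paragraph and its abbreviations remain meaningful whether they are read as part of the page or lifted out and reused elsewhere.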

URL structure

An often overlooked aspect of semantic expression is the URL structure itself.

A link like www.flexewebs.com/p.php?id=12 does not mean anything, while www.flexewebs.com/about-us makes much more sense to humans and computers.

URL structures could be heavily standardised (see Cool URIs for the Semantic Web) and could help machines and humans navigate the web more easily, faster and better.

URL structures are also an incredibly powerful means of shaping the overall architecture of a web system. The Flickr team used them this way: they started their overall design from the design of the URLs.
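
The difference between the two example links can be seen by splitting them into parts, as in this rough Python sketch using the standard library urllib.parse module. The http:// scheme is an assumption added so that the links parse as full URLs, and describe_url is a hypothetical helper name.

```python
from urllib.parse import urlparse, parse_qs


def describe_url(url):
    """Split a URL into the parts a human or a machine can read meaning from."""
    parts = urlparse(url)
    return {
        "host": parts.netloc,
        # Non-empty path segments, e.g. "/about-us" -> ["about-us"]
        "path_segments": [s for s in parts.path.split("/") if s],
        # Query parameters as a dict of lists, e.g. "id=12" -> {"id": ["12"]}
        "query": parse_qs(parts.query),
    }


opaque = describe_url("http://www.flexewebs.com/p.php?id=12")
readable = describe_url("http://www.flexewebs.com/about-us")
```

In the first case all that can be recovered is a script name and a number; in the second the path itself names the resource.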

XML

XHTML is an application of XML, and that is the main reason why XHTML should be the preferred technology choice over HTML.

XHTML provides systems with a practical ability to offer ‘user interfaces as APIs’, where any system should be able to harvest data from a well constructed XHTML web system and reuse that information elsewhere if needed.

Knowing and understanding best practices in XML is therefore absolutely critical to creating great quality, scalable, semantic web user interfaces which retain their value over a long period of time.
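
A minimal sketch of the ‘user interfaces as APIs’ idea, in Python: because a well-formed XHTML page is XML, a standard XML parser can harvest data from it by semantic class name. The sample page, the class names ‘name’ and ‘price’, and the function names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical catalogue page; the products are invented for the example.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Catalogue</title></head>
  <body>
    <div class="product">
      <h2 class="name">Blue kettle</h2>
      <span class="price">19.99</span>
    </div>
    <div class="product featured">
      <h2 class="name">Red toaster</h2>
      <span class="price">34.50</span>
    </div>
    <div class="sidebar"><p>Not a product</p></div>
  </body>
</html>"""


def has_class(element, class_name):
    """True if class_name is one of the element's space-separated classes."""
    return class_name in element.get("class", "").split()


def child_text(element, class_name):
    """Text of the first descendant carrying class_name, or None."""
    for child in element.iter():
        if has_class(child, class_name):
            return (child.text or "").strip()
    return None


def harvest_products(xhtml_text):
    """Collect name and price from every element marked up as a product."""
    root = ET.fromstring(xhtml_text)
    products = []
    for div in root.iter("{" + XHTML_NS + "}div"):
        if has_class(div, "product"):
            products.append(
                {"name": child_text(div, "name"), "price": child_text(div, "price")}
            )
    return products


products = harvest_products(SAMPLE_XHTML)
```

Note that the harvesting code relies only on the meaning-bearing class names, so the page can be restyled freely without breaking anything that reuses its data.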

Content

Last but not least, the very content of a given widget is the icing on the cake of semantic expression.

After we have marked up our widget with all the necessary structural markup, the content which it contains gives the widget its final meaning and purpose.

Writing good and relevant content for the web should therefore also be considered an important part of building semantic web user interfaces.

The content ought to make as much sense when taken out of context as the code does, and achieving that gives the final solution its real power.

Written by Jason Grant, BSc, MSc on 6th April 2009

2 Responses - Join the conversation

  1. My class name are not that beautiful yet. I have to dive deeper to the codes. I enjoy your writing here, Jason. You have read a lot of resources, I think. Thank you. :)

    Will you open some web/blog consultation or review here?

    dani on 21st April
  2. Dani,
    Thanks for your kind comments.
    You are welcome to help me out here by:

    Reviewing the posts
    Contributing with your own content
    Promoting this blog to others
    Suggesting what you want me to write about
    Any other ways you see fit

    Jason Grant on 21st April
