Semantic Web 1: Semantics – what is an ontology?

To a computer, the Web is a flat, boring world, devoid of meaning. This is a pity, as in fact documents on the Web describe real objects and imaginary concepts, and give particular relationships between them. For example, a document might describe a person. The title document to a house describes a house and also the ownership relation with a person. Adding semantics to the Web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values. Only when we have this extra level of semantics will we be able to use computer power to help us exploit the information to a greater extent than our own reading.

– Tim Berners-Lee “W3 future directions” keynote, 1st World Wide Web Conference Geneva, May 1994

When we speak of web 3.0 and the semantic web we focus on computer processing/understanding of web content.  Currently web sites are ‘marked up’ to make them easier for us, as readers of the site, to follow.  Using HTML, certain text is marked as a ‘header’, certain text is marked as ‘bold’, as indented, etc.  All of this helps us, as humans, to read and follow/understand the content.  But, more importantly, we understand much of the content based on our own knowledge, the context of each phrase/sentence, etc.

So, how much of this data on the web could be processed (‘understood’) by computers, analysed and presented back to us as humans in a useful format e.g. categorised, annotated, summarised, ranked, etc?  Broadly there are two possible ways forward: software which can figure out what the content is about (Natural Language Processing etc.) or some additional ‘marking-up’ of the content – to flag what specific terms/ words/ phrases mean.
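As a rough illustration of the ‘marking-up’ route, here is a minimal Python sketch.  It assumes that phrases on a page have already been flagged with a concept identifier; the page names, phrases and `concept:` identifiers are all made up for the example and do not come from any real vocabulary.  The point is only that once two differently worded pages point at the same concept, a program can group them without ‘reading’ the text.

```python
from collections import defaultdict

# Hypothetical marked-up pages: each flagged phrase points at a concept identifier.
# All page names, phrases and identifiers are invented for illustration.
PAGES = {
    "page1": [("heart attack", "concept:MyocardialInfarction")],
    "page2": [
        ("myocardial infarction", "concept:MyocardialInfarction"),
        ("aspirin", "concept:Aspirin"),
    ],
}


def index_by_concept(pages):
    """Group page names by the concepts their marked-up phrases refer to."""
    index = defaultdict(set)
    for page, annotations in pages.items():
        for _phrase, concept in annotations:
            index[concept].add(page)
    return dict(index)


if __name__ == "__main__":
    for concept, pages in sorted(index_by_concept(PAGES).items()):
        print(concept, sorted(pages))
```

The two pages use different wording (‘heart attack’ versus ‘myocardial infarction’), yet both end up under the same concept, which is the kind of categorisation described above.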

Natural Language Processing is a major area: a huge amount of research has been completed and is ongoing, and major advances have been made over the years.

On the mark-up front there have also been significant advances and product offerings.

One core element in all of this computer processing/‘understanding’ is agreement on the meaning of terms/concepts, hence the use of the term ‘semantics’.  We are all familiar with the phrase often used in trying to resolve/advance arguments: ‘it’s a question of semantics’.  Generally the intent of the phrase is to say that the antagonist and protagonist agree conceptually but that much of the disagreement is accounted for by each party understanding the terms being used differently.

Dealing with concepts, their relationships and meanings is addressed using ONTOLOGIES.  The semantic web has given rise to a whole field in the development, publication and maintenance of ontologies.  Rather than trying to explain ‘ontologies’ in detail here I think this short video – focused on introducing ‘biomedical ontologies’ – does a great job of explaining the concept of and use for ontologies.
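For readers who prefer something concrete, here is a toy sketch in Python of what ‘concepts and their relationships’ can look like to a program.  It assumes an ontology can be approximated as a set of (subject, relationship, object) triples; the terms and the `is_a` and `treats` relationship names are made up for illustration and are not taken from any real biomedical ontology.  Real ontology languages are far richer than this.

```python
# Toy ontology as (subject, relationship, object) triples.
# Every term and relationship name here is invented for illustration.
TRIPLES = {
    ("Aspirin", "is_a", "Analgesic"),
    ("Analgesic", "is_a", "Drug"),
    ("Aspirin", "treats", "Headache"),
    ("Headache", "is_a", "Symptom"),
}


def ancestors(term, triples):
    """Return every concept reachable from `term` by following is_a links."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relationship, obj in triples:
            if subject == current and relationship == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_a(term, concept, triples):
    """True if `term` is `concept` itself or a (possibly indirect) kind of it."""
    return term == concept or concept in ancestors(term, triples)


if __name__ == "__main__":
    print(sorted(ancestors("Aspirin", TRIPLES)))
    print(is_a("Aspirin", "Drug", TRIPLES))
```

Nothing in the triples says directly that Aspirin is a Drug; the program works that out by following the `is_a` chain, which is a small example of the shared, machine-usable meaning an ontology is meant to provide.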

Ontologies and the challenge to IT leaders

The recent Technology Forecast publication from PwC focused on semantic web and linked data.  An interesting series of articles, and I like the concept of dealing with ‘messy data’.  CEOs and other managers want to be able to merge internal ERP-type data with external data.  It also reminds readers that the I in CIO is for information, and that CIOs need to take the lead on the generation and planning of relevant ontologies, given a clear understanding of their businesses and a working knowledge of ontologies.

On the same theme, an interesting piece by Linda Moulton: the line ‘enterprises must commit to having very smart people with enterprise expertise to build the ontology’ rings the same bell.

Linda Moulton believes that real progress in adoption of the semantic web will be seen first within enterprises, later between enterprises and across the web more generally.  This seems to make a lot of sense and to be the most likely scenario, however much we may prefer the more holistic solution to emerge immediately.

Microsoft and the semantic web

Microsoft and Creative Commons announced last week the release of the Ontology Add-in for Microsoft Office Word 2007 that will enable authors to easily add scientific hyperlinks as semantic annotations, drawn from ontologies, to their documents and research papers.