Why publish your data?

Attended the ‘Opening Up Government Data’ day in DERI, in Galway, today.  Some interesting presentations and demonstrations of tools such as Google Public Data Explorer and Simile Widgets – to enable you to work with linked open data.

In one of our breakout sessions we discussed why companies would choose to share their data – i.e. publish it in formats which are easy for others to consume (e.g. csv, not pdf).  Granted, many businesses have web sites describing their offerings, providing some background on the company, potentially inviting comment or queries and answering some Frequently Asked Questions.  But very few offer much data arising from their product or market research, their production statistics or their sales campaigns.  In general they would regard this information as confidential and constituting some element of their ‘competitive advantage’.
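
To illustrate why a format such as csv is easy for others to consume, here is a minimal sketch using only the Python standard library. The column names and figures are invented for illustration; they are not data from the seminar or from any real company.

```python
import csv
import io

# Hypothetical published data: columns and figures are made up for this example.
PUBLISHED_CSV = """region,year,units_sold
East,2010,1200
West,2010,950
East,2011,1340
"""

def total_units_by_region(csv_text):
    """Sum the units_sold column per region from csv text."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        region = row["region"]
        totals[region] = totals.get(region, 0) + int(row["units_sold"])
    return totals

print(total_units_by_region(PUBLISHED_CSV))
```

The same figures locked inside a pdf would first have to be extracted and cleaned by hand or with specialised tooling before any such summing could happen.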

Today’s seminar really focused on government publishing data which, it might be argued, belongs to the citizens.  However, understandably, there was plenty of discussion around this in terms of the efforts required to publish it, the potential ownership of the data, maintaining the data going forward, etc.  There was some discussion as to whether there should be a charge for this data on the basis of the costs associated and the potential for companies to generate some commercial benefit.  Plenty of solid reasons were put forward for publishing the data – transparency, accountability, etc.

Much of this debate brought me back to thinking about privacy, Mark Zuckerberg’s general approach with Facebook, and differing attitudes to publishing personal information on sites such as Facebook, Twitter, Google+, Foursquare, etc.  Why do some people choose to share their views on politics, on the economy, their location in a restaurant – whereas others want nothing shared?

I think this question ‘Why publish your data?’ can be addressed in all of these contexts – individuals, government and corporates.  And the answer is – because the person or the organisation sees some value in its publication.  At the personal level the social networks and smart phones have made publishing data so much easier.  What we are now seeing emerge for business and government is a range of platforms and tools which make all of this a lot easier for government and companies.  And a little like individuals – not sure that anyone has really worked out where all of this is taking us.

Some newspapers have figured it out – by making their content available, by marking it up semantically, they become more relevant to more entities for longer.  But that’s a little different to publishing data which is a product of lots of research completed at your own cost – on the basis that it’s good for society or that if I do it then someone else will publish other data in exchange which I can exploit.


Tackling human intelligence

I was drawn to the semantic web and semantic technologies because of the potential benefit to each of us.  There is no debate about the growing volumes of data – be that in our personal digitally recorded lives, our business lives or more generally on the World Wide Web.  So tools/solutions which assist in processing, analysing or making sense of some of this data seem attractive to me.  Part of the challenge is trying to have software do some of the heavy lifting.  Much of the data which is potentially subject to heavy lifting was originally published for human consumption and is not ideally formatted for consumption by software.

So semantics has its place.  Can we deal with the ambiguity in the data?  In Australia a reference to football may mean ‘Australian Rules’ football, in England it may mean ‘soccer’, in Ireland it may mean ‘Gaelic football’.  So if I have a piece of software doing some heavy lifting across the web to analyse performances of ‘football full backs’ during the third weekend in December 2009, my software may be confused – may mix up different codes, etc.  I may be able to define my search/query in great detail but perhaps the data as originally published does not provide the required clarity – risking ‘a question of semantics’.
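
As a rough sketch of the kind of disambiguation involved, the snippet below resolves the word ‘football’ using the country a report was published in. The lookup table and the function name are hypothetical, and country of publication is only a crude stand-in for the explicit semantic markup that would really be needed.

```python
# Hypothetical lookup: the code of football most likely meant in each country.
FOOTBALL_BY_COUNTRY = {
    "Australia": "Australian Rules football",
    "England": "soccer",
    "Ireland": "Gaelic football",
}

def resolve_football(term, country):
    """Return a specific meaning for 'football' if the country is known."""
    if term.lower() != "football":
        return term  # nothing to disambiguate
    # Without a known country, flag the term rather than guess.
    return FOOTBALL_BY_COUNTRY.get(country, "ambiguous: football")

reports = [
    {"text": "football", "country": "Australia"},
    {"text": "football", "country": "Ireland"},
    {"text": "football", "country": None},
]
for report in reports:
    print(report["country"], "->", resolve_football(report["text"], report["country"]))
```

The third report shows the problem in the paragraph above: when the published data carries no context, the software can only flag the ambiguity.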

I was quite taken by the piece ‘Paul Allen: the singularity is not near’ published this week in MIT’s Technology Review.  Ray Kurzweil’s thoughts on computer systems bypassing human intelligence in the near future are well known and documented.  Paul Allen and Mark Greaves argue strongly that Kurzweil is being over-optimistic (depending on your viewpoint).  They include a number of examples from neuroscience and artificial intelligence, arguing that we will be a long way short of Kurzweil’s vision in 2045 – Kurzweil’s date.

Much of this took me back to the simplicity of what we are trying to achieve in semantics/ semantic web – the heavy lifting.  And it’s not proving very simple.  Yes, the search engines and various semantic tools are presenting improved, cross referenced, even multi-correlated data – but we have an awfully long way to go.




Do we need to remember anything?

Recently read ‘Moonwalking with Einstein’ – good piece on training and testing memory.  Attending Semtech UK in London this week.  Excellent presentations by Madi Weland Solomon of Pearson and Jem Paul Reyfield of BBC.  With proper use of these semantic solutions do we really need to remember anything?

Are you distracted when watching television?

In this era of smartphones and tablets, even when we do watch the traditional television we tend to have another device on the go.  At the most basic level people continue to text while watching television.  Many will be engaged in social networking while watching television, e.g. commenting on Twitter or Facebook while watching a movie or a sports event.  Also, if online, you have access to sites which may provide more information about the event, e.g. player statistics, match statistics, actor profiles, etc.  And, of course, in many cases you will actually be watching the TV programme on a device such as a smartphone or a notepad.

Interesting piece here about potential use of semantic tags/links to improve the TV experience in this environment.  The general observation of the negative impact of traditional TV advertising is very interesting – given that there may be more effective ways to advertise through use of metadata/semantic tags.

…and of course the usual challenge – lack of agreed standards for all of this.



Reflecting on 2010 – in Dublin, Ireland

Reflecting on 2010 – real challenges for Ireland, some interesting technologies, the need for creative genius

It’s been a pretty frightening year on the economic front, here in Dublin, Ireland.  Finally, despite all the protestations of the Government the EU and IMF rode into town.  A deal has been done – premised on significant growth it might be doable…if the growth does not materialise – then eventually some debt will have to be written off.

On the technology front – for me personally the smartphone wins out (currently favouring the Android platform): greater access and availability wherever you are (wherever I am).  Seems to me the Cloud has matured into something that is not going away – in fact that looks like it will win out.  I think the objections will be addressed and moved aside. On the semantic web front – lots of activity from various providers of tools/ solutions using semantic technology. Disappointing, given the presence of DERI in Ireland, that we do not see more publicity/ traction within our own smart economy.  And we trail other countries dismally on initiatives to push publication of data (using linked open data standards)  by government departments.

Snow in the suburbs
A whole new world

The last few weeks have been challenging on the weather front – in particular on the East Coast.  It would have to be said that our local government/admin/transport has failed miserably and consistently in addressing the weather challenges.  To see major roads not being cleared each night is pretty depressing – be it shortage of money to pay the overtime, of trucks to clear the snow/slush, of salt to treat the roads, or poor planning/management and execution.  But there is a real cost – most likely including loss of life – because of this repeated failure.

Katie Taylor, Graeme McDowell, Tipperary hurlers, U23 cross country runners and many more – great memories and inspiration in a difficult year and looking forward to challenging years.

There was my short break with my wife in Budapest – what a marvellous city and such hospitable people.  But then we had the fun courtesy of Volcanic Ash – our four day trip home was quite luxurious by comparison with the hardship experienced by others.

Best book I read was the 10th anniversary edition of The Cluetrain Manifesto.  Also often found myself returning to ideas from The Power of Pull.

And Wikileaks has caught the imagination as the year closes out.  I was not very positively disposed to Mr Assange when this began – but the overreaction from certain quarters is not doing much to reinforce my doubts.  I think we all need to reflect a little on this.  Some of the ideas referenced by Clay Shirky in Here Comes Everybody and by Don Tapscott in Macrowikinomics are playing out in front of us.

All in all looking forward to the break – a chance to enjoy some of the best things in Ireland – company, craic, ceol, food, literature, scenery, catching up with the visiting diaspora…and time to do some dreaming.  Because we all need to use our imaginations and our creativity in order to ensure that we do beat our targets next year – be that winning a major, winning a football championship, keeping a job, hiring a new employee, starting a new business, teaching a student, helping someone.

Enhanced by Zemanta

Another voice for semantics

Semantics has a key role to play in facilitating conversations on the internet

Just been reading the 10th Anniversary edition of The Cluetrain Manifesto.  In his chapter ‘but how does it taste?’ Rick Levine focuses on the changes in participation – through blogging, social networks and participation in ecommerce sites (customer reviews etc.).  However, he references the walls between his LinkedIn, Facebook and phone universes.  I like his demand: ‘We need to be more fanatical in our elimination of conversational friction’.

This very much speaks to the Cluetrain Manifesto – that the Internet is all about conversations.  And effectively Levine is making the point that semantics has a role to play in facilitating this.

How we, the public, can help with linked open data

Promote, persuade, reward open data initiatives by government

Excellent piece by Tom Steinberg pointing out what we, the potential consumers of data, can do to encourage government to provide the data.  One of his key messages actually covers off the Wikileaks-type risks – that when we do see any government body about to release anything which may undermine privacy we should draw it to their attention.

Have some concerns that some of what I have seen in Ireland on this subject is effectively encouraging government departments to release data so that we can ‘bash’ them.  This is completely pointless.

I think the real point is that there are masses of potentially useful data – which cannot be exploited while buried in archives or in pdf files.  We have not even begun to imagine the value of some of this data – when cross linked, correlated with all sorts of other data.

Thanks for taking the time to put the piece together, Tom.

Peace time/ war time – need to tackle lots of data

How to process the ever increasing volumes of data – in peace time and war time

Interesting piece in the Economist, under Artificial Intelligence, dealing with different ways of processing increasing amounts of data in war time or in disaster situations such as earthquakes.  Author reminds us of the sheer volume and depth of information being gathered through sensors – and the requirement to process this using technology (because of the volumes).  This data may be gathered through the use of drones etc. in a military situation or through crowd sourcing in a disaster situation.

Depending on one’s perspective this may be seen as a positive or as yet another example of the surveillance society.

Newspapers using semantic web to be more relevant, more useful

Linked open data from The Guardian

This is a good story re developments at The Guardian newspaper in terms of using semantics – it can only increase the relevance of The Guardian to a wider group of people – and increase (widely) the referencing of journalism produced by The Guardian.  As a group they are also making their contribution to the linked open data movement.
