Why publish your data?

Attended the ‘Opening Up Government Data’ day in DERI, in Galway, today.  There were some interesting presentations and demonstrations of tools, such as Google Public Data Explorer and Simile Widgets, that enable you to work with linked open data.

In one of our breakout sessions we discussed why companies would choose to share their data – i.e. publish it in formats which are easy for others to consume (e.g. CSV, not PDF).  Granted, many businesses have web sites describing their offerings, providing some background on the company, potentially inviting comment or queries and answering some Frequently Asked Questions.  But very few offer much data arising from their product or market research, their production statistics or their sales campaigns.  In general they would regard this information as confidential and as constituting some element of their ‘competitive advantage’.

Today’s seminar really focused on government publishing data which, it might be argued, belongs to the citizens.  However, understandably, there was plenty of discussion around this in terms of the effort required to publish it, the potential ownership of the data, maintaining the data going forward, etc.  There was some discussion as to whether there should be a charge for this data on the basis of the costs associated and the potential for companies to generate some commercial benefit.  Plenty of solid reasons were put forward for publishing the data – transparency, accountability, etc.

Much of this debate brought me back to thinking about privacy, Mark Zuckerberg’s general approach with Facebook, and differing attitudes to publishing personal information on sites such as Facebook, Twitter, Google+, Foursquare, etc.  Why do some people choose to share their views on politics, on the economy, their location in a restaurant – whereas others want nothing shared?

I think this question ‘Why publish your data?’ can be addressed in all of these contexts – individuals, government and corporates.  And the answer is – because the person or the organisation sees some value in its publication.  At the personal level the social networks and smart phones have made publishing data so much easier.  What we are now seeing emerge for business and government is a range of platforms and tools which make all of this a lot easier for government and companies.  And a little like individuals – I am not sure that anyone has really worked out where all of this is taking us.

Some newspapers have figured it out – by making their content available, by marking it up semantically, they become more relevant to more entities for longer.  But that’s a little different to publishing data which is a product of lots of research completed at your own cost – on the basis that it’s good for society or that if I do it then someone else will publish other data in exchange which I can exploit.

 

How we, the public, can help with linked open data

Excellent piece by Tom Steinberg pointing out what we, the potential consumers of data, can do to encourage government to provide the data.  One of his key messages actually covers off the Wikileaks-type risks – that when we do see any government body about to release anything which may undermine privacy we should draw it to their attention.

Have some concerns that some of what I have seen in Ireland on this subject is effectively encouraging government departments to release data so that we can ‘bash’ them.  This is completely pointless.

I think the real point is that there are masses of potentially useful data – which cannot be exploited while buried in archives or in PDF files.  We have not even begun to imagine the value of some of this data – when cross-linked and correlated with all sorts of other data.

Thanks for taking the time to put the piece together, Tom.

What does Wikileaks mean for open data initiatives?

The most recent Wikileaks release of approx. 260,000 documents has the attention of governments across the globe.  There is much gnashing of teeth – along the lines of ‘see what happens when you share data’.  And there are many calls for less sharing of information.

This is a matter of national (and international) security when sensitive, confidential information, never intended for public consumption, is leaked.  While some of the titbits will be of interest to the general public, the more serious issues arise where national security or the security of individuals is put at risk.

Has this anything to do with the move towards encouraging governments and/or corporates to publish more data in formats in which people can use the data?  In principle, no.  In practice it may have some impact.

Obviously there is always a risk that someone may leak confidential or secure information.  Security clearance for those handling the information, monitoring of individual behaviour, restrictions on removal of data from secure platforms, etc – are all key measures in safeguarding such information.

This is quite different from a government department sharing data with the public where the data is of public interest e.g. analysis of spend on education by region or by age group, analysis of crime statistics by city or town.  But there are those who will look to confuse the two – where greater accountability is feared.

One final thought re open data – I am not sure that in all situations people have thought through the potential implications of publishing lots of data, i.e. the ability of those receiving the data to cross-reference and correlate that data.  In doing so these data analysts may point out trends that have gone unnoticed to date – while the data has resided in separate silos.
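
As a rough illustration of how little effort that cross-referencing takes once data is published in a usable format, here is a minimal Python sketch.  It assumes two hypothetical CSV extracts (education spend and crime incidents, both by region); the column names and every figure in it are invented for the example and are not real statistics.

```python
import csv
import io

# Invented figures for illustration only; not real statistics.
EDUCATION_CSV = """region,spend_eur,pupils
North,1200000,400
South,900000,450
West,500000,100
"""

CRIME_CSV = """region,incidents,population
North,120,60000
South,300,50000
West,20,10000
"""


def load_by_region(text):
    """Parse CSV text into a dict of rows keyed by region name."""
    return {row["region"]: row for row in csv.DictReader(io.StringIO(text))}


def cross_reference(education, crime):
    """Join the two datasets on region and derive comparable per-head rates."""
    combined = []
    # Only regions present in both datasets can be compared.
    for region in sorted(education.keys() & crime.keys()):
        edu, cri = education[region], crime[region]
        combined.append({
            "region": region,
            "spend_per_pupil": int(edu["spend_eur"]) / int(edu["pupils"]),
            "incidents_per_1000": 1000 * int(cri["incidents"]) / int(cri["population"]),
        })
    return combined


rows = cross_reference(load_by_region(EDUCATION_CSV), load_by_region(CRIME_CSV))
for row in rows:
    print(f'{row["region"]}: {row["spend_per_pupil"]:.0f} per pupil, '
          f'{row["incidents_per_1000"]:.1f} incidents per 1,000 people')
```

The join itself is a handful of lines; the hard part in practice is getting both datasets published with a shared key (here, the region name) in the first place.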

Change of UK government not slowing down data.gov initiative

Interesting to read Shadbolt’s take on the change of government in the UK, in the context of Linked Open Data:

This is another area in which Berners-Lee and Shadbolt are highly influential, having overseen the design and implementation of the UK’s open data portal, data.gov.uk. “The continuity of thinking on open data as we’ve transitioned between governments has been remarkable,” says Shadbolt. “In a parliamentary democracy, it’s very difficult to argue that the public doesn’t have a right to government data,” he adds.

Perhaps the next Irish Government may be able to apply some pressure to increase publication of DATA which belongs to you and me in a format in which we can actually do something useful with it.

Mind you I am reminded of previous discussion about the need for a government CIO and/or CTO in Ireland.


Peace time/ war time – need to tackle lots of data

Interesting piece in the Economist, under Artificial Intelligence, dealing with different ways of processing increasing amounts of data in war time or disaster situations such as earthquakes.  The author reminds us of the sheer volume and depth of information being gathered through sensors – and the requirement to process this using technology (because of the volumes).  This data may be gathered through the use of drones etc. in a military situation or through crowd sourcing in a disaster situation.

Depending on one’s perspective this may be seen as a positive or as yet another example of the surveillance society.

Cleaning up dirty data

Just came across this – Google Refine – nice example of a product for cleaning up inconsistencies in data.  Unfortunately part of the linked open data movement is dealing with the realities of inconsistencies in data.

There are lots of products out there to assist in data cleansing efforts.  I thought this video gave a nice, practical example of the types of issues and how they can be addressed.  (Brought to my attention by @BarbaraStarr on Twitter).
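
To give a flavour of the kind of inconsistency involved, here is a minimal Python sketch that groups different spellings of the same value by reducing each one to a normalised key.  It is only similar in spirit to the clustering that cleansing tools offer, not a description of how Google Refine itself works, and the function names and sample values are invented for the example.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a key that ignores case, punctuation and word order."""
    no_punctuation = str.maketrans("", "", string.punctuation)
    cleaned = value.strip().lower().translate(no_punctuation)
    # Sorting the unique words makes "Education, Dept of" match "Dept of Education".
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with several spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Invented sample values for illustration only.
names = ["Dept. of Education", "dept of education", "Education, Dept of", "Dept of Health"]
clusters = cluster(names)
for key, variants in clusters.items():
    print(key, "->", variants)
```

A person still has to decide which spelling in each group is the right one; the sketch only finds the candidates.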

Pay to see full names for 3rd-degree connections on LinkedIn

So the pricing model has changed at LinkedIn.  You may have noticed in searching that you are coming across people whose full name is hidden.  That’s the deal now – if you want to see these names you pay for the privilege.

Not that surprising really that a private network should look to make money from its database.  They must feel that they now have a sufficient footprint (heading for 100m members) to up the ante.  Why would they not go the whole hog and charge everyone?

All of this brings us back to the discussion around open standards, open networks, FOAF, semantics, etc.  And indeed David Siegel’s ‘The Power of Pull’ and his idea about the ‘personal information locker’.

Interesting to see how this plays out.  Will the LinkedIn changes result in slower growth in the network – but greater revenues to the company?  Or will this create the opportunity for another player to up their growth rate in the marketplace?

Ongoing commentary re privacy and social networks

The editorial in this morning’s Irish Times returns to the subject of privacy and the threat posed by social networks:

For some, new technologies raise troubling questions about Orwellian surveillance and the dangerous blurring of the public and private spheres. Most of these businesses, after all, are based on the premise that you, the user, are the product, with your personal data mined for the benefit of advertisers and other commercial interests. Such concerns are legitimate, but they are not the whole story; new technologies also offer potential for positive social change, greater accountability and transparency. They require governments and organisations to engage in more meaningful ways with their citizens and clients, and they can harness the power of the crowd to make sure that this actually happens.

I am reminded of comments previously made by analysts in this sector:  No personalisation without transparency.  It is a question of balance between what you are willing to share in order to receive relevant content/ suggestions.  Unfortunately ‘willing to share’ is often replaced by ‘inadvertent sharing’.

Interesting to see the editor balancing the threats posed with the potential benefits in terms of greater transparency and accountability.  I think the most practical step the Irish Government could take in this respect would be to participate actively in the growing movement of publishing data using linked open data formats.

Ireland and linked open data

What is the timeline for the Irish government in terms of linked open data? When you read newspapers full of stories about TD expenses, FAS waste, the objectives of An Bord Snip – surely publishing data in meaningful, useful formats is part of the way forward. And it must be just one element of being a smart economy. And of promoting a level of transparency (and accountability) which we crave as a society.

When I read pieces like Government Should Do its Own Data Homework by Jeni Tennison it just reminds me of the progress we need to make here in Ireland. And we have the expertise – in the IT community and, in particular, in DERI.

Perhaps there is an initiative – but I do not remember reading anything about a timeline.