Thursday saw a meeting on ‘Data Matters: Making the Most of Publicly-Funded Research Data’, organised by the Ministry of Research, Science and Technology. The event was tweeted under the #ResearchDataMatters hashtag on Twitter, and I wrote my notes on my FriendFeed page.
The day consisted of a number of topical talks (all of them great) and a couple of brainstorming sessions at the individual tables. Julian Carver, who moderated the event, did a wonderful job keeping us busy while sticking to the time schedule. It was indeed a great day filled with new ideas and, more importantly, new solutions.
It was clear that the room was filled with the vibe that opening the research data was not only important but also the direction in which we should move. The arguments in favour of this happening are quite compelling, and New Zealand can look inwards and abroad to find support for that position. There were also great examples of what New Zealand is doing in that respect, and that is also encouraging.
The central theme that I think emerged from the day was that the question about sharing data has moved from the ‘if’ to the ‘how’ domain. The how is not an easy issue to solve, and it occupies the time and thoughts of many advocates of open data. I think that these issues can be grouped into three broad categories: ethical, cultural, and those related to archiving.
In a way, the ethical issues are the ones that are relatively simpler to solve, and they probably encompass a narrow area of research primarily associated with health (or other human) data. One of the concerns raised was that ethical approval and consent around the gathering of health data are bound to specific studies, which limits the ‘use’ of the data. A second concern is associated with privacy. I see these as relatively minor, since there are protocols in place for privacy, and ‘use’ can be redefined in the consent forms.
Cultural issues in the scientific community are a slightly higher hurdle to overcome, because they require two things: a ‘buy in’ from the research community and a (I think) rather profound behavioural change that makes data archiving the default. There are heaps of issues around this, and I will leave it at that and come back to it in another post.
There was a general consensus that data should be shared. As Penny Carnaby said, if we invest in something because we think it is important, then we should also be thinking about how to preserve that knowledge. Or, what is the point of creating stuff if you then go ahead and delete it?
I also had the feeling that there was a general consensus that, ultimately, it is not just about putting data on the web. Data is only useful if it can be discovered, and only as useful as it is easy to re-purpose. But making data available in a meaningful, reusable way is hard to do. Here is where my brain explodes, and where most of the talk centred on the day.
There were a few things, however, that stuck with me and kept floating in my head as I took my flight back to Auckland.
One was a suggestion brought up by Andrew Treloar from the Australian National Data Service, about the need to make the data a ‘primary object’. We researchers tend to think of the ‘paper’ as the end product, but he suggests that the data should sit at the same level in the hierarchy. He suggests that data sets should be given a DOI, in the same way that manuscripts are, and this has several advantages. Not only does the data itself become a primary object, but the mechanisms for linking DOIs to each other and tracking citations between objects are already in place. DOIs have a further advantage, which is that attribution to the original source is inevitable. This idea solves, at least in the interim, some complex issues around data sharing.
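To make the DOI idea a little more concrete, here is a minimal Python sketch of how treating data sets and papers as the same kind of object lets a single index track citations between them. The class name `CitationIndex` and the identifiers are made up for illustration (they are not real DOIs), and this is only my own toy version, not how ANDS or any DOI registry actually works.

```python
from collections import defaultdict


class CitationIndex:
    """Toy index of 'who cites what', keyed by identifier strings."""

    def __init__(self):
        # Maps a cited identifier to the set of identifiers citing it.
        self._cited_by = defaultdict(set)

    def add_citation(self, citing, cited):
        """Record that the object `citing` refers to the object `cited`."""
        self._cited_by[cited].add(citing)

    def cited_by(self, identifier):
        """Return a sorted list of everything that cites `identifier`."""
        return sorted(self._cited_by.get(identifier, set()))


# Placeholder identifiers, invented for this example.
DATASET = "doi:example/dataset-1"
PAPER_A = "doi:example/paper-a"
PAPER_B = "doi:example/paper-b"

index = CitationIndex()
index.add_citation(PAPER_A, DATASET)  # the original paper cites its own data
index.add_citation(PAPER_B, DATASET)  # a later paper re-uses the same data

citations = index.cited_by(DATASET)
```

Because the data set has its own identifier, the second paper's re-use shows up as a citation of the data itself rather than only of the first paper.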
A second point, also brought up by Andrew Treloar, is that open data will probably be used to answer questions that are different from those for which the data was generated. This means that we researchers need to think of the description of the data beyond its original intention, to facilitate re-purposing. And this is difficult, because how can I know what details will be needed when the question has yet to be posed? The minimum requirement would be to ensure that the data is properly described, at least in terms of its origins and the steps through which it was obtained.
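As a rough illustration of that minimum requirement, a description of a data set could record where it came from and each step it went through. The sketch below is my own assumption of what such a record might hold; the field names, the example data set and the tools mentioned are all hypothetical, and it does not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingStep:
    description: str  # what was done to the data
    tool: str         # what it was done with


@dataclass
class DatasetRecord:
    title: str
    collected_by: str
    original_question: str  # why the data was gathered in the first place
    steps: list = field(default_factory=list)

    def provenance(self):
        """Return a plain-text summary of origin and processing steps."""
        lines = [f"{self.title} (collected by {self.collected_by})"]
        for number, step in enumerate(self.steps, start=1):
            lines.append(f"{number}. {step.description} [{step.tool}]")
        return "\n".join(lines)


# Hypothetical example record.
record = DatasetRecord(
    title="Stream temperatures",
    collected_by="Example Lab",
    original_question="Do temperatures differ between two catchments?",
    steps=[
        ProcessingStep("Logged hourly readings", "field data logger"),
        ProcessingStep("Removed readings flagged as faulty", "spreadsheet"),
    ],
)

summary = record.provenance()
```

Someone asking a completely different question of the same data would at least know how the numbers were produced.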
One of the things I also really took back with me was Penny Carnaby’s description of the work that the National Library of New Zealand has been doing around archiving of digital objects. She described the work done for the National Digital Heritage Archive (you can read about it here and here). The way I understand it, this system could provide viable solutions to some of the issues surrounding data archiving.
There is obviously a lot of work to be done, but it was encouraging to be in a room filled with people willing to be honest about the challenges, yet still enthusiastic about the road ahead. I will be interested in hearing what the follow-ups of Thursday’s meeting are, in particular the position of the funding bodies that were present in the room.
Megathanks to Jonathan Hunt and Julian Carver, who made it possible for me to be there.
Imagine you drive into a motel in Gatlinburg, TN, and see behind an open room door two guys setting up cameras pointing at the beds while two young women peek from the parking lot. Well, if it was in the mid-’90s, it might have been Drs Moiseff and Copeland setting up the equipment before venturing into Elkmont in the Smoky Mountains to study the local fireflies. (And one of the two women would have been me.)
Andy Moiseff and Jon Copeland started studying the population of fireflies in the Smoky Mountains National Park after learning from Lynn Faust, who had grown up in the area, that they produced their flashes in a synchronous pattern.
In the species they are studying (Photinus carolinus), the males produce a series, or burst, of rhythmic flashes that is followed by a ‘quiet period’. But what is particularly interesting about this species is that nearby males do this in synchrony with each other. If you stand in the dark forest, what you see is groups of lightning bugs beating their lights together in the dark night, pumping light into the forest in one of nature’s most beautiful displays.
Females flash in a slightly different manner and, as far as I know, they don’t do it synchronously, either with other females or with the males. One interesting thing in Elkmont is that there are several species of fireflies, and you can pretty much tell them apart by their flashing patterns. But as useful as this is for us biologists (since it avoids having to go through extensive testing for species determination), the question still remained of whether the flashing patterns played a biological role.
And this is what Moiseff and Copeland addressed in their latest study, published in Science. They put females in a room where LEDs controlled by a computer simulated individual male fireflies. The LEDs were made to flash with different degrees of synchronisation, and the researchers looked at the responses of the females. They found that while the females responded to synchronous flashes of the LEDs, they really didn’t seem to respond when the flashes were not synchronous. What is more, they responded better to many LEDs than to a single one. What this means is that if you are a male of Photinus carolinus, you had better play nice with your mates if you want to get the girl.
What *I* want to know is how this behaviour is wired in the brain. At first glance, this seems like a rather complex behaviour, but in essence all that it seems to require is a series of if/then computations, which should not be too hard to build (at least not from an ‘electronic circuit’ point of view). But Bjoern Brembs reminded me of a basic concept in neuroscience: brains are evolved circuits, not engineered circuits. So, Andy and Jon, how *do* they do it?
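Just to show what I mean by the ‘engineered’ version, here is a toy Python sketch in which two if/then rules are enough to pull a group of simulated fireflies into step: if my timer runs out, then I flash; if anyone flashes, then everyone restarts their timer. The function name, the timer values and the rules themselves are my own assumptions for illustration. This is not the model from the paper, and it makes no claim about how the real insects do it.

```python
def simulate(start_timers, period, ticks):
    """Return, for each tick, the list of fireflies that flashed."""
    timers = list(start_timers)
    history = []
    for _ in range(ticks):
        # Every firefly's internal timer advances by one tick.
        timers = [t + 1 for t in timers]
        # Rule 1: if my timer has run out, then I flash.
        flashed = [i for i, t in enumerate(timers) if t >= period]
        # Rule 2: if anyone flashed, then everyone restarts their timer.
        if flashed:
            timers = [0] * len(timers)
        history.append(flashed)
    return history


# Three fireflies that start out of step with each other.
history = simulate([0, 3, 7], period=10, ticks=25)
flashes = [group for group in history if group]
```

In this run the firefly that started furthest ahead flashes alone once, and from then on all three flash together. An engineer's circuit can get away with something this crude; whether an evolved one does anything similar is exactly the question.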
Original article: Moiseff, A., & Copeland, J. (2010). Firefly Synchrony: A Behavioral Strategy to Minimize Visual Clutter. Science, 329(5988), 181. DOI: 10.1126/science.1190421