Simon Cox, research scientist in environmental informatics, says that scientific vocabulary, including symbols and measurements, needs to be more consistent.
You know what it’s like when you travel interstate and you want to order a beer – a schooner, a middy, a pot or a pint?
It can be awkward, confusing and often disappointing.
Simon Cox, research scientist in environmental informatics, believes the same problem applies to science disciplines across the globe and risks getting in the way of good science.
Take ‘N’ – what does it mean to you? Is it nitrogen, neutral or north? Simon gave a keynote recently at SciDataCon – an international conference seeking to advance the frontiers of data in all areas of research.
“This means addressing a range of fundamental and urgent issues around the ‘Data Revolution’ and the recent data-driven transformation of research and the responses to these issues in the conduct of research,” Simon says.
According to Simon, information scientists are what librarians used to be until they saw the writing on the wall. The stacks have thinned, libraries have shrunk and they’ve become information managers – keywords, citations, and catalogues are still their stuff but now it’s all via the internet. And the growth in the exchange of information and ideas has been exponential.
It’s estimated that society generates more information in 24 hours than was generated between the birth of civilization and 2003. Along the way, we’ve gone from the cozy Dewey classification system to the DOI – Digital Object Identifier – in the hope of keeping track of all the books, articles and data sets.
“The big thing in public interest information, in science in particular, is not losing track of data sets. As the quantity of data increases exponentially we’ve got to ensure the availability of quality, transparent and well-managed data sets,” says Simon.
That’s why the DOI system is so important, but it’s also why you want to be comparing apples with apples.
Conformity in ‘vocabulary’ across sciences – and that could mean symbols as well as keywords, abbreviations, concepts and measures – has been slow to catch up with the IoT (internet of things).
“We assume symbols and keywords have some sort of shared meaning, at least in a given community, but the reality is much less systematic,” says Simon.
“Symbols and abbreviations with no widely used consistent meaning are often used by researchers when creating data. Popular terms describing volume can mean entirely different things in different countries.”
Like the proverbial Mr Creosote wafer, it seems such a small problem – just one divergent symbol here, a small differential in measurement there. But it’s a burgeoning problem for scientists if they want to retain meaning in the exchange within and across disciplines, over languages, borders, agencies and governments.
There’s a solution: it’s all vocabularies.
There need to be common, shared vocabularies in areas like units of measure, observable properties, analytes, chemical substances, taxa, units of measure, instruments, sensors, platforms, the geologic timescale and keywords. Oh, and did we mention……units of measure?
Simon thinks it’s time that the International Council for Science and its discipline unions, from the IUGS (International Union of Geological Sciences) to the IUPAC (International Union of Pure and Applied Chemistry) and the other 25-plus unions, take charge.
The International Commission on Stratigraphy was an early and courageous advocate of shared vocabulary. According to their website:
The Commission is the largest and oldest constituent scientific body in the International Union of Geological Sciences (IUGS). Its primary objective is to precisely define global units (systems, series, and stages) of the International Chronostratigraphic Chart that, in turn, are the basis for the units (periods, epochs, and age) of the International Geologic Time Scale; thus setting global standards for the fundamental scale for expressing the history of the Earth.
However, what they don’t provide is a unique web address (URL or URI) for each unit or boundary. In the web information context, this means the job is only half done.
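To see why a web address per term matters, here is a minimal Python sketch, not anything Simon or the Commission has published. The example.org addresses, the record fields and the function name are all invented for illustration. It shows two data records that both use the symbol ‘N’ but point at different identifiers, so software can tell them apart without guessing.

```python
# Hypothetical vocabulary: each concept gets exactly one web address (URI).
# The example.org addresses are placeholders, not real published identifiers.
VOCAB = {
    "https://vocab.example.org/element/nitrogen": "nitrogen",
    "https://vocab.example.org/direction/north": "north",
    "https://vocab.example.org/charge/neutral": "neutral",
}


def same_concept(record_a: dict, record_b: dict) -> bool:
    """Two records mean the same thing only if their URIs match,
    regardless of which symbol each author happened to write."""
    return record_a["uri"] == record_b["uri"]


# Two made-up data records that share a symbol but not a meaning.
soil_sample = {"symbol": "N", "uri": "https://vocab.example.org/element/nitrogen"}
wind_reading = {"symbol": "N", "uri": "https://vocab.example.org/direction/north"}

if __name__ == "__main__":
    print(soil_sample["symbol"] == wind_reading["symbol"])  # symbols collide
    print(same_concept(soil_sample, wind_reading))          # identifiers do not
    print(VOCAB[soil_sample["uri"]], "vs", VOCAB[wind_reading["uri"]])
```

Comparing identifiers rather than symbols is the design choice here: the symbol stays free for human readers, while the address carries the agreed meaning.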
Simon says the technical job is not insurmountable, but the real challenge is the social one – getting science communities to collaborate and take on the challenge.
(Oh, and when at the pub, if all else fails, go the craft – invariably, it comes in a bottle.)
14th October 2016 at 3:38 pm
There are links to the slides and also to a video of the presentation on the conference site http://www.scidatacon.org/site/opening-keynote/ (skip to 29’15” in the video).
13th October 2016 at 4:40 pm
Consider how changes to such a ‘controlled vocabulary’ over time could be handled.
11th October 2016 at 6:17 pm
This piece appears to have missed its April 1 deadline.