May 12, 2025, by bbztlg

Chemistry’s data black hole: how much knowledge are we losing?

PhD researcher Joe Davies explores the silent data crisis and its impact on decades of scientific progress in chemistry.

“Don’t you just hate it when you misplace your valuable data from painstaking years of publicly funded research?”

[Image: portrait of Joe Davies smiling at the camera]

Synthetic chemistry has a serious data problem. And no, it’s not one you can fix with a few extra spreadsheets. Every day, in research labs worldwide, invaluable experimental data disappear – scribbled into paper lab notebooks, locked away in drawers, occasionally disposed of, and ultimately forgotten.

The knowledge that could drive new discoveries is slipping through our fingers because chemistry has yet to embrace the digital revolution that has transformed other sciences.

In an age where data is considered the “new oil”, why is chemistry still running on coal?


From antiquarians to data stewards

“I don’t even have an Auntie Quarian”

Most academic researchers in chemistry use paper lab notebooks. After all, they are incredibly convenient tools and enable you to feel like the researchers of old. Take Marie Curie’s lab notebooks, for example. These historic and scientific treasures are stored in lead-lined boxes in Paris, where visitors must sign waivers and wear protective gear to view them. They are contaminated with radium, which has a half-life of 1,600 years. Radioactive or not, paper lab books belong in the past.

In today’s world, data should be findable, accessible, interoperable, and reusable (FAIR). But chemistry lags behind, rather than embracing the digital tools that have transformed other sciences.

How much valuable data are we losing?

“I really hope nobody learns from my past mistakes – they were just too embarrassing!”

When we post on social media, we post what we want to show off. We don’t post the time we stepped in dog poo, didn’t notice, and had to rush to clean the carpet (or at least try to) before joining a meeting on Teams.

We do the same when publishing results. We publish the things that work and that we are proud of, not all the dumb mistakes we made along the way: the ones that showed us the wrong way of doing things and so taught us the right way. These extremely valuable data, sometimes called “negative data”, are lost down the black hole when the paper lab book is stored in a dusty cupboard. One day, years down the line, someone will open it and not recognise any of the names inside, as it transitions from a scientific resource to an archaeological one.

Publishing in journals is the standard way researchers share data, but current methods fall far short of what is needed. Experimental details are often buried in PDFs, making them difficult for machines to process and challenging for researchers to extract and reuse.
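To make the contrast with a PDF concrete, here is a minimal sketch of what a machine-readable experiment record could look like, failed attempt included. The field names and every value in it are invented for illustration; this is not a published schema or any particular ELN's format.

```python
import json

# Hypothetical records with made-up values; the field names are an assumption,
# not a standard schema.
experiments = [
    {
        "id": "EXP-001",
        "reaction": "amide coupling",
        "solvent": "DMF",
        "temperature_c": 25,
        "yield_percent": 0,
        "outcome": "failed",
        "notes": "No product observed after 24 h.",
    },
    {
        "id": "EXP-002",
        "reaction": "amide coupling",
        "solvent": "2-MeTHF",
        "temperature_c": 40,
        "yield_percent": 82,
        "outcome": "succeeded",
        "notes": "Clean conversion.",
    },
]


def failed_experiments(records):
    """Return the records whose outcome is 'failed' (the "negative data")."""
    return [record for record in records if record["outcome"] == "failed"]


# Serialise to JSON text, as an export would be, then read it back.
serialised = json.dumps(experiments, indent=2)
restored = json.loads(serialised)

for record in failed_experiments(restored):
    print(record["id"], record["solvent"], record["notes"])
```

Because each detail sits in a named field, a script (or a colleague five years from now) can ask "which conditions did not work?" in one line, which is exactly the question a PDF or a paper notebook makes hard.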

Electronic Laboratory Notebooks (ELNs): a solution we’re ignoring

“ELNs have been around longer than Netflix, so why are we still buffering adoption?”

Fortunately, a solution was developed… right around the time Blockbuster started to lose popularity. Electronic Laboratory Notebooks, or ELNs, have been used in the pharmaceutical industry for over a decade now. However, barriers – both legitimate and self-inflicted – have prevented their widespread adoption within academia.

The barriers to ELN adoption in academia

Selecting an ELN can be challenging. There are a lot of options to pick from, and the objections come quickly: “what does it cost?”, “I want this one but nobody else in the department does”, “I don’t like change”, “how do I get my data out of it if I want to move to a different provider?”. These are mostly legitimate concerns. Fortunately, there are resources out there to help, such as the ELN Finder website, which lets you filter by domain, pricing, and more.

It’s unfair if data are not FAIR

“Mirror, Mirror on the wall, which is the FAIRest ELN of them all”

Getting your data out of one ELN and into another, aka interoperability (the I in FAIR data), is a challenge. Cynically (or realistically, if you want to be the cynical one), it is advantageous for a paid-for ELN to make it hard to get your data out in a format that can be loaded into another ELN, because otherwise you are free to leave and stop paying them. This risk of lock-in can make the whole decision a bit intimidating and easy to put off for another day or year. Fortunately, there are initiatives where well-meaning ELN providers have grouped together to enable interoperability between their systems. The ELN Consortium is one such initiative: it has developed an ELN file format that can be exported from one ELN and imported into another, transferring over all the data.
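For the curious, here is a rough sketch of why a shared file format matters. It assumes an exported `.eln` file is a ZIP archive containing an RO-Crate metadata file named `ro-crate-metadata.json`; check the ELN Consortium's specification for the real layout. The demo archive, its folder name, and its contents are made up so the example can run without a real export.

```python
import io
import json
import zipfile

# Assumed name of the metadata file inside the archive.
METADATA_NAME = "ro-crate-metadata.json"


def read_eln_metadata(archive):
    """Return the parsed metadata document from an .eln-style ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            # The metadata file may sit inside a top-level folder.
            if name.rsplit("/", 1)[-1] == METADATA_NAME:
                with zf.open(name) as fh:
                    return json.load(fh)
    raise FileNotFoundError(f"{METADATA_NAME} not found in archive")


def build_demo_archive():
    """Build a tiny in-memory stand-in for an export (hypothetical content)."""
    buffer = io.BytesIO()
    metadata = {"@graph": [{"@id": "./", "@type": "Dataset", "name": "Demo export"}]}
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("demo-export/" + METADATA_NAME, json.dumps(metadata))
    buffer.seek(0)
    return buffer


metadata = read_eln_metadata(build_demo_archive())
names = [entity.get("name") for entity in metadata["@graph"]]
print(names)
```

The point is not the dozen lines of Python but that nothing in them depends on which ELN wrote the file. That is what interoperability buys you.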

The future is digital – even for chemistry

Furthermore, just to reiterate, it is 2025. Students expect digital solutions, and industry expects them to use digital methods. The future is digital (interestingly, even Marie Curie’s lab books have now been digitised!), and the core message is an old, simple one: “adapt or perish”. ChatGPT and AlphaFold have demonstrated the power of large datasets: the vast volume of text on the internet and the Protein Data Bank, respectively. There are machine learning models for synthesis, but “for chemists, the AI revolution has yet to happen”, and there is no AlphaFold success story in synthesis because there is no equivalent data foundation.

[Image: cartoon of a stickman painting his chemistry set with green paint, rather than making it "environmentally green"!]

Want to make the switch? Go to AI4Green and join the green side

If you’re ready to stop losing valuable data to the black hole of paper notebooks but don’t want to spend weeks researching ELNs, try AI4Green.

I worked with other researchers at the University of Nottingham to develop an ELN called AI4Green. We take advantage of working in a digital environment by using AI and other computational tools to help you make your chemistry more sustainable. We think, and so do our users, that the ELN makes things easier by streamlining the data collection process.

* Monitor and improve your sustainability
* Automated molar calculations
* Automatic hazard code retrieval
* Integrated AI tools
* A streamlined, user-friendly lab notebook experience
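As an illustration of the kind of arithmetic that "automated molar calculations" takes off your hands, here is a minimal sketch. It is not AI4Green's code; the function names and the round numbers are invented for the example.

```python
def moles_from_mass(mass_g, molar_mass_g_per_mol):
    """Amount of substance (mol) = mass (g) / molar mass (g/mol)."""
    return mass_g / molar_mass_g_per_mol


def mass_for_equivalents(limiting_mol, equivalents, molar_mass_g_per_mol):
    """Mass (g) of a reagent needed for a given number of equivalents."""
    return limiting_mol * equivalents * molar_mass_g_per_mol


def percent_yield(product_mass_g, product_molar_mass_g_per_mol, limiting_mol):
    """Isolated mass as a percentage of the theoretical maximum mass."""
    theoretical_mass_g = limiting_mol * product_molar_mass_g_per_mol
    return 100 * product_mass_g / theoretical_mass_g


# Made-up round numbers, not real compounds.
limiting_mol = moles_from_mass(1.00, 100.0)
reagent_mass_g = mass_for_equivalents(limiting_mol, 1.2, 50.0)
yield_percent = percent_yield(1.056, 132.0, limiting_mol)

print(f"limiting reagent: {limiting_mol:.4f} mol")
print(f"reagent to weigh out: {reagent_mass_g:.3f} g")
print(f"yield: {yield_percent:.1f} %")
```

None of this is hard, but done by hand in a paper notebook it is an easy place to slip a decimal point, and the result is not searchable afterwards.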

The future of chemistry is digital. Will you embrace it, or get left behind?

You can connect with Joe on LinkedIn

Posted in Researcher Academy