AI ACADEMIA
Last year, a study claimed black plastic cooking utensils contained dangerous levels of flame retardants linked to cancer.
But the alarm was false. A calculation error in the research made the chemical levels seem ten times more dangerous than they actually were.
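The slip was reportedly a single multiplication: the study compared estimated daily intake against a safe limit that had been calculated ten times too low. A quick sanity check in Python, using figures drawn from public reporting of the correction (illustrative here, not taken from the paper itself), shows how the comparison falls apart:

```python
# Recomputing the reported slip. Figures come from public coverage of
# the correction and are used here for illustration only.

intake_ng_per_day = 34_700         # estimated daily intake of the flame retardant
reference_dose_ng_per_kg = 7_000   # EPA reference dose, per kg of body weight
body_weight_kg = 60                # reference adult body weight

safe_limit = reference_dose_ng_per_kg * body_weight_kg  # 420,000 ng/day
paper_limit = 42_000               # the figure the paper actually used

print(f"Correct limit: {safe_limit:,} ng/day "
      f"-> intake is {intake_ng_per_day / safe_limit:.0%} of it")
print(f"Paper's limit: {paper_limit:,} ng/day "
      f"-> intake looked like {intake_ng_per_day / paper_limit:.0%} of it")
# Correct limit: 420,000 ng/day -> intake is 8% of it
# Paper's limit: 42,000 ng/day -> intake looked like 83% of it
```

An exposure sitting at 8% of the safe limit suddenly looked like 83% of it.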
Researchers later showed that AI could have caught this error in seconds.
This led to two AI-driven projects aimed at spotting mistakes in scientific research:
The Black Spatula Project, an open-source AI tool, has analysed 500 papers so far. Instead of making errors public, researchers are notifying affected authors directly.
YesNoError, a separate project, has scanned over 37,000 papers in two months. Unlike Black Spatula, it publicly flags possible mistakes, though many haven’t been confirmed by humans yet.
How they’re doing it
Both projects use LLMs to check papers for errors in facts, calculations, methods and references.
The AI pulls data from papers, runs multiple checks, and flags possible mistakes.
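Neither project has published the exact pipeline described here, but the basic loop is easy to sketch. The hypothetical Python below assumes the standard OpenAI client; the model choice, prompt wording, and the extract_text step are placeholders, not either project's real code:

```python
# Minimal sketch of an LLM error-checking pass (hypothetical; not the
# actual Black Spatula or YesNoError pipeline). Assumes the `openai`
# package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

CHECK_PROMPT = """You are reviewing a scientific paper for errors.
Check facts, calculations, methods and references. List each suspected
error with a one-line justification, or reply 'NO ERRORS FOUND'."""

def check_paper(paper_text: str, passes: int = 3) -> list[str]:
    """Run several independent checks and collect anything flagged."""
    flags = []
    for _ in range(passes):  # repeated runs catch errors a single pass misses
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model choice
            messages=[
                {"role": "system", "content": CHECK_PROMPT},
                {"role": "user", "content": paper_text[:100_000]},  # crude length cap
            ],
        )
        answer = response.choices[0].message.content
        if "NO ERRORS FOUND" not in answer:
            flags.append(answer)
    return flags  # every flag still needs a human reviewer

# flags = check_paper(extract_text("paper.pdf"))  # extract_text: your own PDF-to-text step
```

Running several passes mirrors the "multiple checks" step above, since a single LLM run can miss an error or invent one, which leads to the challenge below.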
But a big challenge remains: false positives, where the AI flags an error that isn't actually there.
The Black Spatula Project has a false positive rate of around 10%, so human experts still need to verify every issue it flags.
YesNoError tested its AI on 100 flagged mathematical mistakes. Of the authors who responded, all but one confirmed that the errors were real.
The problems
Despite their potential, these AI tools raise concerns.
Researchers worry about reputational damage if errors are flagged incorrectly.
Others fear the tools will flood the scientific community with minor issues like typos rather than substantive mistakes.
YesNoError’s system, for example, has already flagged many false positives.
Both projects aim to improve accuracy.
YesNoError plans to work with peer-review platforms, while Black Spatula continues collaborating with subject experts.
If these AI tools get better, they could help reduce errors in academic research.
Some PhD student out there is sweating because AI just flagged their whole thesis.