arXiv — the world's largest scientific preprint repository — now bans researchers for one year if they submit papers containing AI-generated errors: hallucinated citations, fabricated results, or leftover AI writing instructions. A Columbia University study published in The Lancet found that fake citations in scientific literature rose twelvefold between 2023 and early 2026. This is not only a graduate-level problem. Here is what every college-bound and current student needs to understand.

If you've used an AI tool to help research a paper and didn't check whether the citations it generated were real, this story explains what happens when someone doesn't.

arXiv, the open-access repository used by researchers across physics, mathematics, computer science, biology, and economics, announced a new enforcement policy: researchers who submit papers with obvious AI-generated mistakes face a one-year ban from the platform.[1]

Research integrity experts have called the move "welcome but unenforceable." That tension tells you something about where academic institutions are heading.

How Big Is the Problem?

A Columbia University research team published findings in The Lancet in May 2026 after auditing 2.5 million biomedical papers and 126 million references indexed on PubMed Central.[2]

Their data showed a sharp acceleration:

  • In 2023, roughly one in 2,828 papers contained at least one fabricated reference
  • By 2025, the rate had reached one in 458
  • In the first seven weeks of 2026, it climbed to one in 277

The study describes this as a twelvefold increase in fabricated citations over approximately three years. Separate research estimated that between 30 and 69 percent of AI-generated references in biomedical contexts are hallucinated, meaning the paper, journal, or author simply does not exist.[2]
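For readers who want to check the arithmetic, here is a minimal sketch that converts the "one in N papers" figures quoted above into percentages and fold increases. The function name is ours, chosen for illustration. Note that the per-paper rates on their own work out to roughly a tenfold rise from 2023 to early 2026, which suggests the twelvefold figure rests on a different measure than the per-paper rates listed here; the sources cited in this article do not say which.

```python
def fold_increase(baseline_one_in, later_one_in):
    """How many times higher the later rate is, given two 'one in N' figures."""
    return baseline_one_in / later_one_in

# "One in N papers" figures quoted above.
one_in = {"2023": 2828, "2025": 458, "early 2026": 277}

for period, n in one_in.items():
    share_pct = 100 / n  # percent of papers with at least one fabricated reference
    print(f"{period}: {share_pct:.3f}% of papers, "
          f"{fold_increase(one_in['2023'], n):.1f}x the 2023 rate")

# Prints:
# 2023: 0.035% of papers, 1.0x the 2023 rate
# 2025: 0.218% of papers, 6.2x the 2023 rate
# early 2026: 0.361% of papers, 10.2x the 2023 rate
```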

AI tools generate convincing-looking citations with plausible author names, legitimate-sounding journal titles, and real-looking DOIs. Most of these citations lead nowhere: the link returns a 404 error, or a database search finds no matching paper.

What arXiv's Ban Actually Covers

The policy targets papers with "obvious" AI errors, not papers that used AI assistance generally. The specific examples: hallucinated references (citations to papers that do not exist), fabricated data or results, and documents that still contain leftover AI instructions, such as "add a conclusion here" or "cite three papers supporting this claim."[1]

The distinction matters for students. Using AI to brainstorm, outline, or draft text is a separate question from submitting a citation to a paper that does not exist. What arXiv is banning is the latter — fabricating a source, intentionally or not.

Most AI writing tools — including general-purpose chatbots — will generate plausible-sounding citations that do not exist. Before submitting any paper, verify every citation manually against a real database: PubMed, Google Scholar, or your library catalog. A cited paper that cannot be found is academic fraud, whether or not you knew the AI invented it.

Why This Reaches Undergrads Too

arXiv is primarily a research platform for graduate students and faculty. But the same AI citation problem exists at every level of academic writing, including undergraduate coursework.

Professors have learned to recognize the pattern: fluent, well-organized prose paired with footnotes that lead nowhere — journal articles from the right-sounding publication, with the right-sounding author, that return a search error.

Several universities have updated their academic integrity policies to address this directly. Our coverage of SUNY's AI policy across 64 campuses shows how broadly institutions are responding. The common thread: AI-fabricated citations are treated as plagiarism, not a gray-area mistake.

This connects to a wider reckoning around AI in academic settings. The admissions essay shift toward oral verification is one example. The Harvard faculty vote to cap A grades reflects the same broader pressure on academic standards. Institutions are responding to AI not by banning it, but by raising the cost of misuse.

The safest approach to AI in academic writing: use AI tools to understand topics, identify search terms, and brainstorm structure. Find real sources yourself. Treat any citation an AI tool suggests as a starting point for a search, not a verified reference. If you cannot locate the source independently in an academic database, do not cite it.
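For students comfortable with a little scripting, part of that check can be automated as a first pass. The sketch below is an illustration, not an official tool: it pulls DOI-like strings out of a reference list and asks the doi.org handle lookup whether each one is registered. The function names, the sample reference, and the choice of that lookup endpoint are our assumptions, and the regular expression only approximates the usual DOI shape. A registered DOI shows that some paper exists at that identifier, not that it is the paper your citation describes, so a manual look at the title and authors is still needed.

```python
import re
import urllib.error
import urllib.parse
import urllib.request

# Approximates the common "10.<registrant>/<suffix>" DOI shape (an assumption).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_dois(reference_text):
    """Return DOI-like strings found in the text, in order, without duplicates."""
    found = []
    for match in DOI_PATTERN.findall(reference_text):
        doi = match.rstrip(".,;")  # drop sentence punctuation glued to the DOI
        if doi not in found:
            found.append(doi)
    return found


def handle_lookup_url(doi):
    """Build the lookup URL (assumed endpoint: the doi.org handle lookup)."""
    return "https://doi.org/api/handles/" + urllib.parse.quote(doi, safe="/")


def doi_is_registered(doi, timeout=10):
    """True if the lookup succeeds, False on a 404. Requires network access."""
    try:
        with urllib.request.urlopen(handle_lookup_url(doi), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # rate limits or outages prove nothing either way


def check_references(reference_text, lookup=doi_is_registered):
    """Map each DOI found in the text to the result of the lookup function."""
    return {doi: lookup(doi) for doi in extract_dois(reference_text)}


# Hypothetical reference, used only to show the input format.
SAMPLE = "Doe, J. (2024). A made-up paper. https://doi.org/10.1234/example.5678."
```

Calling `check_references(SAMPLE)` performs one network lookup per DOI and returns a dictionary such as `{"10.1234/example.5678": False}` when the identifier is not registered. References without a DOI are skipped entirely, so they still have to be searched by hand in PubMed, Google Scholar, or a library catalog.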

The Signal Beyond arXiv

The arXiv ban is not an isolated policy. EY retracted a cybersecurity report after discovering AI hallucinations. Deloitte faced similar scrutiny. Machine learning conferences including NeurIPS and ICLR have flagged AI-generated submissions with fabricated citations.[1]

If professional firms and elite research conferences are catching this problem, colleges are watching and adapting. The question for students is not whether institutions will continue tightening AI policies — they will — but whether your own habits are ahead of or behind that curve.

For students considering research-intensive fields, see our coverage of how AI is reshaping the careers students are choosing. And for a look at which majors are positioned for long-term stability regardless of AI disruption, see best majors for job security.

The practical bottom line: citation integrity is not a formality. It is the practice that makes the rest of academic knowledge usable — and institutions are now enforcing it directly.


Footnotes

  1. Nature. (2026, May). Researchers who use hallucinated references to face arXiv ban. Nature. https://www.nature.com/articles/d41586-026-01595-5

  2. Phys.org. (2026, May). AI-generated fake citations are flooding scientific literature across publications, scientists warn. Phys.org. https://phys.org/news/2026-05-ai-generated-fake-citations-scientific.html