Hallucinated Citations and Related Problems

Many of my posts are based on applying research to personal or classroom practice. I am retired, so I am no longer involved in experiments myself, but I now spend time reading both new and older published studies on a topic that interests me. 

My change in location and social circles has led to some adjustments. I can’t walk across the street to a university library, though I still have access to online resources. Without students and colleagues, my interests are now far more self-driven and self-perpetuated. I have used Google Scholar since it first became available, but the emergence of newer AI-supported tools for investigating the literature has been of great personal value.

This shift in how I locate the articles I read has exposed me to a strange phenomenon. I get excited when I find a relevant reference I have missed, particularly when it comes from an influential, productive researcher I follow. The title of this discovery sounds perfect and seems to promise just the type of evidence I have been looking for. I access my library’s online resources, call up the appropriate journal, and enter the title from the citation. The article isn’t there. Maybe the volume or the year of publication isn’t correct. I search for the title in Google Scholar and related articles appear, but there is no match for the specific paper I want. The citation that generated my excitement is very likely an AI hallucination.

I first wrote about this issue several years ago, when AI was less sophisticated and the problem was probably more common. I include the link to that earlier post because it contains multiple examples of what such hallucinations look like. I decided to revisit the topic after reading a recent Nature article on the issue. The article did a good job of explaining why such hallucinations seem so real. It also raised questions about how hallucinated citations could appear in newly published research, and how and why scholars might end up citing and building arguments on literature that does not exist.

The structure of a citation and why it results in hallucinations

The Nature study included a visual representation of a citation that I found helpful. I did not want to just cut and paste their examples so I had an AI tool develop something similar.

Think of a citation as consisting of several elements, and understand that AI does not cut and paste what it offers in response to a prompt; it generates content. Each element can be generated to look plausible on its own, so the assembled citation may describe a paper that does not exist.

  • Author may have published in this general area
  • Authors may have published together but not this paper
  • Words in the title are consistent with some of the work the author has done so are used to create the title
  • Page numbers fit the range for this journal and year but do not belong to this paper
  • DOI (digital object identifier) – does not point to anything, but is similar to other DOIs for this journal
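
A minimal sketch in Python may make this concrete. It treats a citation as a small record and checks a generated citation against a tiny catalogue of known records. Everything here is an assumption for illustration: the Citation fields, the element_report function, and the placeholder authors, titles, and journal are all made up and do not refer to real papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    authors: tuple[str, ...]
    year: int
    title: str
    journal: str


# Hypothetical catalogue of genuine records (placeholder names, not real papers).
CATALOGUE = [
    Citation(("Author A", "Author B"), 2015, "Note taking and retention", "Journal X"),
    Citation(("Author A",), 2019, "Retrieval practice in classrooms", "Journal X"),
]


def element_report(c: Citation, catalogue: list[Citation]) -> dict[str, bool]:
    """Report which elements look familiar and whether the whole record exists."""
    known_words = {w.lower() for r in catalogue for w in r.title.split()}
    return {
        # The cited authors appear together on at least one genuine record.
        "authors_known": any(set(c.authors) <= set(r.authors) for r in catalogue),
        # The journal has published genuine records.
        "journal_known": any(c.journal == r.journal for r in catalogue),
        # Every word of the title occurs in some genuine title.
        "title_words_known": all(w.lower() in known_words for w in c.title.split()),
        # The exact combination of elements is a genuine record.
        "record_exists": c in catalogue,
    }


# A generated citation assembled from familiar pieces.
fake = Citation(("Author A", "Author B"), 2019, "Note taking in classrooms", "Journal X")
report = element_report(fake, CATALOGUE)
print(report)
```

In this toy case every individual element is familiar, yet the record as a whole is not in the catalogue, which is the pattern that makes a hallucinated citation look convincing.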

Ironically, trying to have an AI tool generate a plausible citation and identify its components also resulted in hallucinations (compare the image below with the one above). I tried multiple iterations to get what I wanted, but finally, I just had the tool generate the figure without lines, then used a different app to manually add them myself. 

What are the responsibilities of an author?

How hallucinated citations appear in published work raises other serious issues. Possibly, the author who submitted the paper used a tool to build the reference list but did not then check the final product. More seriously, the author may have used AI to write sections of the paper, complete with citations, and never read the original papers.

Check your references

In my own efforts to explore relevant courses of action, I learned that many publications now rely on services that verify citation authenticity. I checked on the services and did not find anything that would be financially feasible for individuals. I did find that there are tools, some free, that will check a reference list. 
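
To give a rough idea of what checking a single reference might involve, here is a short Python sketch. It is my own illustration and not a description of how any particular service works. It assumes you have already retrieved a trusted record for the paper from a library database or similar index (that lookup is not shown), and it compares the cited fields against that record. The function name compare_citation, the dictionary keys, the 0.9 title threshold, and the sample values (including the DOIs) are all hypothetical.

```python
from difflib import SequenceMatcher


def compare_citation(cited: dict, trusted: dict, title_threshold: float = 0.9) -> list[str]:
    """Return the names of fields where the cited reference disagrees with the trusted record."""
    problems = []

    # Titles are compared loosely so that small typos do not count as a mismatch.
    ratio = SequenceMatcher(None, cited["title"].lower(), trusted["title"].lower()).ratio()
    if ratio < title_threshold:
        problems.append("title")

    # The remaining fields are compared exactly, ignoring case and surrounding spaces.
    for field in ("year", "journal", "doi"):
        cited_value = str(cited.get(field, "")).strip().lower()
        trusted_value = str(trusted.get(field, "")).strip().lower()
        if cited_value != trusted_value:
            problems.append(field)

    return problems


# Made-up example values; neither record refers to a real paper.
cited = {"title": "Note taking in classrooms", "year": 2019,
         "journal": "Journal X", "doi": "10.0000/example.123"}
trusted = {"title": "Note taking and retention", "year": 2015,
           "journal": "Journal X", "doi": "10.0000/example.045"}

problems = compare_citation(cited, trusted)
print(problems)
```

A check like this can only flag disagreements with a record you already trust. When no trusted record can be found at all, that absence is itself the warning sign.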

CiteTrue

CiteTrue is a free online tool that accepts a list of citations and checks each component for accuracy. The following image shows what this looks like. I pasted part of the list of hallucinated citations from the previous post mentioned above into the input box. The output indicated that all were inaccurate and speculated about what was incorrect.

Screenshot

Personal Comment

This is not an issue I personally worry about, as I am no longer an active researcher. I do cite sources in some of my posts when a reader cannot follow a link to the source. I admit that not all of my citations follow the APA (American Psychological Association) format. This is due to my laziness. I do read all of the papers I cite, but putting together a citation is sometimes a manual process of accurately pulling together different pieces of information from the PDF for that source. I often copy the title from the PDF, paste it into Google Scholar, and then use the citation Google provides for that source. I am unclear how Google assembles citations in its systems, but they do not always follow the most recent APA guidelines. For example, many do not include a DOI, and some list the authors in different ways. I know the titles work because that is how I find the citations.
