Document Citations
The following is an implementation of document Q&A with citations, using Anthropic’s press release for Claude Citations as the source text. Citations is a feature meant to reduce LLM hallucinations and clarify which parts of the source text the model used to produce its answers. The content below the next page break is the source document in our example, from Anthropic. Below are example questions that can be asked about the document; when provided with the document as context, an LLM can generate answers to them.
Try it out
Select a question to view the model-generated answer using the source document. Doing so will also highlight the areas of the document that Claude returned as citations, justifying the answer. If you scroll to the bottom of an answer, you can also find links to the cited passages.
Introducing Citations on the Anthropic API
Today, we’re launching Citations, a new API feature that lets Claude ground its answers in source documents. Claude can now provide detailed references to the exact sentences and passages it uses to generate responses, leading to more verifiable, trustworthy outputs.
Citations is generally available on the Anthropic API and Google Cloud’s Vertex AI.
Trust by verification
All Claude models are trained to be trustworthy and steerable by design. Citations builds upon this foundation, addressing a specific need in AI applications: verifying the sources behind AI-generated responses.
Previously, developers relied on complex prompts that instructed Claude to include source information, often resulting in inconsistent performance and significant time investment in prompt engineering and testing. With Citations, users can now add source documents to the context window, and when querying the model, Claude automatically cites claims in its output that are inferred from those sources.
Our internal evaluations show that Claude’s built-in citation capabilities outperform most custom implementations, increasing recall accuracy by up to 15%.
Use cases
With Citations, developers can create AI solutions that offer enhanced accountability across use cases like:
- Document summarization: Generate concise summaries of long documents, like case files, with each key point linked back to its original source.
- Complex Q&A: Provide detailed answers to user queries across a large corpus of documents, like financial statements, with each response element traced back to specific sections of relevant texts.
- Customer support: Create support systems that can answer complex queries by referencing multiple product manuals, FAQs, and support tickets, always citing the exact source of information.
How it works
When Citations is enabled, the API processes user-provided source documents (PDF documents and plain text files) by chunking them into sentences. These chunked sentences, along with user-provided context, are then passed to the model with the user’s query. Alternatively, users can provide their own chunks for the source documents.
Claude analyzes the query and generates a response that includes precise citations based on the provided chunks and context for any claims derived from the source material. Cited text will reference source documents to minimize hallucinations.
This approach offers superior flexibility and ease of use, as it doesn’t require file storage and seamlessly integrates with the Messages API.
Pricing
Citations uses our standard token-based pricing model. While it may use additional input tokens to process documents, users will not pay for output tokens that return the quoted text itself.
Customer spotlight: Thomson Reuters
Thomson Reuters uses Claude to power their AI platform, CoCounsel, helping legal and tax professionals synthesize expert knowledge and deliver comprehensive advice to clients.
“For CoCounsel to be trustworthy and immediately useful for practicing attorneys, it needs to cite its work. We first built this ourselves, but it was really hard to build and maintain. That’s why we were excited to test out Anthropic’s Citations functionality. It makes citing and linking to primary sources much easier to build, maintain, and deploy to our users. This capability not only helps minimize hallucination risk but also strengthens trust in AI-generated content. The Citations feature will enable us to build an even more accurate and thorough AI assistant for lawyers,” said Jake Heller, Head of Product, CoCounsel, Thomson Reuters.
Customer Spotlight: Endex
Endex uses Claude to power an Autonomous Agent for financial firms.
“With Anthropic’s Citations, we reduced source hallucinations and formatting issues from 10% to 0% and saw a 20% increase in references per response. This removed the need for elaborate prompt engineering around references and improved our accuracy when conducting complex, multi-stage financial research,” said Tarun Amasa, CEO, Endex.
Get started
Citations is now available for the new Claude 3.5 Sonnet and Claude 3.5 Haiku. To start using Citations, explore our documentation.
Claude Citations outputs the exact passage from the cited document, along with start and end indices locating that passage within the original document string.
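As a rough sketch (not necessarily the exact code behind this demo), a citations-enabled request through the Anthropic Python SDK looks something like this, where press_release_text is a placeholder for the source document above as a string:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                # The source document, passed as a plain text content block
                "type": "document",
                "source": {
                    "type": "text",
                    "media_type": "text/plain",
                    "data": press_release_text,  # placeholder: the document above
                },
                "title": "Citations on the Anthropic API",
                # Opt in to citations for this document
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "Which Claude models support Citations?"},
        ],
    }],
)
```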
For this post, we unpack Claude’s inference response into a data structure like this:
```json
{
    "question": "Which Claude models support Citations?",
    "answer": "According to the documents, Citations is available for Claude 3.5 Sonnet and Claude 3.5 Haiku. The feature is generally available on both the Anthropic API and Google Cloud's Vertex AI.",
    "citations": [
        {
            "text": "## Get started\n\nCitations is now available for the new Claude 3.5 Sonnet and Claude 3.5 Haiku.",
            "start_index": 4114,
            "end_index": 4209,
            "document": "Citations on the Anthropic API"
        },
        {
            "text": "Citations is generally available on the Anthropic API and Google Cloud\u2019s Vertex AI.",
            "start_index": 315,
            "end_index": 400,
            "document": "Citations on the Anthropic API"
        }
    ]
}
```
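A sketch of that unpacking step, assuming plain-text documents, whose citations arrive in the Python SDK as char_location objects with cited_text, start_char_index, end_char_index, and document_title fields:

```python
def unpack(question: str, response) -> dict:
    """Flatten a Messages API response into the structure shown above."""
    # Assumes all content blocks are text blocks, as in this Q&A use case
    answer = "".join(block.text for block in response.content)
    citations = [
        {
            "text": cite.cited_text,
            "start_index": cite.start_char_index,
            "end_index": cite.end_char_index,
            "document": cite.document_title,
        }
        for block in response.content
        # Blocks without citations carry None rather than an empty list
        for cite in (getattr(block, "citations", None) or [])
    ]
    return {"question": question, "answer": answer, "citations": citations}
```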
To make the above UX work as rendered HTML, we first need to add HTML markup to the raw markdown string before rendering it, because the cited passages and their indices would no longer match the content once the markdown is rendered as HTML.
To add the highlighting, we iterate through the citations from end to beginning, so that splicing in markup does not shift the start and end indices of citations we have not yet processed.
Within each citation, we break the content up by line and wrap each line in <mark> tags, with special-case handling for markdown elements like headings and lists (see the sketch below).
Finally, we render that HTML/markdown mix as HTML, resulting in the highlighted document upon selecting a question.
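Here is a minimal sketch of that highlighting pass, assuming the citation dictionaries from the unpacking step above and a wrap_line helper (sketched under Challenges below) that handles the per-line markup:

```python
def add_highlights(markdown_doc: str, citations: list[dict]) -> str:
    """Splice <mark> markup into the raw markdown source document."""
    # Process citations in reverse index order so that earlier citations'
    # start/end indices remain valid after each splice.
    for c in sorted(citations, key=lambda c: c["start_index"], reverse=True):
        start, end = c["start_index"], c["end_index"]
        # Wrap each line separately so block-level markdown
        # (headings, list items) still renders correctly.
        highlighted = "\n".join(
            wrap_line(line) for line in markdown_doc[start:end].split("\n")
        )
        markdown_doc = markdown_doc[:start] + highlighted + markdown_doc[end:]
    return markdown_doc
```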
Challenges
Finding a way to add the highlights to the document was the biggest challenge: the highlights must be inserted before the markdown is rendered as HTML (the citation indices refer to the markdown), while also not interfering with the markdown-to-HTML renderer, so that the markup is added correctly. For example, a common problem I ran into was correct highlighting but heading content sent to the DOM as the literal text
## Get started
rather than
<h2>Get started</h2>
The issue arose when the preprocessor output
<mark class="citation-highlight"> ## Get started </mark>
instead of
## <mark class="citation-highlight"> Get started </mark>
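A hypothetical wrap_line helper avoids this by keeping block-level markdown prefixes outside the <mark> tag; the regex below covers headings, list items, and blockquotes, and a real implementation would likely need more cases:

```python
import re

# Block-level markdown prefixes that must stay outside the <mark> tag,
# otherwise the renderer treats the line as plain paragraph text
BLOCK_PREFIX = re.compile(r"^(#{1,6} |[-*+] |\d+\. |> )?(.*)$")

def wrap_line(line: str) -> str:
    if not line.strip():
        return line  # leave blank lines between blocks unwrapped
    prefix, content = BLOCK_PREFIX.match(line).groups()
    if prefix is None:
        prefix = ""
    return prefix + '<mark class="citation-highlight">' + content + "</mark>"
```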
Future work
With a similar approach, we could attach a chat interface to a document and call the Citations API in real time with whatever question a user inputs, displaying both the response and the highlighted citations from the source document. This pattern inverts approaches like RAG, which put the Q&A front and center rather than the source document. For knowledge aggregation that may be the right choice, but when focusing on a single text, it’s easy to become disconnected from the source material. Keeping the source content front and center maintains grounding in the original text while still benefiting from the power of LLM summarization.