Accelerating Answers with Retrieval‑Augmented Generation (RAG)

July 12, 2025
3 min read

Imagine asking a medical librarian a question: they pull the most relevant papers from the shelves and then craft a clear explanation for you. Retrieval‑Augmented Generation (RAG) does the same thing in software. First it retrieves supporting documents, then an AI language model generates an answer grounded in that evidence. The result: conversational responses that stay tethered to real data.
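For readers who like to see the moving parts, here is a minimal sketch of that retrieve-then-generate loop in Python. The `embed` and `generate` callables are hypothetical stand-ins for an embedding model and a language model client, not Docviser's actual API:

```python
import numpy as np

# Minimal retrieve-then-generate loop. `embed` and `generate` are
# hypothetical stand-ins for an embedding model and an LLM client.
def retrieve(question, docs, embed, top_k=3):
    # Embed the question once, then rank documents by cosine similarity.
    q = embed(question)
    q = q / np.linalg.norm(q)
    scored = []
    for doc in docs:
        v = embed(doc)
        scored.append((float(q @ (v / np.linalg.norm(v))), doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def answer(question, docs, embed, generate):
    # Ground the model: pass only the retrieved evidence in the prompt.
    evidence = retrieve(question, docs, embed)
    prompt = ("Answer using only the evidence below.\n\n"
              + "\n---\n".join(evidence)
              + f"\n\nQuestion: {question}")
    return generate(prompt)
```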

In oncology, where every detail matters and the stakes couldn't be higher, this grounding is essential. A misplaced decimal can change a treatment plan. An overlooked lab trend can delay critical interventions. That's why Docviser's RAG implementation goes beyond the basics—we've built a system that understands medical context, preserves clinical relationships, and delivers answers you can trust.


The limits of "simple" RAG

Early RAG systems worked like a paper shredder: they chopped every file into equal‑sized chunks and hoped the right snippet would turn up later. That approach has two big drawbacks:

Fragmented context – Key details often live across chunk boundaries, so the model can't see the full picture.

Noise in, noise out – If unrelated text slips into the retrieval set, answers become fuzzy or inaccurate.

In medical settings, these limitations are unacceptable. A treatment decision might depend on understanding the relationship between a lab result from last week and a symptom noted yesterday—information that could easily be split across different chunks in a naive RAG system.
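To make that failure mode concrete, here is a toy illustration (not Docviser code) of fixed-size chunking separating a lab value from its date:

```python
note = ("Visit date 2025-07-05. CBC drawn this morning: "
        "haemoglobin 9.8 g/dL, down from 11.2.")

# Naive fixed-size chunking: slice every 40 characters, no overlap.
chunks = [note[i:i + 40] for i in range(0, len(note), 40)]
for chunk in chunks:
    print(repr(chunk))
# The visit date lands in the first chunk and the haemoglobin value in
# the second, so a retriever can surface one without the other.
```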


Agentic RAG – a smarter way to chunk

Docviser uses an agentic form of RAG. Think of it as a smart assistant that actively decides how to slice each document:

Adaptive chunk sizes – The agent breaks text at logical places (section headers, tables, signature blocks) rather than fixed token counts.

Context‑aware linking – It preserves relationships between sections so the language model can reason across them.

The payoff is better grounding and sharper answers—especially important in oncology, where precision matters more than anywhere else.
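As a simplified illustration, boundary-aware splitting might look like the sketch below, which assumes ALL-CAPS section headers mark the logical breaks; Docviser's agent applies far richer rules:

```python
import re

# Toy boundary-aware chunking: split where an ALL-CAPS section header
# begins instead of at a fixed character or token count.
HEADER = re.compile(r"\n(?=[A-Z][A-Z ]+:)")

def chunk_by_section(document: str) -> list[str]:
    return [part.strip() for part in HEADER.split(document) if part.strip()]

report = ("HISTORY: 62-year-old with stage III NSCLC.\n"
          "LABS: Haemoglobin 9.8 g/dL on 2025-07-05.\n"
          "PLAN: Hold cycle 4 pending repeat CBC.")

for chunk in chunk_by_section(report):
    print(chunk)
# Each section header stays attached to its own content.
```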


Docviser's secret advantage: knowing the document type

Most RAG pipelines treat every upload the same. Docviser is different. When clinicians upload a file, they tag it as a lab result, an imaging report, an observation note, and so on. That simple choice unlocks specialised processing:


What Docviser's pipeline does with each document type

Document Type       Pipeline Action
Lab result          Parses value‑unit pairs, normal ranges, and timestamps for trend analysis.
Observation note    Detects problem–action statements to flag changes in patient status.
Imaging report      Extracts key findings and links them to body sites.


Because we know what a file is, we can apply the right extractor, store clean structured data, and feed more precise context back into the model.
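A minimal sketch of that dispatch pattern is shown below. The extractor functions are illustrative placeholders, not Docviser's internal modules:

```python
import re

# Type-based dispatch: the tag chosen at upload time selects an
# extractor. These functions are illustrative, not Docviser's modules.
VALUE_UNIT = re.compile(r"([A-Za-z]+)\s+(\d+(?:\.\d+)?)\s*(g/dL|mg/dL|mmol/L|%)")

def extract_lab_result(text):
    # Parse (analyte, value, unit) triples such as "haemoglobin 9.8 g/dL".
    return [{"analyte": a, "value": float(v), "unit": u}
            for a, v, u in VALUE_UNIT.findall(text)]

def extract_observation_note(text):
    # Placeholder: the real pipeline detects problem-action statements.
    return {"statements": [s.strip() for s in text.split(".") if s.strip()]}

def extract_imaging_report(text):
    # Placeholder: the real pipeline links findings to body sites.
    return {"raw_findings": text}

EXTRACTORS = {
    "lab_result": extract_lab_result,
    "observation_note": extract_observation_note,
    "imaging_report": extract_imaging_report,
}

def process_upload(doc_type, text):
    extractor = EXTRACTORS.get(doc_type)
    if extractor is None:
        raise ValueError(f"unknown document type: {doc_type}")
    return extractor(text)
```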


Leveraging Docviser's own structured data

Docviser isn't starting from scratch; it already holds rich, well‑organised information:

  • Document types and section names are indexed the moment a file arrives.
  • Previous prescriptions are converted to plain text so the model can reference dosing history.
  • Patient demographics, tumour staging, and biomarker results live in readily searchable fields.

This internal knowledge graph acts as a first‑class retrieval source, reducing the need to scan every PDF again and again.
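One common pattern, sketched below with hypothetical field and function names, is to consult the structured store first and fall back to full-text chunk retrieval only when it comes up empty:

```python
# Consult structured fields first; fall back to full-text chunk search
# only when they can't answer. Field and function names are assumptions.
def build_context(patient: dict, question: str, search_chunks) -> list[str]:
    context = []
    q = question.lower()
    if "stage" in q and patient.get("tumour_stage"):
        context.append(f"Tumour stage: {patient['tumour_stage']}")
    if "dose" in q or "prescription" in q:
        context += [f"Prior prescription: {p}"
                    for p in patient.get("prescriptions", [])]
    if not context:
        context = search_chunks(question)  # the expensive path
    return context
```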


Handling the wild cards: unstructured uploads

When a user drops in a scan or photo—anything without metadata—our Agentic RAG kicks in:

Visual‑text extraction – OCR or vision models pull the raw text.

Proposition‑based chunking – Instead of slicing by character count, we split at complete statements (propositions). It's pricier to run, but keeps context intact; a simplified sketch follows this list.

Dynamic retrieval – An agent chooses which propositions answer the doctor's question and passes only those to the language model.
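Here is a simplified, rule-based stand-in for the proposition step; production systems often use an LLM to rewrite text into self-contained claims, which is part of what makes the approach pricier:

```python
import re

# Rule-based stand-in for proposition chunking: split at sentence
# boundaries so every chunk is a complete statement. Production systems
# often use an LLM to rewrite text into self-contained claims instead.
SENTENCE = re.compile(r"(?<=[.!?])\s+")

def propositions(text: str) -> list[str]:
    return [s.strip() for s in SENTENCE.split(text) if s.strip()]

ocr_text = ("CT chest 2025-06-30. New 8 mm nodule in the right upper lobe. "
            "No mediastinal lymphadenopathy.")
for p in propositions(ocr_text):
    print(p)
```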

Since introducing proposition chunking, we've seen a marked jump in answer accuracy and fewer "hallucinated" citations.


Why this matters for oncology care

Faster case reviews – Oncologists get precise answers in seconds instead of paging through 40‑page PDFs.

Higher trust – Every response links back to the exact page in the source document; the sketch after this list shows one way that works.

Less data entry – Structured fields are filled automatically, so doctors spend less time typing and more time with patients.
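A generic way to support page-level citations is to carry origin metadata with every chunk. A minimal sketch under that assumption (the field names are ours, not Docviser's schema):

```python
from dataclasses import dataclass

# A generic pattern for page-level citations: every chunk carries its
# origin. Field names are assumptions, not Docviser's schema.
@dataclass
class Chunk:
    text: str
    document_id: str
    page: int

def with_citation(chunk: Chunk) -> str:
    return f"{chunk.text} [source: {chunk.document_id}, p. {chunk.page}]"

snippet = Chunk("Haemoglobin 9.8 g/dL on 2025-07-05.", "cbc-2025-07-05.pdf", 2)
print(with_citation(snippet))
```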

The impact is measurable: clinicians report spending 30% less time on documentation and feeling more confident in their treatment decisions when they can quickly verify information against source documents.


Coming up next

Our next article will explore Docviser Agents—the dynamic sidekicks that plan multi‑step reasoning paths when a single query isn't enough. While RAG provides the foundation for grounded answers, Agents take it further by orchestrating complex clinical workflows that require multiple steps, conditional logic, and human-in-the-loop validation.


This piece is part of our series on Docviser's AI‑enabled oncology platform. Previous: Streamlining Oncology Care with Docviser Workflows. Next: Docviser Agents and Multi-Step Clinical Reasoning.
