By March, the team had been working on the same dataset for four years. Seventeen papers were drafted. None were submitted. The bottleneck was not the science — the science was solved. The bottleneck was the synthesis: tens of thousands of pages of literature to be cross-referenced, reconciled, and grounded against an in-house corpus that no individual researcher could hold in their head.
The team was not short on intelligence. They were short on a way to deploy it.
This is the story of what changed.
The build
Over six weeks in April and May, our data scientists built a private RAGbrain, which we call The Scientist, trained on sixty years of biomedical literature, peer-review correspondence, and the structural conventions of method papers in the Marvakin team's field. We then deployed it against the team's vault: every working document, every dataset, every prior submission, every reviewer note.
The result was not a chatbot. It was a piece of infrastructure.
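For readers who want a mental model of what "deployed against the vault" means mechanically, here is a minimal sketch of the retrieval core. Everything in it, from the `VaultChunk` structure to the toy `embed` function, is illustrative rather than RAGböx's actual API; a production system would use a real embedding model and a vector store.

```python
from dataclasses import dataclass
import math

@dataclass
class VaultChunk:
    doc_id: str            # which vault document the text came from
    text: str              # the chunk itself
    vector: list[float]    # its embedding

def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model, so the sketch runs as-is."""
    vec = [0.0] * 64
    for i, ch in enumerate(text.lower()):
        vec[(i * 31 + ord(ch)) % 64] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def index_vault(documents: dict[str, str], chunk_size: int = 400) -> list[VaultChunk]:
    """Split every vault document into fixed-size chunks and embed each one."""
    index: list[VaultChunk] = []
    for doc_id, body in documents.items():
        for start in range(0, len(body), chunk_size):
            chunk = body[start:start + chunk_size]
            index.append(VaultChunk(doc_id, chunk, embed(chunk)))
    return index

def retrieve(index: list[VaultChunk], query: str, k: int = 5) -> list[VaultChunk]:
    """Return the k chunks most similar to the query, by cosine similarity."""
    q = embed(query)
    return sorted(index,
                  key=lambda c: -sum(a * b for a, b in zip(q, c.vector)))[:k]
```

The point of the sketch is its shape: everything the system says has to come back through retrieval, which means everything it says is addressable to a specific place in the vault.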
"It is not a question of whether the model is smart. The question is whether the brain on top of the model can hold the literature, the priors, and the team's own thinking in one frame. Until now, the answer has been no." — Dr. Benford, principal investigator
What the team did with it
Within forty-eight hours of deployment, the team was using the Scientist RAGbrain to draft the introduction sections of three papers in parallel — each grounded in the team's own data and the relevant prior literature, with citations the team could open and verify in a single click.
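One-click verification implies a structural invariant: a drafted sentence is not a string, it is a string plus pointers into the vault. A minimal sketch of that invariant, reusing the vault dictionary from the indexing sketch above; the `Citation` and `DraftSentence` types are hypothetical, invented here for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Citation:
    doc_id: str    # vault document the claim is grounded in
    offset: int    # character offset of the supporting passage

@dataclass
class DraftSentence:
    text: str
    citations: list[Citation] = field(default_factory=list)

def audit(draft: list[DraftSentence], vault: dict[str, str]) -> list[str]:
    """Return every citation problem in a draft; an empty list means fully verifiable."""
    problems = []
    for sentence in draft:
        if not sentence.citations:
            problems.append(f"uncited claim: {sentence.text!r}")
        for cite in sentence.citations:
            doc = vault.get(cite.doc_id)
            if doc is None or cite.offset >= len(doc):
                problems.append(f"dangling citation in: {sentence.text!r}")
    return problems
```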
By week three, the team had moved into method-section work — historically the most time-consuming portion of a biomedical paper, and the section most vulnerable to reviewer challenge. The Scientist had been trained to refuse to write a method section if the underlying protocol was missing or under-specified. This refusal was, paradoxically, what made the team trust it.
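One plausible way to picture that refusal is a plain precondition check that runs before any drafting happens. The required fields below are invented for illustration; the real completeness criteria would come from the field's reporting standards, not from this sketch.

```python
# Hypothetical completeness bar for a protocol document; these field
# names are illustrative, not a real reporting standard.
REQUIRED_PROTOCOL_FIELDS = {"sample_prep", "instrumentation", "controls",
                            "statistical_analysis"}

def can_draft_methods(protocol: dict[str, str] | None) -> tuple[bool, str]:
    """Refuse to draft a method section unless the protocol is fully specified."""
    if protocol is None:
        return False, "refused: no protocol indexed in the vault"
    missing = REQUIRED_PROTOCOL_FIELDS - {k for k, v in protocol.items() if v.strip()}
    if missing:
        return False, f"refused: protocol under-specified, missing {sorted(missing)}"
    return True, "ok"
```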
What it would not do
The Scientist would not write a discussion section unless all four prior submissions in the series were indexed in the vault. It would not draw a conclusion from a dataset that had not been verified. When asked to extrapolate beyond what the corpus supported, it returned the same line every time: insufficient context to answer.
This is not a limitation. It is the product.
Most AI tools are graded on what they will say. RAGböx is graded on what it will not say. The team called this the discipline.
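Mechanically, the discipline can be pictured as a gate on retrieval strength: if the best support the vault can offer falls below a threshold, the fixed refusal line comes back instead of a draft. A minimal sketch follows, reusing the `embed`, `retrieve`, and `VaultChunk` names from the indexing sketch above; the threshold value and the `grounding_score` helper are assumptions, not RAGböx internals.

```python
INSUFFICIENT = "insufficient context to answer"

def grounding_score(query: str, hits: list[VaultChunk]) -> float:
    """Best cosine similarity among retrieved chunks (a toy grounding signal)."""
    q = embed(query)
    return max((sum(a * b for a, b in zip(q, h.vector)) for h in hits), default=0.0)

def answer(index: list[VaultChunk], question: str, threshold: float = 0.6) -> str:
    """Answer only from retrieved context; refuse with one fixed line otherwise."""
    hits = retrieve(index, question)
    if grounding_score(question, hits) < threshold:
        return INSUFFICIENT      # the refusal is the contract, not a failure mode
    # A real system would hand `hits` to a model for drafting; the sketch
    # just surfaces the supporting passages to stay self-contained.
    return " ".join(h.text for h in hits)
```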
"I have spent half my career as a peer reviewer rejecting papers because the authors fabricated, knowingly or not. The Scientist will not let you fabricate. That is its single most important property." — Dr. Benford
The numbers
Of the nine papers submitted in the ninety days after deployment, six are now in revision at first-tier journals. Two are accepted. One is in re-review. The team's average submission-to-revision time is now eleven days, down from a multi-month historical baseline.
None of these numbers were possible six months ago. They are, today, the new floor.
What this means for everyone else
Marvakin is not a special case. It is a leading indicator. Every team whose output is bottlenecked by synthesis — every legal team, every diligence team, every regulatory team, every research team — is structurally identical to the Marvakin team. They are short on a way to deploy what they already know.
RAGböx exists to fix that.