A typical engagement
A review, audit, or forensic engagement almost always starts the same way: the client provides two to six gigabytes of documents — PDFs, scans, accounting exports. The assigned junior spends the first three days trying to figure out where things are.
You can save those three days, and stop billing them as low-value work.
Step 1 — Sort the PDFs (Saturday morning, 2 hours)
Not a perfect sort. A quick sort, into four folders:
- Accounting (FEC files, trial balances, general ledgers, balance sheets)
- Contracts (articles of association, agreements, leases, registry extracts)
- Operations (invoices, quotes, engagement reports)
- Communication (client emails, letters, internal notes)
The assistant works better if each folder has a clear label. It does not work better if you rename every file individually — don’t spend the weekend on that.
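The quick sort can be partly scripted. Here is a minimal sketch, assuming the file names carry a hint about their content; the keyword lists, the "Unsorted" fallback folder, and the function names are all hypothetical and should be adapted to how the client names its files. It copies rather than moves, so the original delivery stays intact.

```python
from pathlib import Path
import shutil

# Hypothetical keyword lists: adjust to the client's naming habits.
FOLDER_KEYWORDS = {
    "Accounting": ["fec", "balance", "ledger"],
    "Contracts": ["articles", "agreement", "lease", "registry"],
    "Operations": ["invoice", "quote", "report"],
    "Communication": ["email", "letter", "note"],
}
FALLBACK = "Unsorted"  # anything the keywords miss, to be sorted by hand


def pick_folder(filename: str) -> str:
    """Return the first folder whose keyword appears in the file name."""
    name = filename.lower()
    for folder, keywords in FOLDER_KEYWORDS.items():
        if any(keyword in name for keyword in keywords):
            return folder
    return FALLBACK


def sort_pdfs(source: Path, target: Path) -> dict[str, int]:
    """Copy every PDF under source into target/<folder>; return counts.

    Files with the same name overwrite each other in the target folder.
    """
    counts: dict[str, int] = {}
    pdfs = [p for p in sorted(source.rglob("*"))
            if p.is_file() and p.suffix.lower() == ".pdf"]
    for pdf in pdfs:
        folder = pick_folder(pdf.name)
        destination = target / folder
        destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(pdf, destination / pdf.name)
        counts[folder] = counts.get(folder, 0) + 1
    return counts
```

Whatever lands in "Unsorted" is the part worth your two hours. The rest is already in place.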
Step 2 — Check the OCR (Saturday afternoon, 1 hour)
On scans, default OCR misses 5 to 15% of characters. On pre-2015 black-and-white scans, that climbs to 25%. Check by opening three or four PDFs at random and selecting text: if you get gibberish, enable “fine OCR” mode on the affected batch.
It’s ten times slower. It’s ten times more accurate. Essential on sensitive documents (contracts, rulings, leases).
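The "select text and look for gibberish" check can be roughed out in code once you have pasted or extracted the text of a page. The sketch below is a crude heuristic, not a measure of OCR accuracy: it counts tokens that look like neither a word nor a number. The 0.15 threshold and the function names are assumptions for illustration, and the ratio is per token, not per character.

```python
VOWELS = set("aeiouyàâäéèêëîïôöùûü")
PUNCTUATION = ".,;:!?()[]\"'"
NUMBER_SYMBOLS = set(".,/-%€$")


def looks_plausible(token: str) -> bool:
    """True if the token resembles a word or a number."""
    if any(ch.isdigit() for ch in token):
        # Numbers, dates, amounts: digits plus common separators only.
        return all(ch.isdigit() or ch in NUMBER_SYMBOLS for ch in token)
    letters = token.replace("-", "").replace("'", "").replace("’", "")
    # Words: letters only, with at least one vowel.
    return letters.isalpha() and any(ch in VOWELS for ch in letters.lower())


def suspect_ratio(text: str) -> float:
    """Share of tokens that look like OCR noise (1.0 if there is no text)."""
    tokens = [t.strip(PUNCTUATION) for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 1.0  # no text layer at all is also a reason to re-run OCR
    suspect = sum(1 for t in tokens if not looks_plausible(t))
    return suspect / len(tokens)


def needs_fine_ocr(text: str, threshold: float = 0.15) -> bool:
    """Flag a page for fine OCR; the threshold is an arbitrary starting point."""
    return suspect_ratio(text) > threshold
```

Run it on the three or four pages you sampled. It will not catch a scan that produces clean-looking but wrong words, so keep the eyeball check for contracts and leases.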
Step 3 — Ingest in batches (Saturday evening)
In the platform, create an assistant named "Engagement [Client] — [Period]". Configure it with a watertight scope: its data is never shared with other client files.
Upload in 1-2 GB batches. Indexing runs in parallel; allow 10 to 15 minutes for 1,000 standard documents. Launch Sunday morning, go have breakfast, come back.
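To prepare the batches before uploading, a simple greedy split by file size is enough. This is a sketch under assumptions: the 2 GB cap comes from the range above, the function names are hypothetical, and the time estimate simply scales the 10 to 15 minutes per 1,000 documents linearly, which the platform does not guarantee.

```python
from pathlib import Path

GIB = 1024 ** 3


def list_files(folder: Path) -> list[tuple[str, int]]:
    """Return (path, size in bytes) for every file under folder."""
    return [(str(p), p.stat().st_size)
            for p in sorted(folder.rglob("*")) if p.is_file()]


def make_batches(files, max_bytes=2 * GIB):
    """Group files in order into batches that stay under max_bytes.

    A single file larger than max_bytes gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def indexing_minutes(document_count: int) -> tuple[float, float]:
    """Rough (low, high) estimate, assuming linear scaling of 10 to 15 min per 1,000."""
    thousands = document_count / 1000
    return (10 * thousands, 15 * thousands)
```

For example, `make_batches(list_files(Path("Accounting")))` gives the upload groups for one folder, and `indexing_minutes(3000)` suggests planning for roughly 30 to 45 minutes.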
Step 4 — Test with ten questions (Sunday morning, 1 hour)
Ten questions, written before loading. That’s the rule — otherwise you adapt your questions to what you get and the evaluation is worthless.
Three easy questions (a date, an amount, a name), five realistic ones (synthesis, comparison, cross-document search), two hard ones (cross-reference, exception, ambiguity).
For each question, score: correct answer (yes/no), source citation (yes/no), hallucination (yes/no).
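The scoring grid fits in a few lines if you prefer a script to a spreadsheet. A minimal sketch follows; the class and field names are hypothetical, and the three yes/no columns are the ones described above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    question: str
    difficulty: str      # "easy", "realistic" or "hard"
    correct: bool        # correct answer (yes/no)
    cited: bool          # source citation (yes/no)
    hallucinated: bool   # hallucination (yes/no)


def summarize(results: list[Result]) -> dict:
    """Tally the three scores overall and correct answers per difficulty."""
    summary = {
        "total": len(results),
        "correct": sum(r.correct for r in results),
        "cited": sum(r.cited for r in results),
        "hallucinated": sum(r.hallucinated for r in results),
    }
    by_difficulty: dict[str, tuple[int, int]] = {}
    for r in results:
        right, asked = by_difficulty.get(r.difficulty, (0, 0))
        by_difficulty[r.difficulty] = (right + int(r.correct), asked + 1)
    summary["by_difficulty"] = by_difficulty  # difficulty -> (correct, asked)
    return summary
```

Fill in the ten `Result` rows by hand after reading each answer against the source document. The script only does the counting; the judgment stays with you.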
The three essential guardrails
- Per-engagement isolation. No client assistant should have access to another client’s data. Non-negotiable; it’s the GDPR baseline.
- Systematic citation. Every answer must come with the source document. If the assistant answers without citing, you have a configuration problem — fix it before going further.
- Polite refusal. If the information isn’t in the data, the assistant must say so (“I can’t find this information in the file”). No invention, no extrapolation.
Bottom line: one weekend between the PDF stack and a working assistant. Three days of engagement work saved, redirected to advisory hours. See also "Good documentation makes a good assistant" for the upstream audit phase.