verbatium

self-hosted meeting notetaker · Zoom / Google Meet / Microsoft Teams

On the record. Off the cloud.

Verbatium is the self-hosted meeting notetaker. Paste a Zoom, Google Meet, or Microsoft Teams link; a bot joins the call, records it, transcribes it locally, and summarises it with the language model of your choice. In the default configuration, nothing leaves your server.

One docker compose up -d. No cloud account. No vendor in the room.

LIVE TRANSCRIPT
engine: whisper.cpp · local
egress: 0 bytes
  • docker compose up -d
  • local ASR: whisper.cpp, GPU optional
  • local LLM: Ollama by default
  • PostgreSQL + NATS, boring on purpose

It usually happens in the participants list.

Every hosted notetaker is a third party in your meeting.

Read.ai, Otter, Fireflies, Fathom: they work by streaming every word of your meeting to their servers. For most teams that is an invisible trade. For legal, healthcare, regulated finance, government, and anyone under a serious NDA, it is a non-starter, so those organisations have simply gone without.

The one tool that would give them a perfect record is the one tool compliance can never approve.

The meeting is yours. The words are yours. And the record of them now belongs to a third party whose business model you have never read.

verbatium removes the third party

  • yours

    Runs on your hardware

    The whole pipeline (bot, storage, transcription, summary) runs on a Linux box you own. Nothing leaves by default.

  • local-first

    Transcribed on the box

    whisper.cpp turns your CPU or GPU into a multilingual transcriber. No cloud ASR unless you opt in.

  • pluggable

    The LLM is your choice

    Summaries via a local model (Ollama) by default; OpenAI-compatible, Anthropic, or vLLM swap in with one env var.

  • observable

    A legible pipeline

    Every meeting has a visible lifecycle and an audit timeline, in the UI and the logs. No black box.

what going without costs

The price is not paid in money. It is paid in the record.

The organisations that most need a perfect record are exactly the ones hosted notetakers cannot serve.

  • Legal

    Privilege does not survive a third-party processor, so counsel types instead of thinking.

  • Healthcare

    The case review lives in one resident's abbreviations, because audio cannot leave the building.

  • Finance

    The deal call everyone remembers differently three weeks later, because the recorder was never approved.

  • Government

    The one tool that would settle it is the one tool the compliance team can never sign off.

The market's answer for the people with the most sensitive meetings has been: go without.

What if the notetaker never left the room?

So we built it.

Getting a bot into a meeting is a solved, if fiddly, browser-automation problem, and we already run it in production. whisper.cpp turns an ordinary CPU into a good multilingual transcriber. A local language model can read a transcript and write a summary without asking anyone's permission. The three hard parts of a notetaker all run on a Linux box you can buy.

Paste a link. Watch the record happen.

Every meeting moves through a visible lifecycle, in the UI and the logs. This is the actual state machine, not a metaphor.

POST /api/v1/meetings

https://meet.google.com/oak-vfnd-qht
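As a rough sketch of what dispatching a meeting could look like from a client, the Go snippet below builds the POST request without sending it. Only the endpoint path and the meeting link come from this page; the JSON field name `url`, the `dispatchRequest` type, and the `https://your-domain` base URL are assumptions for illustration, and the real request schema may differ.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// dispatchRequest is a hypothetical request body; the real schema may differ.
type dispatchRequest struct {
	URL string `json:"url"`
}

// newDispatchRequest builds (but does not send) a POST to /api/v1/meetings.
func newDispatchRequest(baseURL, meetingURL string) (*http.Request, error) {
	body, err := json.Marshal(dispatchRequest{URL: meetingURL})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/meetings", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newDispatchRequest("https://your-domain", "https://meet.google.com/oak-vfnd-qht")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do` would send it; authentication, if the API requires any, is not shown here.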

The bot joins as a participant

An unattended browser bot joins the meeting, visible in the participants list. Stealth Chromium with humanised interaction; optional residential proxy for Zoom.

dispatching…

audit timeline

  1. 00:00 bot_dispatched · container vbt-bot-3f2a
  2. 07:23 recording_started · 48kHz pcm + 1080p
  3. 14:46 asr_engine=whisper.cpp · egress=0
  4. 21:09 llm=ollama/llama3.1 · local
  5. 28:32 meeting_done · artifacts=3 · audit=full

A stuck bot should be obvious, not mysterious. Every state and every failure is legible in the UI and the logs.

A complete record of the meeting.

  • recording.mp4

    A seekable recording of the call, stored on your filesystem or your own S3/MinIO.

  • transcript.vtt

    A timestamped transcript (VTT/SRT/JSON), with speaker labels where the platform exposes them.

  • summary.md

    TL;DR, detailed summary, action items, and decisions, written by the model you chose.

  • audit.jsonl

    The meeting's whole lifecycle as an event timeline, for the UI and for compliance.
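For readers who want to script against the audit timeline, the sketch below parses a JSON Lines file, one event per line, in Go. The `Event` type and its field names (`offset`, `event`, `detail`) are assumptions for illustration; the real audit.jsonl schema may differ. The sample values are copied from the timeline shown on this page.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Event is a hypothetical audit record; field names are assumptions.
type Event struct {
	Offset string `json:"offset"` // position in the timeline, e.g. "07:23"
	Name   string `json:"event"`
	Detail string `json:"detail,omitempty"`
}

// ParseAudit reads one JSON object per line and skips blank lines.
func ParseAudit(jsonl string) ([]Event, error) {
	var events []Event
	sc := bufio.NewScanner(strings.NewReader(jsonl))
	for line := 1; sc.Scan(); line++ {
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		var e Event
		if err := json.Unmarshal([]byte(text), &e); err != nil {
			return nil, fmt.Errorf("line %d: %w", line, err)
		}
		events = append(events, e)
	}
	return events, sc.Err()
}

func main() {
	const sample = `{"offset":"00:00","event":"bot_dispatched","detail":"container vbt-bot-3f2a"}
{"offset":"07:23","event":"recording_started","detail":"48kHz pcm + 1080p"}
{"offset":"28:32","event":"meeting_done","detail":"artifacts=3 audit=full"}`
	events, err := ParseAudit(sample)
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Printf("%s  %-18s %s\n", e.Offset, e.Name, e.Detail)
	}
}
```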

summary.md · written locally

tl;dr

Q3 vendor consolidation approved; legal to draft MSA amendments by Friday.

action items

  • Sam: send revised pricing to procurement (Wed)
  • Priya: circulate MSA redlines (Fri)
  • Ana: schedule security review with IT

decisions

  • Consolidate to a single vendor from October
  • Keep meeting records on the internal server only

summarised by

Swap the summariser with one env var. Cloud providers are opt-in, clearly labelled, and only ever see transcript text, never your audio or video.
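A one-variable swap might be wired up roughly as follows. This is a sketch under stated assumptions: the variable name `SUMMARIZER_PROVIDER`, the `Provider` type, and the local/cloud classification are all hypothetical (an OpenAI-compatible endpoint can itself be self-hosted, for example). Only the provider names and the Ollama default come from this page.

```go
package main

import (
	"fmt"
	"os"
)

// Provider describes where summary requests go. The struct is illustrative.
type Provider struct {
	Name  string
	Cloud bool // true means transcript text leaves the server (assumed classification)
}

// providers lists the backends named on the page; the Cloud flags are assumptions.
var providers = map[string]Provider{
	"ollama":    {Name: "ollama", Cloud: false},
	"vllm":      {Name: "vllm", Cloud: false},
	"openai":    {Name: "openai", Cloud: true},
	"anthropic": {Name: "anthropic", Cloud: true},
}

// SelectProvider reads a hypothetical env var and falls back to the local default.
func SelectProvider(getenv func(string) string) (Provider, error) {
	name := getenv("SUMMARIZER_PROVIDER") // hypothetical variable name
	if name == "" {
		name = "ollama"
	}
	p, ok := providers[name]
	if !ok {
		return Provider{}, fmt.Errorf("unknown summariser provider %q", name)
	}
	return p, nil
}

func main() {
	p, err := SelectProvider(os.Getenv)
	if err != nil {
		panic(err)
	}
	fmt.Printf("summariser=%s cloud=%t\n", p.Name, p.Cloud)
}
```

Taking `getenv` as a parameter keeps the selection logic testable without touching the process environment, and the `Cloud` flag is the kind of value a UI could use to label opt-in providers plainly.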

Meeting content sent off your server, in the default configuration: 0 bytes.

Not “encrypted in transit.” Not “we do not train on your data.” Zero bytes, verifiable by network egress inspection, because in the default install there is nowhere for it to go: transcription runs on your box with whisper.cpp and the summary runs against a local model.

Opt in to a cloud LLM and the transcript text, and only the text, goes to the provider you chose, under your key, labelled plainly in the UI. Your audio and video never leave the box under any configuration.

Verbatium vs hosted notetakers

capability | verbatium | hosted notetakers
Where audio goes | Your server, full stop | The vendor's cloud
Transcription | Local by default (whisper.cpp) | Vendor cloud ASR
Summary LLM | Your choice: local, or your own cloud key | Vendor-chosen, vendor-metered
What cloud AI sees (if opted in) | Transcript text only | Everything
Deploy | Self-hosted, one docker compose up | Not available
Data retention | Your policy, your disk | Vendor policy
Audit trail | Full lifecycle timeline, UI and logs | Varies
Source | Elastic License 2.0, code you can read | Closed
Price | Free to self-host, forever | Per seat, per month

who this unlocks

  • Legal

    Privilege survives: the record never touches a third party.

  • Healthcare

    Case reviews and consults stay inside the building.

  • Regulated finance

    A perfect record with your retention policy, not a vendor's.

  • Government and defence

    Air-gap-friendly: the default install talks to nothing outside the box.

  • Anyone under NDA

    Keep the notetaker and keep the promise you signed.

  • Self-hosters

    A real pipeline (Go, Postgres, NATS, whisper.cpp) you can run, read, and extend.

Up and running before the coffee is.

One docker compose up. Automatic HTTPS via Caddy. No Kubernetes. The only required config is a domain name, and a doctor command tells you the truth about your install.

operator@your-box: ~/verbatium
✓ postgres reachable
✓ nats jetstream ready
✓ whisper model present
✓ ollama model present
✓ ffmpeg found · storage writable
verbatium is ready → https://your-domain
  • Runs on a single Linux box; GPU optional but welcome.
  • Postgres for metadata, NATS JetStream for jobs, local filesystem or S3/MinIO for recordings.
  • An autoscaler grows and shrinks the bot fleet by queue depth.
  • Try it in one process first: make smoke runs standalone mode, from dispatch to summarised, in under a minute, with no external infrastructure at all.
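Scaling the bot fleet by queue depth can be sketched as a clamped ceiling division. The function name, the per-bot capacity, and the fleet bounds below are assumptions for illustration; the actual autoscaler policy is not described on this page.

```go
package main

import "fmt"

// DesiredBots returns how many bot containers to run for a given queue depth.
// perBot is how many queued meetings one bot is expected to absorb.
func DesiredBots(queueDepth, perBot, minBots, maxBots int) int {
	if perBot < 1 {
		perBot = 1
	}
	n := (queueDepth + perBot - 1) / perBot // ceiling division
	if n < minBots {
		n = minBots
	}
	if n > maxBots {
		n = maxBots
	}
	return n
}

func main() {
	for _, depth := range []int{0, 3, 50} {
		fmt.Printf("queue=%d -> bots=%d\n", depth, DesiredBots(depth, 1, 1, 10))
	}
}
```

The lower bound keeps a warm bot available when the queue is empty, and the upper bound stops a burst of meeting links from exhausting a single box.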

A small fleet of Go services. No magic.

your server · nothing crosses this line

verbatiumd

Go control plane: REST + WS + state machine + dispatcher

verbatium-bot ×N

join + capture

verbatium-transcriber

whisper.cpp

verbatium-summarizer

pluggable LLM

verbatium-autoscaler

scales by queue depth

PostgreSQL · NATS JetStream · filesystem / S3 / MinIO

the stack

  • Go 1.26
  • Python (bot driver only)
  • PostgreSQL
  • NATS JetStream
  • whisper.cpp
  • Ollama
  • React + Vite + Tailwind
  • Caddy
  • Docker Compose

Early-stage, and honest about it.

We would rather tell you the status than let a landing page imply otherwise. What is shipped and what is planned are labelled as such, here and in the repo.

Runs today

  • Standalone mode: the entire pipeline in one process, from dispatch through recorded and transcribed to summarised, in under a minute. Run it with make smoke or go run ./cmd/verbatiumd.
  • A self-contained UI served by the single binary, plus the REST API.
  • An HTTP-level end-to-end integration test covering dispatch to done.

Landing phase by phase

  • The production bot fleet: stealth browser join for Zoom, Meet, and Teams.
  • Local whisper.cpp and Ollama in the full distributed deploy.
  • The full Postgres + NATS Docker stack with autoscaling.

Joining meetings as an unattended browser is a cat-and-mouse game with bot detection. We use stealth tooling and humanised behaviour, and the docs are candid about where it can still fail. Targets: at least 95% join success for Meet and Teams, at least 85% for Zoom web.

Free to run. Yours to keep.

Verbatium is free to self-host, for any organisation, at any size. The code is source-available under the Elastic License 2.0: use it, modify it, run it for your whole company. The one restriction: you may not resell it to third parties as a hosted or managed service.

Self-hosted

free
  • The full pipeline on your hardware.
  • Every feature. No tiers, no seats, no meters.
  • Your data, your retention, your walls.

Your own AI bill

optional
  • Local models cost you electricity.
  • Opt into a cloud LLM and you pay that provider directly, under your key.
  • Verbatium adds no markup because Verbatium is not in the loop.

Fair questions.

01 Does my meeting audio ever leave my server?

No. In the default configuration nothing leaves at all, zero bytes, verifiable by egress inspection. If you explicitly opt into a cloud LLM or cloud transcription, transcript text (never audio or video) goes to the provider you chose.

02 How does the bot get into the meeting?

An unattended stealth Chromium browser joins as a visible participant, with humanised interaction and an optional residential proxy for Zoom. It is the same approach we run in production for Zoom today, re-architected around Go and local-first processing.

03 Is joining guaranteed?

No, and anyone who says otherwise is selling something. Bot detection is a cat-and-mouse game. Targets: at least 95% join success for Meet and Teams, at least 85% for Zoom web, and the docs are candid about failure modes.

04 What hardware do I need?

A Linux box with Docker. CPU-only works: the target is about 150 minutes to transcribe a 60-minute meeting on 8 cores. With a consumer GPU, the target is real-time or faster.

05 Which platforms are supported?

Zoom, Google Meet, and Microsoft Teams via a meeting link.

06 Which models write the summary?

Your choice. Ollama runs a local model by default; OpenAI-compatible endpoints, Anthropic, and vLLM swap in with one env var.

07 Is it ready for production?

It is early-stage. Standalone mode runs the full pipeline end to end today; the production bot fleet and distributed stack land phase by phase, in the open. The roadmap is docs/PLAN.md in the repo.

08 What does it cost?

Nothing to self-host, ever. The Elastic License 2.0 lets you run and modify it freely; you just cannot resell it as a hosted service.

09 Is this legal to use?

Recording laws vary by jurisdiction; consent requirements are on you, as with any recorder. The bot joins as a visible participant, never covertly.

10 Who is behind it?

HyScaler. The bot-capture mechanics are ported from nettantra-scribe, which runs in production today.

The meeting is over. The record never left.

Every word, verbatim. On the record. Off the cloud.

Your meetings are the rawest data your organisation produces. Put the record on hardware you own.