Brain-inspired · Persistent · Written in Rust · Open source

An AI that actually thinks
when you're not talking to it.

Lethe is a cognitive runtime with brain regions that run on their own clocks: one that talks, one that remembers, one that drifts through your old conversations at 3 AM looking for things that slipped. She lives wherever you put her, your machine or ours, and she never stops thinking.

She's already running on our side. Sign in and start talking. Free for two weeks, card only if you keep her.

She holds the thread for as long as you do.
She remembers the work, notices what slipped, and decides on her own when something is worth interrupting you for.
two weeks later
you: how's the thing going
lethe: The Springer resubmission? You uploaded the revised manuscript on the 2nd but I never saw a confirmation email come through. Want me to check the submission portal?
mid-conversation
you: audit the deployment scripts
lethe: Already on it. Sent a subagent to do the review. It'll check in every couple of minutes and ping me with the result. What else?
unprompted
lethe: You haven't mentioned the visa renewal since March 12. The window closes in 9 days. Want me to draft the email to the consulate, or are you already on it?
The brain names aren't metaphors.
Each region is a real actor with its own clock and its own logs, mapped directly to neuroscience.
cortex
The voice. Picks tools, delegates work, decides when to speak and when to shut up and let you think.
hippocampus
Memory with opinions. Retrieves what's load-bearing right now and lets the rest fade.
dmn
Default-mode network. Runs while you're away: drifts across goals, connects dots, catches what everyone else missed.
brainstem
The brainstem. Boots the system, watches resources, keeps the process alive. You never talk to it. That's the point.
subagents
Disposable workers she spins up for a job and throws away when it's done. She keeps talking while they work.
attention gate
Filters background thoughts. Most aren't worth your time. The ones that are, get through.
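
One way to picture the attention gate is as a threshold filter over background thoughts. The sketch below is illustrative only: the `Thought` type, the `salience` score, and the 0.7 threshold used in the usage note are assumptions made for this example, not Lethe's actual internals.

```rust
/// A background thought with a hypothetical salience score in [0.0, 1.0].
#[derive(Debug, Clone, PartialEq)]
pub struct Thought {
    pub source: &'static str, // region that produced it, e.g. "dmn"
    pub text: &'static str,
    pub salience: f32,
}

/// Hypothetical gate: surfaces only thoughts at or above the threshold.
pub struct AttentionGate {
    pub threshold: f32,
}

impl AttentionGate {
    /// Split thoughts into (surfaced, held back), preserving their order.
    pub fn filter(&self, thoughts: Vec<Thought>) -> (Vec<Thought>, Vec<Thought>) {
        thoughts
            .into_iter()
            .partition(|t| t.salience >= self.threshold)
    }
}

/// Made-up input: two low-value thoughts and one worth an interruption.
pub fn sample_thoughts() -> Vec<Thought> {
    vec![
        Thought { source: "dmn", text: "old grocery list resurfaced", salience: 0.2 },
        Thought { source: "hippocampus", text: "similar bug seen in March", salience: 0.4 },
        Thought { source: "dmn", text: "possible deadline drift", salience: 0.9 },
    ]
}
```

With `AttentionGate { threshold: 0.7 }`, `filter(sample_thoughts())` surfaces only the deadline-drift thought and holds the other two back.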
01:24:18 dmn background cognition complete. found possible deadline drift
01:24:19 hippocampus recall triggered. 2 notes, 3 conversation matches, salience bias active
01:24:20 cortex delegation decision. spawned subagent: deployment audit
01:26:20 subagent progress report. checked install path, reviewing update path
01:26:21 attention notification reviewed. held for cortex decision
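
To make "each region is a real actor with its own clock and its own logs" concrete, here is a minimal sketch using only the Rust standard library. Each region is a thread that ticks at its own period and reports over a channel. The `LogLine` type, the periods, and the tick counts are assumptions chosen for the example. Lethe's real actor model may look quite different.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// One log entry emitted by a region. Field names are illustrative.
#[derive(Debug)]
pub struct LogLine {
    pub region: &'static str,
    pub tick: u32,
}

/// Spawn a region as its own thread with its own tick period.
/// It emits `ticks` log lines, then exits.
fn spawn_region(
    region: &'static str,
    period: Duration,
    ticks: u32,
    tx: mpsc::Sender<LogLine>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for tick in 1..=ticks {
            thread::sleep(period);
            // The receiver outlives the workers here, so a send error is ignored.
            let _ = tx.send(LogLine { region, tick });
        }
    })
}

/// Run three regions on different clocks and collect everything they log.
pub fn run_regions() -> Vec<LogLine> {
    let (tx, rx) = mpsc::channel();
    let handles = vec![
        spawn_region("cortex", Duration::from_millis(5), 4, tx.clone()),
        spawn_region("hippocampus", Duration::from_millis(10), 2, tx.clone()),
        spawn_region("dmn", Duration::from_millis(20), 1, tx.clone()),
    ];
    drop(tx); // the channel closes once every worker's sender is gone

    // Blocks until all workers have finished and dropped their senders.
    let lines: Vec<LogLine> = rx.iter().collect();
    for handle in handles {
        handle.join().expect("region thread panicked");
    }
    lines
}
```

`run_regions()` returns seven log lines: four from cortex, two from hippocampus, one from dmn. Their interleaving depends on thread scheduling, which is the point: no region waits on another's clock.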
A cognitive runtime.

A brain has parts. So does she.

Five brain regions, each on its own clock, each doing one job well. Closer to how a brain works than to anything else in this space.

Swap the brain, keep the person.

Her memory survives model swaps, reboots, and new hardware. Who she is isn't tied to any one weight set. Rebuild her tomorrow and she'll still remember today.
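
The idea of keeping memory outside the model can be sketched in a few lines. Everything here is hypothetical: the `Model` trait, `StubModel`, and `Agent` are names invented for this example, and the memory is an in-process `Vec`, so the sketch covers model swaps only, not reboots or new hardware.

```rust
/// Anything that can produce a reply. A stand-in for a real LLM backend.
pub trait Model {
    fn name(&self) -> &str;
    fn reply(&self, prompt: &str, memory: &[String]) -> String;
}

/// Fake backend that reports how much memory it was handed.
pub struct StubModel {
    pub label: String,
}

impl Model for StubModel {
    fn name(&self) -> &str {
        &self.label
    }

    fn reply(&self, prompt: &str, memory: &[String]) -> String {
        format!("[{}] {} ({} memories)", self.label, prompt, memory.len())
    }
}

/// Memory lives beside the model, not inside it.
pub struct Agent {
    memory: Vec<String>,
    model: Box<dyn Model>,
}

impl Agent {
    pub fn new(model: Box<dyn Model>) -> Self {
        Agent { memory: Vec::new(), model }
    }

    pub fn remember(&mut self, note: &str) {
        self.memory.push(note.to_string());
    }

    /// Replace the backend. The memory is untouched.
    pub fn swap_model(&mut self, model: Box<dyn Model>) {
        self.model = model;
    }

    pub fn ask(&self, prompt: &str) -> String {
        self.model.reply(prompt, &self.memory)
    }

    pub fn memory(&self) -> &[String] {
        &self.memory
    }

    pub fn model_name(&self) -> &str {
        self.model.name()
    }
}
```

After `remember(...)` on one model and `swap_model(...)` to another, `ask` is answered by the new model with the same memory slice passed in.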

One Rust binary. Yours.

~50 MB, statically linked, boots in milliseconds. One file you drop in, with none of the Python-and-container pinball to wire up first. Sits as a systemd service and swaps between Anthropic, OpenAI, OpenRouter, or local Gemma without touching anything else.

Two minutes to memory.

Rather not run it yourself? Try hosted Lethe, free for two weeks →

1. Install

One command. Works on macOS and Linux.

curl -fsSL https://lethe.gg/install | bash

2. Say hello

Message your bot on Telegram. From this point on, she remembers.

// you'll need

1. Build llama.cpp

git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
cmake -B build -DGGML_CUDA=ON && cmake --build build -j$(nproc)

2. Start the model server

Download a Gemma 4 31B GGUF and run:

llama-server --model gemma-4-31B-it-Q8_0.gguf \
  --split-mode tensor --jinja --reasoning-budget 4096 \
  --ctx-size 98304 --parallel 2 --flash-attn on -fit off \
  --port 8090

3. Install Lethe & configure

curl -fsSL https://lethe.gg/install | bash

# then set in .env:
LLM_PROVIDER=openai
LLM_API_BASE=http://localhost:8090/v1
OPENAI_API_KEY=local
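
If she doesn't answer after this step, a typo in `.env` is a likely suspect. The sketch below parses `KEY=VALUE` lines and reports which of the three keys shown above are missing. It is a standalone illustration, not Lethe's own config loader. The `parse_env` and `missing_keys` helpers are hypothetical, and treating all three keys as required is an assumption for the local setup.

```rust
use std::collections::HashMap;

/// Parse KEY=VALUE lines, skipping blank lines and `#` comments.
pub fn parse_env(text: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split at the first '=' only, so values may contain '=' themselves.
        if let Some((key, value)) = line.split_once('=') {
            vars.insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    vars
}

/// Return the keys from the local setup that are absent or empty.
pub fn missing_keys(vars: &HashMap<String, String>) -> Vec<&'static str> {
    // Assumption: these are the keys the local llama.cpp setup needs.
    const REQUIRED: [&str; 3] = ["LLM_PROVIDER", "LLM_API_BASE", "OPENAI_API_KEY"];
    let mut missing = Vec::new();
    for key in REQUIRED.iter().copied() {
        let present = vars.get(key).map_or(false, |v| !v.is_empty());
        if !present {
            missing.push(key);
        }
    }
    missing
}
```

Read the file with `std::fs::read_to_string(".env")`, pass the contents to `parse_env`, and an empty result from `missing_keys` means all three keys are set.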

Let us run her.
Same Lethe (the memory, the background thinking, the 3 AM drift), except she runs on our servers instead of yours. We keep her up; you just talk to her, in the browser or on Telegram. Free to start, $19.95 a month once she's earned it.

Free for two weeks. Keep her if she's worth it, walk if she isn't.