Install via docker compose
# docker-compose.yml
services:
  karakeep:
    image: ghcr.io/karakeep-app/karakeep:release
    container_name: karakeep
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:3000"
    volumes:
      - ./data:/data
    environment:
      DATA_DIR: /data
      NEXTAUTH_URL: https://karakeep.example.com
      NEXTAUTH_SECRET: ${NEXTAUTH_SECRET} # openssl rand -base64 36
      MEILI_ADDR: http://meilisearch:7700
      MEILI_MASTER_KEY: ${MEILI_MASTER_KEY}
      BROWSER_WEB_URL: http://chrome:9222
      # Optional: AI tagging (LLM provider)
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      OPENAI_BASE_URL: https://api.openai.com/v1 # or your LiteLLM (see /tutorials/litellm-llm-gateway-proxy.html)
      INFERENCE_TEXT_MODEL: gpt-5-mini
      INFERENCE_IMAGE_MODEL: gpt-5-mini
    depends_on: [ meilisearch, chrome ]
  meilisearch:
    image: getmeili/meilisearch:latest
    restart: unless-stopped
    environment:
      # Quoted so YAML passes a string, not a boolean
      MEILI_NO_ANALYTICS: "true"
      MEILI_MASTER_KEY: ${MEILI_MASTER_KEY}
    volumes:
      - ./meili-data:/meili_data
  chrome:
    image: gcr.io/zenika-hub/alpine-chrome:123
    restart: unless-stopped
    command:
      [
        chromium-browser,
        --headless,
        --no-sandbox,
        --disable-gpu,
        --disable-dev-shm-usage,
        --remote-debugging-address=0.0.0.0,
        --remote-debugging-port=9222,
        --hide-scrollbars,
      ]
docker compose up -d
docker compose logs -f karakeep
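The compose file reads NEXTAUTH_SECRET and MEILI_MASTER_KEY from the environment, typically a .env file next to docker-compose.yml. As a minimal sketch, the Python below generates values equivalent in shape to `openssl rand -base64 36`. The helper names make_secret and render_env are illustrative, not part of Karakeep.

```python
import base64
import secrets


def make_secret(num_bytes: int = 36) -> str:
    """Return num_bytes of OS randomness, base64-encoded (same shape as `openssl rand -base64 36`)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def render_env(names=("NEXTAUTH_SECRET", "MEILI_MASTER_KEY")) -> str:
    """Build .env file content with one fresh secret per variable name."""
    return "".join(f"{name}={make_secret()}\n" for name in names)


if __name__ == "__main__":
    # Print instead of writing a file, so an existing .env is never overwritten by accident.
    print(render_env(), end="")
```

Redirect the output into .env once, before the first `docker compose up -d`. Regenerating NEXTAUTH_SECRET later will likely invalidate existing login sessions.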
Reverse proxy
# Caddy
karakeep.example.com {
reverse_proxy 127.0.0.1:3000
}
First-run
Browse to the URL and register the first user, who becomes the admin. Once the first batch of users has been created, disable open signups in Settings.
How content is captured
When you save a URL, Karakeep runs these steps in the background:
- Crawl — fetch the page; if the URL is dynamic, the headless Chromium executes JavaScript first.
- Extract — pull the article text via a Mozilla Readability-style extraction.
- Screenshot — capture a full-page rendered screenshot for visual reference.
- OCR — for image-heavy pages or saved images, run OCR for searchability.
- Index — push extracted text + metadata into MeiliSearch (see the MeiliSearch tutorial) for fast full-text search.
- AI tag (optional) — if LLM keys are configured, generate tags + a one-line summary.
You now have the article text searchable even if the original site goes down. Click the screenshot in the UI for the rendered version.
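The ordering and the conditional steps above can be sketched as a small planning function. This is an illustration of the described flow only, not Karakeep's actual code; the Bookmark class, its flags, and the stage names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    url: str
    needs_javascript: bool = False  # hypothetical flag: page is rendered client-side
    image_heavy: bool = False       # hypothetical flag: mostly images, so OCR helps


def plan_capture(bookmark: Bookmark, llm_configured: bool) -> list:
    """Return the ordered stage names that would run for this bookmark."""
    steps = []
    if bookmark.needs_javascript:
        steps.append("render-js")   # headless Chromium executes JavaScript first
    steps += ["crawl", "extract", "screenshot"]
    if bookmark.image_heavy:
        steps.append("ocr")         # only worthwhile when text lives in images
    steps.append("index")
    if llm_configured:
        steps.append("ai-tag")      # optional: needs LLM keys
    return steps
```

For a plain article with no LLM keys configured, the plan reduces to crawl, extract, screenshot, index, which is why search works even with AI tagging turned off.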
Mobile apps + browser extensions
- Mobile (iOS + Android) — native apps. Share sheet integration: tap "Share" on any link from any app → Karakeep → saved.
- Browser extension — Chrome / Firefox / Safari extensions. Click the icon to save the current page; right-click any link to save without navigating.
- Web — paste any URL into the home page; same result.
- API — POST to /api/v1/bookmarks with a JSON payload from any script.
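A minimal sketch of the API route using only the Python standard library. The payload fields (`type`, `url`) and the Bearer-token header are assumptions based on how the API is commonly documented; confirm them against the API reference for your Karakeep version. The function names are illustrative.

```python
import json
import urllib.request


def build_save_request(base_url: str, api_key: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/bookmarks for one link."""
    # Assumed payload shape: a "link" bookmark identified by its URL.
    payload = json.dumps({"type": "link", "url": page_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/bookmarks",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme: API key as Bearer token
            "Content-Type": "application/json",
        },
    )


def save_bookmark(base_url: str, api_key: str, page_url: str) -> int:
    """Send the request and return the HTTP status code."""
    req = build_save_request(base_url, api_key, page_url)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status
```

Calling `save_bookmark("https://karakeep.example.com", api_key, "https://example.com/article")` from a cron job or shell wrapper gives the same result as the web form, provided the key was created for your user.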
Lists + tags
Bookmarks can be organized in:
- Lists — manual folders ("Articles to read on the plane")
- Tags — flat labels; auto-generated by the LLM if enabled, plus manual additions
- Smart lists — query-based; "all articles tagged 'machine learning' with no read marker"
The combination of LLM auto-tagging + smart lists is the magic; "show me everything I've ever saved about Rust async" works without you ever manually tagging anything.
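Conceptually, a smart list is a saved filter that is re-evaluated as bookmarks change. The sketch below models that idea on plain dictionaries; the field names (`tags`, `read`) are hypothetical and do not reflect Karakeep's schema or its query syntax.

```python
def smart_list(bookmarks, tag, unread_only=True):
    """Return bookmarks carrying `tag`, optionally only those not yet read."""
    return [
        b for b in bookmarks
        if tag in b["tags"] and not (unread_only and b["read"])
    ]


# Hypothetical sample data standing in for saved bookmarks.
saved = [
    {"title": "Attention basics", "tags": {"machine learning"}, "read": False},
    {"title": "Async Rust pitfalls", "tags": {"rust", "async"}, "read": False},
    {"title": "Gradient descent notes", "tags": {"machine learning"}, "read": True},
]

unread_ml = smart_list(saved, "machine learning")
```

Because the tags come from the LLM, a filter like this keeps matching new bookmarks without any manual filing.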
Highlights + notes
Per-bookmark: highlight text snippets, attach personal notes. Notes are markdown; highlights are stored alongside the source text for "what stood out from this article."
Auto-archive vs link-only
For sites that are unfriendly to crawlers (e.g. NYT paywall, Discord links), Karakeep stores the link only. For everything it can fetch, it fetches + archives.
For archived content, the saved view doesn't require re-fetching from the source. So when the original site dies, the article still lives in Karakeep.
Backups
Primary data lives under ./data/, with the search index in its own volume:
- SQLite database (or Postgres if configured) — bookmarks + metadata
- Asset storage — screenshots + extracted images + OCR text
- MeiliSearch index data (separate volume, ./meili-data) — rebuildable from primary data but worth backing up too
Restic on the data directory (see the Restic tutorial) covers the primary data; add ./meili-data to the backup set if you want to avoid a reindex after a restore. The MeiliSearch index will rebuild from the SQLite database if lost.
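Copying a SQLite file while the app is writing to it can produce an inconsistent copy. One option, sketched below, is to take a snapshot with SQLite's online backup API first and point the backup tool at the snapshot. The file name ./data/db.db is an assumption; check what your data directory actually contains.

```python
import sqlite3
from pathlib import Path


def snapshot_sqlite(src: Path, dest: Path) -> None:
    """Write a consistent copy of a live SQLite database using the online backup API."""
    # Open the source read-only so the snapshot can never modify the live database.
    source = sqlite3.connect(f"{src.resolve().as_uri()}?mode=ro", uri=True)
    target = sqlite3.connect(dest)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()
```

For example, `snapshot_sqlite(Path("./data/db.db"), Path("./backup/db-snapshot.db"))` could run just before the scheduled backup job.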
RSS feed import
For "save every new post from this blog automatically," configure RSS feeds; Karakeep polls them and auto-saves new entries. Pair with FreshRSS (see the FreshRSS tutorial) for "RSS reader for active reading, Karakeep for permanent archive."
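The polling idea amounts to "fetch the feed, keep only links not seen before, save those." A minimal sketch of that deduplication step for RSS 2.0 feeds follows; it is an illustration of the concept, not Karakeep's implementation, and it does not handle Atom feeds.

```python
import xml.etree.ElementTree as ET


def new_links(rss_xml: str, seen: set) -> list:
    """Return item links from an RSS 2.0 document that are not already in `seen`."""
    root = ET.fromstring(rss_xml)
    links = [(item.findtext("link") or "").strip() for item in root.iter("item")]
    return [link for link in links if link and link not in seen]


# Hypothetical feed content for illustration.
feed = """<rss version="2.0"><channel>
  <item><title>Old post</title><link>https://blog.example.com/old</link></item>
  <item><title>New post</title><link>https://blog.example.com/new</link></item>
</channel></rss>"""

fresh = new_links(feed, seen={"https://blog.example.com/old"})
```

Each poll adds the returned links to the seen set, so a post is archived once even if it stays in the feed for weeks.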
Karakeep vs alternatives
- Pocket — Mozilla's discontinued hosted service, and the closest hosted equivalent for "save for later with article extraction." Karakeep replaces it with a self-hosted setup.
- Wallabag — older PHP-based self-hosted equivalent. Less polished UI, no AI tagging, no full-page screenshots.
- raindrop.io — commercial hosted service; rich features; not self-hosted.
- linkding — lightweight self-hosted bookmark manager; just links + tags, no content archiving.
- ArchiveBox — CLI-first; deep archival (full WARC/PNG/PDF/screenshot per URL); heavier; less UI-friendly.
For "Pocket-shaped UX self-hosted, with content saved + LLM-tagged for free-text recall," Karakeep is the right pick in 2026.
The "research vault" workflow
Combine Karakeep with AnythingLLM (see the AnythingLLM tutorial) or your favorite LLM tool: export Karakeep's archived content into a RAG workspace, and the model can answer questions grounded in everything you've ever saved. Personal-knowledge-base level of useful, without leaking your reading history to a third party.