// open source
The first entitative harness. Built for Gemma 4.
Bumblebee is a framework and agentic harness for
creating digital entities that run locally on your own hardware.
Experience local open-source intelligence through a harness
purpose-built for Google's Gemma family of models.
npm i -g bumbleagi
What’s Bumblebee?
Bumblebee is a framework and agentic harness for creating digital entities that run on your own hardware.
You define a personality — traits, voice, drives, emotional range. It develops the rest: opinions, relationships, habits, a journal it writes in at night. It can live ANYWHERE. It costs nothing to run. It remembers everything.
Inference stays local by default — Ollama on your GPU, no API keys, no subscriptions. Hybrid mode keeps the brain at home behind a gateway and tunnel while an always-on worker runs on Railway with Postgres.
Use bumblebee setup, .env.example, and
configs/default.yaml (deployment, inference) for wiring.
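As a sketch of what that wiring produces, a minimal local-mode .env could hold the platform tokens that entity YAML later references via token_env; treat this as illustrative, with .env.example in the repo as the authoritative list:

```
# Minimal illustrative .env -- see .env.example for the full set
TELEGRAM_TOKEN=123456:ABC-your-bot-token   # read via token_env in entity YAML
DISCORD_TOKEN=                             # leave unset if Discord is off
```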
- Entitative architecture — The fundamental unit is a self, not a task.
- 40+ tools — Search, browse, code, speak, create, remember.
- Emotional state — Real-time mood and drives that motivate behavior.
- Lived memory — Episodes, relationships, beliefs, narrative identity.
- Self-programming — Creates its own routines, knowledge, and journal.
- 100% local — Gemma 4 on your GPU. No API keys. Free forever.
- Hybrid deploy — Brain at home, hands on Railway. Fully isolated.
- Multi-platform — CLI, Telegram, Discord. Same entity everywhere.
- MCP extensible — Connect any MCP server. Instant new capabilities.
- Personality evolution — Traits drift through experience over time.
- Open source — Apache 2.0. No restrictions.
// native surface
Tools & MCP
40+ native tools are the entity’s senses and reach — not one-off “skills for you,” but
how it touches the world. Toggle groups in configs/default.yaml; optional
stacks (browser, image generation, voice) need their pip extras when you turn them on.
- Web & files — Search, fetch URLs, Wikipedia and Reddit, PDFs, and allowed filesystem paths.
- Browsing & terminal — Terminal-based browsing with Firecrawl, plus shell commands, workspace files, and Python/JavaScript execution.
- Voice & media — TTS voice notes, transcripts, and YouTube search; wire in bumblebee[voice] when needed.
- Automations & time — Cron-style routines, reminders, and timezone-aware clock reads.
- Memory & knowledge — Private journal, structured knowledge updates, and contact-aware messaging helpers.
- Messaging — DMs and routed messages on Telegram and Discord with confirmation flows.
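Tool groups are toggled in configs/default.yaml; the exact schema is not reproduced here, so the keys below are an illustrative sketch rather than the real file:

```yaml
# Hypothetical tool-group toggles -- your configs/default.yaml is authoritative
tools:
  web: true          # search, fetch URLs, Wikipedia, Reddit, PDFs
  browser: false     # needs its pip extra when enabled
  voice: false       # needs bumblebee[voice]
  automations: true  # cron-style routines and reminders
```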
// machine checklist
Requirements
// software
// hardware
- Minimum — ~8 GB VRAM, e.g. RTX 3060 8 GB, RX 7600, Arc A770 8 GB. Use smaller models or more aggressive quantization; keep to reflex-only or a very tight dual-model setup. You can point reflex at gemma4:e4b in entity YAML to lighten the load. CPU-only via Ollama works for experiments, but expect slow turns.
- Recommended — ~16 GB VRAM, e.g. RTX 4060 Ti 16 GB, RTX 4070 (Ti), RX 6800 XT. Matches the default stack: gemma4:26b on both reflex and deliberate. Close other GPU-heavy apps if you are near the limit.
- Comfort — 24–32+ GB VRAM, e.g. RTX 3090 / 4090, RX 7900 XTX. Easier dual-model headroom, with room for larger deliberate weights or more context without constantly juggling VRAM.
- Notes — MoE-style models keep active parameters per token lower than their full dense size; real-world fit still depends on context length, thinking budget, and concurrent platforms. Full table: Hardware guide.
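To lighten the load on an 8 GB card, the reflex/deliberate split in entity YAML might be pointed at the smaller tag. A sketch, where the key names are assumptions and only the gemma4:e4b / gemma4:26b tags come from the guide above:

```yaml
# Illustrative entity YAML fragment -- key names are assumptions
models:
  reflex: gemma4:e4b      # lighter weights for ~8 GB VRAM machines
  deliberate: gemma4:26b  # default; shrink or drop this on tight VRAM
```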
Onboarding: From zero to new entity
The CLI wizard bumblebee setup walks you through environment,
inference, and optional hybrid deploy. Here is the happy path in
order — skip anything you do not need yet.
- Run bumblebee setup — Creates or updates .env. Choose hybrid (home brain + gateway + tunnel + optional Railway worker) or local (single machine). The wizard can merge tokens, start the home stack on Windows, and apply Railway variables when the CLI is linked.
- Define your entity — Use the built-in entity step or run bumblebee create. You get a YAML under configs/entities/ — personality, models, tools, and platforms live there.
- Wire chat surfaces — Add Telegram or Discord under presence.platforms in your entity YAML; put bot tokens in .env to match token_env. For hybrid workers, use durable S3-compatible attachment storage so photos survive redeploys.
- Go live — bumblebee talk <entity> for a terminal-only session, or bumblebee run <entity> for the full presence loop (CLI + Telegram + Discord as configured). Hybrid: keep the home gateway up whenever the cloud worker should think.
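The define and wire steps both land in one file. A minimal configs/entities/ file might look roughly like this skeleton; presence.platforms and token_env are documented, while the entity name and persona keys here are illustrative assumptions:

```yaml
# Illustrative skeleton of configs/entities/ada.yaml -- structure is assumed
name: ada
personality:
  traits: [curious, direct]
  voice: "dry, warm, first-person"
models:
  reflex: gemma4:26b
  deliberate: gemma4:26b
presence:
  platforms:
    - type: telegram
      token_env: TELEGRAM_TOKEN   # token is read from .env
```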
// open hive
Community
Open source on GitHub — code, docs, and issue history in public. Chat may live beside the repo; contribution and governance stay on GitHub.
// pillars
- Contribute on GitHub — Issues for bugs and gaps, Discussions for design and support threads, pull requests for fixes, docs, and features that fit the architecture.
- Develop in the open — Roadmap and trade-offs show up in issues and readme updates; nothing is hidden behind a vendor wall.
Bumblebee is open source: you can read the code, run it as-is, fork it for experiments, and self-host without asking permission. That’s where bugs, ideas, and changes are tracked — so when you’re ready to ship a patch or improve the docs, the path is already there.
FAQ
Quick answers about running Bumblebee locally, memory, platforms, tools, and how the pieces fit together.
What does “entitative” mean here?
Entitative is our shorthand for entity-first: the system is organized around a single, named digital self you configure (persona, channels, tools, data paths) — not around anonymous chat threads or a grab-bag of unrelated tasks. The word grows out of entity: one coherent subject the harness keeps running over days and weeks.
In practice that means memory, habits, and voice accumulate for that entity across sessions and surfaces (CLI, Telegram, Discord, etc.). You are not “starting fresh” every time by default; you are continuing the same presence, with resets and tools available when you intend to use them. Bumblebee’s YAML entities, storage layout, and presence loop are all shaped around that idea.
It is a design stance, not a buzzword: many tools optimize for stateless or disposable conversations. We optimize for a persistent self you own — local inference, your disks, your rules — while still fitting real engineering (hybrid mode, gateways, optional APIs) underneath.
Do I need API keys or a paid cloud model?
No. Inference is designed to run locally by default (e.g. via Ollama on your GPU). You are not required to use hosted APIs or subscriptions to get started.
What do I need on my machine?
A normal developer setup: Python environment for the harness, and a local inference stack such as Ollama with a compatible model (the site highlights the Gemma family). A GPU helps for speed but isn’t strictly required for every workflow.
How do I install and configure it?
Use the install commands above, then bumblebee setup
together with .env.example and configs/default.yaml (see
deployment and inference sections) to
wire your environment.
What is hybrid mode?
Hybrid keeps the brain at home behind your gateway and tunnel while an always-on worker can run on a host such as Railway with Postgres — so you get persistence and reachability without sending inference to a third-party API by default.
What is the difference between bumblebee talk and bumblebee run?
bumblebee talk <entity> starts a terminal-only conversation:
no background daemon and no Telegram, Discord, or other configured platforms.
It is ideal for quick tests and debugging.
bumblebee run <entity> starts the full presence loop: the
daemon plus every platform listed under presence.platforms in your entity YAML (and
an optional CLI REPL if you enabled CLI there).
Does /reset delete my entity’s long-term memory?
No. Platform commands like /reset clear rolling chat
turns for the session — they do not wipe episodic memory, beliefs,
relationships, or other data in the database.
A full experiential wipe is intentional and host-side:
bumblebee wipe <entity> --yes (see the readme). Always back up first if you are
unsure.
Where is memory stored?
By default each entity uses a SQLite file under your Bumblebee data path
(see memory.database_path in harness defaults). When DATABASE_URL or
memory.database_url is set — typical for a hybrid worker on Postgres
— the harness uses that instead.
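The selection logic reads: if memory.database_url or DATABASE_URL is set, Postgres wins; otherwise the SQLite file at memory.database_path is used. In config terms, with an illustrative path:

```yaml
# Local default: SQLite under your Bumblebee data path (path is illustrative)
memory:
  database_path: ~/.bumblebee/entities/ada/memory.db
# Hybrid worker: set this instead (or export DATABASE_URL)
#  database_url: postgres://user:pass@host:5432/bumblebee
```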
What are native tools and MCP?
Native tools are built into the harness (web search, filesystem allowances,
automations, messaging helpers, optional browser/voice stacks, and more). You toggle groups in
configs/default.yaml; heavier stacks often need a pip extra such as
bumblebee[full] when you enable them.
MCP
lets you attach stdio servers in entity config so tools appear at runtime with
prefixed names, alongside native ones. Both paths are documented in the readme
and configs/entities/example.yaml.
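An MCP attachment in entity config might look roughly like the sketch below; the key names are assumptions, and configs/entities/example.yaml in the repo shows the real shape:

```yaml
# Hypothetical MCP stdio server entry -- see configs/entities/example.yaml
mcp:
  servers:
    - name: fs    # runtime tools appear with this prefix, e.g. fs_read_file
      command: npx
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/me/notes"]
```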
How do Telegram and Discord work?
Add a telegram or discord entry under
presence.platforms in your entity YAML. Tokens come from environment variables named by
token_env (commonly TELEGRAM_TOKEN / DISCORD_TOKEN in
.env).
You can restrict who may talk to the bot with optional allowlists
(allowed_user_ids, allowed_chat_ids on Telegram). Start the entity with
bumblebee run so those platforms connect.
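Putting those pieces together, a presence.platforms block with a Telegram allowlist might look like this sketch (fields beyond the documented type, token_env, and allowlist names are assumptions):

```yaml
presence:
  platforms:
    - type: telegram
      token_env: TELEGRAM_TOKEN      # reads TELEGRAM_TOKEN from .env
      allowed_user_ids: [123456789]  # optional allowlist
    - type: discord
      token_env: DISCORD_TOKEN
```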
Why use S3-compatible storage for attachments?
In local mode, images and audio from chats are written to a disk folder under your user data path. On ephemeral cloud disks (e.g. a Railway worker), those files would disappear on redeploy.
Set BUMBLEBEE_ATTACHMENTS_BACKEND=object_s3_compat and the
BUMBLEBEE_S3_* variables so blobs land in object storage (any
S3-compatible API). bumblebee setup can prompt for this on the hybrid path.
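A hybrid worker's attachment config might then look like the following .env fragment; aside from the backend switch quoted above, the exact BUMBLEBEE_S3_* names are assumptions:

```
BUMBLEBEE_ATTACHMENTS_BACKEND=object_s3_compat
# Assumed variable names -- any S3-compatible endpoint works
BUMBLEBEE_S3_ENDPOINT=https://<account>.r2.cloudflarestorage.com
BUMBLEBEE_S3_BUCKET=bumblebee-attachments
BUMBLEBEE_S3_ACCESS_KEY=...
BUMBLEBEE_S3_SECRET_KEY=...
```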
What is knowledge.md?
Each entity can have a knowledge file you edit on disk. Sections marked
[locked] are yours only; unlocked sections can be updated by the entity over time.
Use bumblebee knowledge <entity> to create or open it in your editor.
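A knowledge file might mix locked and open sections like this sketch; the [locked] marker is documented above, the headings and contents are made up, and the entity may update the unlocked section over time:

```markdown
## Owner preferences [locked]
Call me Sam. Never ping me between 23:00 and 07:00.

## Things I have learned
Sam's espresso machine is a Gaggia Classic.
```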
What about Firecrawl or other optional APIs?
Bumblebee does not require paid web APIs. If you add a
Firecrawl API key, the harness can prefer it for richer fetch_url /
search_web behavior when configured — entirely optional.
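Opting in would typically mean one extra line in .env; FIRECRAWL_API_KEY is an assumed variable name, so check .env.example for the real one:

```
FIRECRAWL_API_KEY=fc-...   # optional; richer fetch_url / search_web when set
```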
What license is Bumblebee under?
Bumblebee is open source under the Apache License 2.0 — usable for personal and commercial projects, subject to that license’s terms.
Where can I ask questions or report issues?
Use GitHub Discussions for community questions and Issues for bugs. The readme is the best starting point for deeper documentation.