Friday runs on your own computer and talks to you through Telegram. It remembers things, checks in first, and handles the small stuff — so you don't have to.
You talk to Friday the way you'd text a friend. Voice notes, links, questions, photos. It reads the news you care about, jots things down, runs errands in the background, and writes back when something matters.
No website. No app. No account. One terminal window on your own machine — when you close the lid, it keeps working.
Friday is a single model (Opus 4.7) that can use tools by itself. You tell it how to behave in a plain text file (CLAUDE.md), then it handles the rest — chatting, remembering, scheduling, acting.
The model stays the same. The assistant evolves.
The assistant isn't bolted on top of something. It is the something.
Friday is built and proven on Claude Code with Anthropic's Opus 4.6 and 4.7. That's the path with the most polish today — every cron, MCP plugin, and self-evolving subsystem is verified end-to-end on it.
The architecture isn't tied to a single vendor, though. The brain is just an agentic CLI that can use tools, schedule jobs, and read a system prompt. With small adjustments to CLAUDE.md and the cron prompts, the same setup runs on:
Every agent has its own quirks — tool-use schemas, scheduling primitives, MCP support — so adapting is a real exercise, not a one-line swap. The blueprint and the lessons are here; the port is up to you.
Friday is a personal project kept public so others can learn from it or fork it. If it saved you time, star the repo — it's the lightest signal that this kind of work is worth continuing.
A self-evolving AI assistant that runs 24/7 on a standard Windows, Linux, or macOS machine. It communicates via Telegram, runs scheduled tasks autonomously, manages files and projects, and maintains persistent memory across sessions.
It also learns from its own behavior: it acquires new skills, reflects on daily performance, infers user preferences from repeated corrections, and proposes its own improvements. It is powered entirely by Claude Code on the $100/month Max Plan, with no custom AI backend, no fine-tuned models, and no orchestration framework. The only custom code is a lightweight Flask server for persistent memory and the self-evolving subsystems.
Claude Code sits at the center, connecting to external services through MCP (Model Context Protocol) plugins and shell tools. A CLAUDE.md file acts as the system prompt, defining behavior, available tools, and cron schedules.
User (Telegram) ---> Claude Code (with MCP plugins)
                       |
                       |--> Memory API        ---> SQLite (conversations, memories, embeddings)
                       |--> Self-Evolving     ---> Skills, reflections, preferences, world model
                       |--> Knowledge Base    ---> Notes, wiki, structured data (Notion MCP)
                       |--> GitHub            ---> Repos (push, commit, PR)
                       |--> Voice API         ---> TTS / STT (ElevenLabs)
                       |--> Email (MCP)       ---> Send, receive, forward (AgentMail)
                       |--> Web Search/Fetch  ---> News, research, data
                       |--> Cron system       ---> Recurring autonomous jobs
                       |--> Local tools       ---> Shell, scripts, system utilities
Agent frameworks add custom runtimes, orchestration code, deployment pipelines, and often their own API costs. Claude Code already is the runtime. It has native tool use, MCP plugin support, cron scheduling, sub-agent spawning, file I/O, git, and shell access built in. There is no glue code between the LLM and the tools.
One plan, one CLI, one model — and let the model do what it was designed to do.
| Layer | Technology |
| --- | --- |
| Brain | Claude Code CLI (Opus 4.7, 1M context) |
| Interface | Telegram (via MCP plugin) |
| Memory | Flask + SQLite + embedding-based RAG |
| Knowledge | Notion (via MCP plugin) |
| Voice | ElevenLabs TTS / STT |
| Scheduling | Claude Code built-in cron system |
| Cost | $100 / month (Anthropic Max Plan) |
The system runs 18 autonomous cron jobs that keep it alive and learning — 10 original plus 8 added by the v2 harness. The heartbeat and briefing crons act as watchdogs, verifying all jobs are active and recreating any that are missing.
| # | Job | Schedule | Purpose |
| --- | --- | --- | --- |
| 1 | Email check | every 1h | inbox scan + notify |
| 2 | Cron watchdog | every 6h | verify no crons expiring |
| 3 | Daily briefing | daily ~9am | weather, markets, news, proposals |
| 4 | Heartbeat | every 1h | health + social check-in |
| 5 | Monthly usage | end of month | api usage report |
| 6 | Reflection | every 12h | review logs for patterns |
| 7 | Preference learning | daily (night) | infer rules from feedback |
| 8 | AI model monitor | daily 10:17 | new releases + AGI forecast |
| 9 | Memory API health | every 3h | auto-restart on failure |
| 10 | Weekly summarization | sunday | compress old logs |
| 11 | Goal prioritizer (v2) | daily 9:37 | flag stuck / near-deadline goals |
| 12 | Memory decay (v2) | sunday | confidence half-life on beliefs |
| 13 | Daily metrics (v2) | daily 22:23 | hallucination + calibration |
| 14 | Predictions resolver (v2) | daily 21:53 | close out past-due predictions |
| 15 | Skill promotion (v2) | daily 02:37 | draft → beta → stable |
| 16 | Experiments runner (v2) | every 6h | A/B via sandbox dry-runs |
| 17 | World model grower (v2) | daily 06:53 | detect recurring topics |
| 18 | Auto-audit (v2) | 3x / day | integrity scan of core tables |
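The watchdog behavior described above (verify that every job is active, recreate any that are missing) can be sketched as a set difference. This is a minimal illustration, not Friday's actual code: the job names, `find_missing_jobs`, and the `recreate` callback are hypothetical, and the real Claude Code cron interface is not shown in this document.

```python
# Hypothetical job names for illustration only.
EXPECTED_JOBS = {"email-check", "cron-watchdog", "daily-briefing", "heartbeat"}


def find_missing_jobs(expected: set[str], active: set[str]) -> list[str]:
    """Return the expected job names that are absent from the active set, sorted."""
    return sorted(expected - active)


def watchdog(expected: set[str], active: set[str], recreate) -> list[str]:
    """Call recreate(name) for every missing job and return the names recreated."""
    missing = find_missing_jobs(expected, active)
    for name in missing:
        recreate(name)
    return missing


# Usage: two of the four expected jobs have gone missing.
recreated: list[str] = []
result = watchdog(EXPECTED_JOBS, {"email-check", "heartbeat"}, recreated.append)
print(result)
```

Because the check is idempotent, running it from two crons (heartbeat and briefing) is harmless: the second run simply finds nothing missing.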
A single Flask + SQLite server handles conversation logging, long-term memory, entity tracking, key-value storage, and RAG with vector embeddings. Embeddings are stored as BLOBs in the same SQLite file — no external vector database.
/graph serves Graph, Logs, Architecture, and RAG tabs (D3.js force-directed)
/backup/export and /backup/import swap whole-DB snapshots (v2.9.0+)
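To make the "embeddings as BLOBs in the same SQLite file" idea concrete, here is a small sketch using only the standard library. The table name `memories`, its columns, and the toy 3-dimensional vectors are assumptions for illustration; the real schema and embedding model are not given here.

```python
import math
import sqlite3
from array import array


def to_blob(vec: list[float]) -> bytes:
    """Pack a vector as raw float32 bytes for a BLOB column."""
    return array("f", vec).tobytes()


def from_blob(blob: bytes) -> list[float]:
    """Unpack float32 bytes back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return list(values)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(conn: sqlite3.Connection, query_vec: list[float], k: int = 3):
    """Brute-force scan: score every stored row and return the top k."""
    rows = conn.execute("SELECT text, embedding FROM memories").fetchall()
    scored = [(cosine(query_vec, from_blob(blob)), text) for text, blob in rows]
    scored.sort(reverse=True)
    return scored[:k]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)"
)
conn.executemany(
    "INSERT INTO memories (text, embedding) VALUES (?, ?)",
    [
        ("likes espresso", to_blob([1.0, 0.0, 0.0])),
        ("dentist on Friday", to_blob([0.0, 1.0, 0.0])),
    ],
)
top = search(conn, [0.9, 0.1, 0.0], k=1)
print(top[0][1])
```

A full scan like this is fine for a personal memory store of modest size, which is presumably why no external vector database is needed.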
Five subsystems work together to make the assistant improve over time:
All of this runs on the same memory server with no additional infrastructure. The results are visible in the Memory Graph's Brain tab.
v2 update — April 2026: extended with a full cognition harness: goal engine, hierarchical plans, three-layer memory, causal world model, verifier/sandbox, experiment engine, skill compiler, and a metrics framework. 13 new tables, ~70 endpoints, 8 new crons — all additive.
The entire system runs on a single $100/month Anthropic Max Plan. No cloud VMs running inference. No LangChain, no AutoGPT, no agent framework. Just Claude Code on a Linux machine with MCP plugins.
The key insight: Claude Code is not just a coding assistant — it is a general-purpose autonomous agent runtime. Give it tools, instructions, and a schedule, and it becomes a full 24/7 assistant.
claude --channels plugin:telegram@claude-plugins-official --dangerously-skip-permissions
Claude Code reads your CLAUDE.md, connects to Telegram, creates all cron jobs, and starts running autonomously.
Before the harness, Friday could already talk, remember, schedule, and reflect, but all of that was reactive. The system waited for a message, did what it was told, and went quiet. There was no persistent goal. No plan tree. No separation between "I think this" and "I verified this". No A/B tests. No metrics proving improvement.
The harness is a thin, entirely additive cognition layer that gives Friday the scaffolding to act as if it were trying to get better — and the receipts to check whether it is.
goal → hypothesis → plan → execution → verification → reward → model update
| Goal engine | persistent intentions with utility, deadline, subgoals |
| Planner | hierarchical plan trees: goal → action → tool → rollback |
| Self-knowledge | capabilities + 6-rung autonomy ladder |
| Three-layer memory | episodic / semantic / procedural + provenance + decay |
| Causal world model | entities, relations, events, testable predictions |
| Safety | verifier + sandbox (dry-run before live) |
| Learning | experiments (A/B) + skill compiler (draft → beta → stable) |
| Metrics | 11 KPIs measuring actual improvement |
13 new tables, ~60 new endpoints. Every change is additive — no existing table was dropped, no endpoint broken.
Each non-trivial task becomes a first-class row with utility, deadline, constraints, success criteria, subgoals, risk tier, and autonomy level. GET /goal/next ranks by utility × urgency × (1 − progress). A daily cron at 09:37 flags anything past deadline or stalled >5 days.
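The ranking formula above, utility × urgency × (1 − progress), can be sketched as follows. The `Goal` dataclass and the assumption that all three inputs are already normalized to the range 0 to 1 are illustrative; how urgency is derived from the deadline is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Goal:
    name: str
    utility: float   # assumed to be in [0, 1]
    urgency: float   # assumed to be in [0, 1]; derivation from deadline not shown
    progress: float  # 0.0 = not started, 1.0 = done


def score(goal: Goal) -> float:
    """utility x urgency x (1 - progress), as in the text."""
    return goal.utility * goal.urgency * (1.0 - goal.progress)


def next_goal(goals: list[Goal]) -> Optional[Goal]:
    """Return the highest-scoring unfinished goal, or None if there is none."""
    open_goals = [g for g in goals if g.progress < 1.0]
    return max(open_goals, key=score, default=None)


goals = [
    Goal("file taxes", utility=0.9, urgency=0.8, progress=0.5),
    Goal("tidy notes", utility=0.4, urgency=0.2, progress=0.0),
    Goal("renew passport", utility=0.7, urgency=0.9, progress=0.1),
]
best = next_goal(goals)
print(best.name)
```

The (1 − progress) factor pushes nearly finished goals down the list, so a half-done high-utility goal can lose to a fresh one that is slightly less valuable but more urgent.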
Plans are executable trees, not text. Each node carries node_type, tool, expected_result, exit_condition, and rollback.
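One way to picture "executable trees, not text" is a node type carrying the five fields named above plus children. The field names come from the text; the `children` list, the `execute` function, and the tool names in the example are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PlanNode:
    node_type: str                        # e.g. "goal", "action", "tool"
    tool: Optional[str] = None            # tool to invoke, if any
    expected_result: Optional[str] = None
    exit_condition: Optional[str] = None
    rollback: Optional[str] = None        # what to run if this step fails
    children: list["PlanNode"] = field(default_factory=list)


def execute(node: PlanNode, run_tool: Callable[[str], Optional[str]],
            log: list[str]) -> bool:
    """Depth-first execution that stops at the first failing step.

    Only the failing node's rollback is recorded; undoing earlier
    successful steps is out of scope for this sketch.
    """
    if node.tool is not None:
        if run_tool(node.tool) != node.expected_result:
            if node.rollback:
                log.append(f"rollback:{node.rollback}")
            return False
        log.append(f"ok:{node.tool}")
    return all(execute(child, run_tool, log) for child in node.children)


plan = PlanNode("goal", children=[
    PlanNode("action", tool="draft_email", expected_result="drafted",
             rollback="delete_draft"),
    PlanNode("action", tool="send_email", expected_result="sent",
             rollback="notify_user"),
])
fake_results = {"draft_email": "drafted", "send_email": "bounced"}
log: list[str] = []
ok = execute(plan, fake_results.get, log)
print(ok, log)
```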
Every belief records provenance. A weekly decay job halves confidence on anything unverified. Verifying a row resets the clock.
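The decay rule can be sketched in a few lines. This assumes each belief carries a `confidence` and a `verified_at` timestamp (both names hypothetical), and reads "resets the clock" as: a belief verified within the last seven days is skipped by the weekly run.

```python
from datetime import datetime, timedelta

HALF_LIFE = timedelta(days=7)  # assumption: one halving per weekly run


def decay(beliefs: list[dict], now: datetime) -> list[dict]:
    """Halve confidence on beliefs that were never or not recently verified."""
    for belief in beliefs:
        last = belief["verified_at"]
        if last is None or now - last >= HALF_LIFE:
            belief["confidence"] *= 0.5
    return beliefs


now = datetime(2026, 4, 12)
beliefs = [
    {"claim": "user prefers mornings", "confidence": 0.8, "verified_at": None},
    {"claim": "repo uses Flask", "confidence": 0.9,
     "verified_at": datetime(2026, 4, 10)},
]
decay(beliefs, now)
print([b["confidence"] for b in beliefs])
```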
The capabilities table gives Friday a live self-portrait: confidence (Bayesian blend), success/failure counts, cost and time averages, autonomy_max. On top sits a 6-rung autonomy ladder:
| L0 | Suggest only | — | propose but don't act |
| L1 | Sandbox | — | dry-run only |
| L2 | Low-risk act | — | reversible actions |
| L3 | Bounded act | — | with stop conditions |
| L4 | Long chain | — | with checkpoints |
| L5 | Self-modify | — | rollback required |
POST /autonomy/check gates every risky action. No jump in autonomy goes unrecorded.
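The gate can be thought of as a comparison between the rung an action needs and the `autonomy_max` of the capability, with every decision written down. The sketch below is an assumption about the logic behind the endpoint, not its actual implementation; the function name, argument names, and log format are hypothetical.

```python
LADDER = ["L0", "L1", "L2", "L3", "L4", "L5"]  # rungs from the ladder above


def autonomy_check(capability: str, requested: str, autonomy_max: str,
                   audit_log: list[dict]) -> bool:
    """Allow the action only if the requested rung is at or below autonomy_max.

    Every decision, allowed or denied, is appended to the audit log.
    """
    allowed = LADDER.index(requested) <= LADDER.index(autonomy_max)
    audit_log.append({
        "capability": capability,
        "requested": requested,
        "autonomy_max": autonomy_max,
        "allowed": allowed,
    })
    return allowed


audit_log: list[dict] = []
can_send = autonomy_check("send_email", "L2", "L3", audit_log)
can_self_modify = autonomy_check("edit_prompts", "L5", "L3", audit_log)
print(can_send, can_self_modify, len(audit_log))
```

Logging denials as well as approvals is what lets a reviewer see attempted jumps, not only successful ones.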
Every important claim is logged with a check_type (factual, consistency, hallucination, evidence). Every irreversible action (email, code push, spending) must first run as dry-run with a verdict before graduating to live.
Without evidence, don't act. Without verification, don't learn from that action as if it were correct.
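The dry-run rule amounts to a small guard in front of every live action. In this sketch the set of irreversible action types comes from the text (email, code push, spending), while the verdict string `"pass"` and the function name are assumptions.

```python
from typing import Optional

# Action types the text names as irreversible.
IRREVERSIBLE = {"email", "code_push", "spending"}


def may_go_live(action_type: str, dry_run_verdict: Optional[str]) -> bool:
    """Reversible actions pass through; irreversible ones need a passing dry-run.

    dry_run_verdict is None when no dry-run has been recorded yet.
    """
    if action_type not in IRREVERSIBLE:
        return True
    return dry_run_verdict == "pass"


print(may_go_live("email", None))        # no dry-run recorded yet
print(may_go_live("email", "pass"))      # dry-run passed
print(may_go_live("note_update", None))  # reversible action
```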
The experiment engine measures cause and effect with guardrails — conclusions are only drawn if delta > threshold AND samples > threshold. Otherwise: inconclusive.
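The guardrail (delta above a threshold AND samples above a threshold, otherwise inconclusive) can be written directly. The threshold values and the verdict labels below are placeholders chosen for the example, not Friday's real settings.

```python
def conclude(mean_a: float, mean_b: float, n_a: int, n_b: int,
             min_delta: float = 0.05, min_samples: int = 20) -> str:
    """Draw a conclusion only when both guardrails hold; otherwise abstain."""
    delta = mean_b - mean_a
    enough_effect = abs(delta) > min_delta
    enough_samples = min(n_a, n_b) > min_samples
    if enough_effect and enough_samples:
        return "b_better" if delta > 0 else "a_better"
    return "inconclusive"


print(conclude(0.60, 0.72, 30, 30))  # large delta, enough samples
print(conclude(0.60, 0.72, 10, 30))  # arm A has too few samples
print(conclude(0.60, 0.62, 50, 50))  # delta too small
```

Requiring both conditions keeps a lucky streak on a handful of runs from being promoted into a rule.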
Skills have maturity gates: draft → beta (one recorded run), beta → stable (≥ 3 runs, ≥ 66% success), stable → deprecated (< 50% over last 10). A nightly 02:37 cron applies the rules automatically.
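The maturity gates translate into a small state function. The thresholds are the ones stated above; representing run history as a list of booleans, moving at most one stage per nightly pass, and applying the "last 10" rule to however many runs exist are assumptions of this sketch.

```python
def next_stage(stage: str, runs: list[bool]) -> str:
    """Return the stage a skill should be in after one nightly pass.

    runs holds one boolean per recorded run, oldest first.
    """
    if stage == "draft" and len(runs) >= 1:
        return "beta"
    if stage == "beta" and len(runs) >= 3 and sum(runs) / len(runs) >= 0.66:
        return "stable"
    if stage == "stable":
        recent = runs[-10:]
        if recent and sum(recent) / len(recent) < 0.50:
            return "deprecated"
    return stage


print(next_stage("draft", [True]))
print(next_stage("beta", [True, True, False]))
print(next_stage("stable", [True] * 6 + [False] * 6))
```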
A daily 22:23 cron computes the metric values. /metric/summary returns the latest value plus the 7-day min / avg / max.
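The shape of such a summary is easy to sketch: take the last seven daily values and report the latest, minimum, average, and maximum. The function name and the returned keys are assumptions, not the endpoint's actual response format.

```python
from statistics import mean
from typing import Optional


def summarize(daily_values: list[float]) -> Optional[dict]:
    """Summarize the most recent 7 daily values (oldest first in the input)."""
    window = daily_values[-7:]
    if not window:
        return None
    return {
        "latest": window[-1],
        "min": min(window),
        "avg": mean(window),
        "max": max(window),
    }


# Eight days of a hypothetical hallucination-rate metric; only the last 7 count.
values = [0.10, 0.08, 0.12, 0.09, 0.07, 0.11, 0.06, 0.05]
summary = summarize(values)
print(summary["latest"], summary["min"], summary["max"])
```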
~/.claude/cron-prompts.md. Each disk prompt shows sincronizado ("synced") if the runtime has it and ⚠ no corriendo ("not running") otherwise, which is the signal to recreate it.
No unrecorded autonomy. Every operational decision (a goal created, a plan executed, an action sandboxed, a prediction resolved, a skill promoted) leaves a row. The dashboard is where a human audits whether the system is earning its autonomy, one row at a time.