Build Your Own Agentic OS: Phone, Pi, or MacBook in 2026
Three Claude Code stacks compared — phone via web UI, Raspberry Pi headless, MacBook power-user. Pick the one that fits your workflow.
Simon Willison spent the last week of April writing software while camping. Not on a laptop — on his iPhone. The implementation for his new Agentic Engineering Patterns guide was, in his own words, "almost all written by Claude Opus 4.6 running in Claude Code for web — accessed via my iPhone." That detail buried in his Substack post is the most important sentence of the week, because it ends a debate the agentic-IDE crowd has been hedging for two years: do you actually need a desktop development environment to ship serious software in 2026?
The answer, on Sunday May 3rd, 2026, is no. You need a Claude Code plan, somewhere to run an agent loop, and a clear idea of which tier of pain you want to optimize for. That's the whole stack.
This article is a decision tree. We'll walk through the three setups that have crystallized in the last 30 days — phone, Pi, MacBook — explain who each one is for, where the rate-limit cliffs live, and what the community has actually shipped on each. By the end you should know which DIY agentic OS to build this weekend, and which two to skip.
The four pillars of an agentic OS — what we're actually building
Before the hardware tier, the software tier. Three different creators converged on the same definition this week.
Simon Scrapes' hour-long walkthrough frames an agentic OS as four capabilities: persistent memory that survives between sessions, self-improving skills whose outputs get better based on past runs, scheduled workflows that run on a timer, and shared business context as a single source of truth. Chase AI compresses this into three steps: skills, memory, scheduler. MindStudio's writeup adds the most useful primitive of all — a learnings.md file attached to each skill, where Claude appends what it noticed about output quality after every run, and reads first before the next run. That's the entire self-improvement mechanism. No vector store, no fine-tuning, no agent framework. A markdown file.
The simplest possible agentic OS is: a directory of skills, a MEMORY.md, a learnings.md per skill, and a cron job. Everything else is decoration.
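That definition is small enough to scaffold in a few lines. A minimal sketch in Python, where the directory names and skill names are illustrative examples, not a convention mandated by any tool:

```python
from pathlib import Path

# Illustrative layout: a directory of skills, a shared MEMORY.md,
# a learnings.md per skill, and a cron line. Names are examples only.
root = Path("agentic-os")
skills = root / "skills"

for skill in ["daily-briefing", "pr-summary"]:
    skill_dir = skills / skill
    skill_dir.mkdir(parents=True, exist_ok=True)
    # Each skill keeps its own learnings.md: appended to after every run,
    # read first before the next run. That's the self-improvement loop.
    (skill_dir / "learnings.md").touch()
    (skill_dir / "SKILL.md").write_text(f"# {skill}\n\nInstructions go here.\n")

# Persistent memory that survives between sessions.
(root / "MEMORY.md").touch()

# The scheduler is plain cron. A line like this (added via `crontab -e`)
# runs one skill daily at 6pm; -p is Claude Code's non-interactive mode.
cron_line = "0 18 * * * claude -p 'Run the daily-briefing skill' >> ~/agent.log 2>&1"
print(cron_line)
```

Everything after this, on any of the three tiers, is variations on where this directory lives and what fires the cron line.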
That definition is what the three hardware tiers below are all running. The interesting question isn't what you're running — it's where it runs when you're not babysitting it.
Tier 1: The Phone — Claude Code Web UI
Who it's for: Solo builders, side-project tinkerers, people who travel, anyone whose laptop time is rationed by family or job.
The proof point: Willison shipped iNaturalist photo integration into his blog timeline from his iPhone — a non-trivial feature involving image processing, scheduled jobs, and his existing PostgreSQL schema. He did it while running errands. Same pattern: a Git scraper to publish GitHub Actions workflow versions, written between stops.
The mechanism: Claude Code for web is Anthropic's asynchronous coding agent — their answer to OpenAI's Codex Cloud and Google's Jules. You give it a repo and a task. It runs in a sandbox, writes code, runs tests, commits, opens a PR. You read the PR on your phone, comment, merge. You never touched a terminal.
The trick: The killer feature isn't the editing surface (it's still cramped on a phone screen). It's that you can dispatch work and walk away. Builder.io's writeup frames it as "fire-and-forget development" — you stage 4-5 tasks before bed, wake up to 4-5 PRs to review with coffee. The phone's small screen forces you into the right habit: clear task descriptions, atomic changes, agent-driven verification.
Where it breaks: Anything that requires running a long-lived local service (a Postgres instance, a webhook receiver, a cron daemon) is out of scope. The web sandbox spins up per-task and doesn't persist state. You also lose access to your local secrets and the dotfiles you've spent years tuning. If your mental model of "coding" includes tmux, this tier will feel claustrophobic.
Cost: Claude Code Pro at $20/month covers it. Heavy users will hit the weekly cap and either upgrade to Max ($100 or $200/month) or pair with the rate-limit workaround we'll describe in Tier 3.
Verdict: If you ship features measured in hours (small CRUD additions, content automation, blog tooling, scrapers), the phone tier is genuinely sufficient. A year ago this sentence would have read like a joke. It doesn't anymore.
Tier 2: The Raspberry Pi — Always-On Headless Agent
Who it's for: Tinkerers who want a 24/7 personal automation backbone — content pipelines, monitoring, scheduled scrapes, smart-home glue. Anyone whose laptop closes at night but whose work shouldn't.
The proof point: David Ondrej's Pi Agent is a self-modifying agent built on top of OpenClaw, running on an $80 single-board computer. The Raspberry Pi Foundation officially endorsed the pattern in February: "always-on, energy efficient, quietly doing in the background." Armin Ronacher's writeup — Pi being Mario Zechner's minimal agent that powers OpenClaw — gives you the architectural skeleton.
The mechanism: A Pi 5 (8GB RAM is the practical minimum) running Raspberry Pi OS Lite, no desktop. SSH in to set it up, then leave it. Install OpenClaw with openclaw setup --headless. Paste your Anthropic API key. The Pi becomes an orchestration layer: it doesn't run LLMs locally — it manages tool calls, channel integrations (Telegram, Discord, Slack), and workflow automation, while delegating reasoning to the Anthropic API.
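The orchestration idea is simple enough to sketch. This is not OpenClaw's actual code, just a minimal loop under the same assumptions, with reason() standing in for the Anthropic API call:

```python
import subprocess

def reason(task: str) -> list[str]:
    # Stand-in for the Anthropic API call. A real deployment would send
    # `task` plus memory and skill context to the API and get back a
    # tool invocation to execute. Echoing keeps the sketch runnable.
    return ["echo", f"would handle: {task}"]

def handle(task: str) -> str:
    # The Pi never runs a model locally; it executes what the remote
    # model decides, then reports the result back to the chat channel.
    result = subprocess.run(reason(task), capture_output=True, text=True)
    return result.stdout.strip()

print(handle("diff the competitor pricing page against last week"))
```

The real system adds channel listeners and state, but the division of labor is exactly this: reasoning in the cloud, execution on the board.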
The hardware checklist (per Fast.io's deployment guide):
- Pi 5, 8GB RAM
- NVMe SSD via the official M.2 HAT (under $30) — SD cards bottleneck on OpenClaw's frequent SQLite writes
- Active cooling (the fan kit, not just the heatsink)
- A wired ethernet drop, not WiFi, if you want it stable for months
The cost: ~$120 in hardware once. Roughly $6-8/year in electricity at typical US rates (the Pi 5 draws ~6W under typical load, about 53 kWh/year). Anthropic API costs are pay-as-you-go and depend entirely on what you have it doing — most personal-automation use cases run $5-30/month.
Where it breaks: The Pi can't run anything compute-heavy locally. If you want local-LLM inference, you need a Mac mini or a dedicated workstation — not this tier. SD cards are also prone to filesystem corruption on power loss, which is why everyone now boots from NVMe on the M.2 HAT.
The pattern that makes the Pi tier sing: Telegram + OpenClaw + Claude Code on the Pi. You DM your agent from anywhere — "summarize today's news," "scrape this competitor's pricing page and diff it against last week," "ship the staging deploy." The Pi receives, runs, replies. Closest thing to "having a personal employee" consumer hardware has produced.
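The glue between an incoming DM and the agent is a single subprocess call. A sketch of just that glue — the chat wiring (Telegram bot, OpenClaw channel integration) is omitted, and the assumption is that the Claude Code CLI is on PATH with -p as its non-interactive print mode:

```python
import subprocess

def build_agent_command(message: str) -> list[str]:
    # -p runs Claude Code for a single non-interactive turn: message in,
    # answer out. (Assumption: CLI installed and authenticated on the Pi.)
    return ["claude", "-p", message]

def dispatch(message: str, timeout: int = 600) -> str:
    """One DM in, one agent turn out. The chat framework calls this with
    the message text and sends whatever comes back as the reply."""
    result = subprocess.run(
        build_agent_command(message),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout.strip() or result.stderr.strip()
```

Wrap dispatch() in whatever message handler your bot framework provides and the "personal employee" pattern is complete.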
Verdict: If you have any workflow that should run on a timer or respond to events while you sleep, the Pi tier is the highest-leverage purchase in personal computing right now. It's the tier most people skip — they buy a Mac mini for "always-on," then realize a Pi does the same job for $1,200 less.
Tier 3: The MacBook — Power-User Stack
Who it's for: Engineers who already live in a terminal. People with multiple repos, real codebases, and a workflow that includes testing, debugging, and committing across the day.
The proof point: The two top posts on r/ClaudeAI this week — "Giving Claude access to my MacBook be like" (1,243 upvotes) and "I gave Claude Code a $0.02/call coworker to stop hitting Pro limits" (1,330 upvotes) — both come from this tier. The first is the meme version of the pattern: hand the agent your laptop and watch it touch everything. The second is the optimization on top of it: pair Claude Code with a cheap secondary model (Gemini Flash, DeepSeek-V4, Qwen 3.6) that handles grunt work — formatting, simple refactors, file shuffling — so Claude only sees the hard turns.
The mechanism: Claude Code CLI installed locally. Your existing dotfiles, secrets, dev environment. A .claude/ directory with skills, slash commands, and MCP servers. Cron jobs or launchd agents for scheduled work. The full mental model of "this is my machine" preserved.
The four-pillar implementation, MacBook edition:
- Skills: ~/.claude/skills/ — markdown files with YAML frontmatter, one per task type. Self-improving via the learnings.md pattern.
- Memory: MEMORY.md at the workspace root, plus per-day memory/YYYY-MM-DD.md rolling notes. Curated by the agent, read by every session.
- Scheduler: macOS launchd plists, or a scheduled-agents pattern if you want them to run when the laptop is closed.
- Context: A single CLAUDE.md at every repo root, plus a global ~/.claude/CLAUDE.md for cross-project preferences.
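The learnings.md mechanic deserves a concrete shape. A sketch of the read-before, append-after loop, with the skill directory and note text as hypothetical examples:

```python
from datetime import date
from pathlib import Path

def build_prompt(skill_dir: Path, task: str) -> str:
    """Read learnings.md first, so every run starts from what earlier
    runs noticed about output quality."""
    learnings = skill_dir / "learnings.md"
    notes = learnings.read_text() if learnings.exists() else ""
    return f"Past learnings:\n{notes}\nTask: {task}" if notes else task

def record_learning(skill_dir: Path, note: str) -> None:
    """Append one observation after the run. This file is the entire
    self-improvement mechanism: no vector store, no fine-tuning."""
    skill_dir.mkdir(parents=True, exist_ok=True)
    with (skill_dir / "learnings.md").open("a") as f:
        f.write(f"- {date.today().isoformat()}: {note}\n")

skill = Path("skills/pr-summary")  # hypothetical skill directory
record_learning(skill, "bullet summaries beat paragraphs for PR reviews")
print(build_prompt(skill, "Summarize today's PRs"))
```

In practice the agent itself writes the note at the end of each run; the only discipline required of you is never deleting the file.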
The rate-limit reality: This is where the $0.02/call coworker pattern shows up. Claude Pro at $20/month gives you ~10-45 messages per 5-hour rolling window with Sonnet. Power users blow through that by lunchtime. Three workarounds, in order of how the community is actually using them:
- Upgrade to Max ($100 or $200/month). Solves it. Expensive.
- Hybrid model routing. Use Gemini Flash or DeepSeek-V4 for cheap turns, Claude only for hard turns. Tools like cc-switch or openclaw's commit-surcharge router automate this.
- Web UI fallback. When the CLI is rate-limited, Claude Code for web has a separate weekly cap. You can keep working from the browser tab on the same laptop.
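The routing idea reduces to a classifier. Real routers like cc-switch use richer signals; this is the smallest possible version, with model names as placeholders rather than exact model IDs:

```python
CHEAP_MODEL = "gemini-flash"       # placeholder name, not an exact model ID
FRONTIER_MODEL = "claude-sonnet"   # placeholder name

# Mechanical tasks that don't need frontier reasoning.
GRUNT_KEYWORDS = ("format", "rename", "move file", "lint", "sort imports")

def pick_model(task: str) -> str:
    """Send grunt work to the cheap model; reserve the frontier model
    for turns that actually require reasoning. Keyword matching stands
    in for whatever heuristics a real router uses."""
    t = task.lower()
    if any(keyword in t for keyword in GRUNT_KEYWORDS):
        return CHEAP_MODEL
    return FRONTIER_MODEL

print(pick_model("format src/ and sort imports"))
print(pick_model("fix the race condition in the scheduler"))
```

Even this crude split captures the economics: the $0.02 calls absorb the volume, so the expensive quota only sees the turns that justify it.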
Where it breaks: The MacBook tier dies when the lid closes. If your workflows need to run while you sleep or while you're in meetings, you're back to needing a Pi. The rational stack is both — MacBook for active work, Pi for ambient automation, phone for fire-and-forget. (Yes, that's all three tiers. We're not pretending otherwise.)
Verdict: If you're already a Claude Code daily user, you already live here. The interesting question isn't whether to use this tier — you do — but how to layer Tier 2 underneath it.
The decision tree
| Question | If yes → | If no → |
|---|---|---|
| Do you ship code professionally on a Mac/Linux machine? | Start with Tier 3 | Skip to next |
| Do you have ambient workflows (cron jobs, monitoring, bots) that should run 24/7? | Add Tier 2 | Skip |
| Do you do small builds in spare moments — commute, errands, weekends? | Add Tier 1 | Skip |
| Do you only do one of those three? | Just that tier — don't over-build | — |
| Are you starting from zero with no laptop preference? | Tier 1 + Tier 2. Skip the MacBook until you outgrow them. | — |
The "starting from zero" answer surprises people, but it's correct. A used iPhone and a Raspberry Pi 5 are $400 of hardware that do 80% of what a $2,000 MacBook does for personal-automation work. The remaining 20% — debugging hairy concurrency bugs, comparing performance traces — you can rent a cloud dev environment for when you need it.
Community Reaction — what's hitting this week
The r/ClaudeAI top posts capture both the excitement and the friction.
The "$0.02 coworker" thread (1,330 points) is the practical handbook for Tier 3 users. The original poster wired Gemini Flash as a "junior engineer" agent that handles file moves, formatting passes, and simple lint fixes — only escalating to Claude when something actually requires reasoning. Two highlights from the comments: one user reports going from hitting Pro limits at 11am to finishing the day with Pro headroom intact; another points out that the same pattern works in reverse — Claude as the planner, multiple cheap models as parallel executors.
The "Giving Claude access to my MacBook be like" thread (1,243 points) is the satirical version, but the comments are serious: the consensus is that the right MCP-server posture is "least-privilege by default, expand on demand." The OpenClaw model — Claude can install software, run programs, integrate with services — only works when the agent is sandboxed. On a laptop, that means MCP servers scoped to specific directories, not the whole filesystem.
On the YouTube side, Chase AI's Top 10 NEW Open-Source Claude Code Tools and Simon Scrapes' "Creating Your Own Agentic OS is Easy" both cleared 100K views in their first week. The comments echo the same point: people aren't stuck on tooling anymore — they're stuck on knowing which of the available patterns is right for their workflow.
That's the gap this article is trying to close.
How to get started this weekend
If you have nothing: Buy a Pi 5 8GB kit ($120 with NVMe HAT and case). Get an Anthropic API key. Follow the official Raspberry Pi guide — it's 30 minutes of apt install and SSH. By Sunday night you have an always-on agent on your network. Add Telegram or Discord bot integration on Monday.
If you have a Mac: Install Claude Code CLI if you haven't. Create ~/.claude/skills/ with one skill (the canonical first one is "summarize today's PRs" — it's small, valuable, and makes the feedback loop visible). Add a learnings.md to it. Schedule it via launchd for 6pm daily. You now have a working four-pillar OS in a single afternoon.
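Generating that launchd job can itself be a one-liner-ish script. A sketch using the stdlib plistlib, where the label, binary path, and prompt are placeholders to adjust for your setup:

```python
import plistlib
from pathlib import Path

# Placeholder label, binary path, and prompt; adjust to your machine.
job = {
    "Label": "com.example.pr-summary",
    "ProgramArguments": ["/usr/local/bin/claude", "-p", "Run the pr-summary skill"],
    # StartCalendarInterval is launchd's cron equivalent: fire at 18:00 daily.
    "StartCalendarInterval": {"Hour": 18, "Minute": 0},
    "StandardOutPath": "/tmp/pr-summary.log",
    "StandardErrorPath": "/tmp/pr-summary.err",
}

plist_path = Path("com.example.pr-summary.plist")
with plist_path.open("wb") as f:
    plistlib.dump(job, f)

# Install: move the file into ~/Library/LaunchAgents/, then run
#   launchctl load ~/Library/LaunchAgents/com.example.pr-summary.plist
```

Note the log paths: when a scheduled agent misbehaves at 6pm while you're at dinner, StandardOutPath is the only witness.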
If you only have a phone: Sign up for Claude Pro. Connect a GitHub repo to Claude Code for web. Open a feature task tonight, walk away, review it tomorrow morning. If the PR is solid, you've validated the pattern — go bigger.
If you have all three: Layer them. Phone for dispatch, Pi for ambient, Mac for deep work. The seams between them are where the next 12 months of agentic-OS evolution will happen — and you want to be standing on those seams when the patterns harden.
The agentic-OS era is here. The question is no longer whether to build one. It's which tier — and which two — you're going to commit to this weekend.
For deeper context on the underlying tools, see our agentic dev stack guide, our analysis of scheduled agents that run without a local machine, and the remote Claude Code tasks pattern that pairs naturally with the Pi tier.
About ComputeLeap Team
The ComputeLeap editorial team covers AI tools, agents, and products — helping readers discover and use artificial intelligence to work smarter.