Designing OpenHome Abilities
A manifesto for voice-first ambient intelligence.
Philosophy - Frameworks - Sound Design - Lifecycle - API Integrations - OpenClaw Bridge - 170+ Ability Ideas
This is the long-form single-page manifesto. The same material is also organized into focused pages for easier reference.
Part 1: Philosophy
1.1 - The Core Premise
- You are not building an app. You are building a presence in a room.
- A smart speaker is a microphone, a speaker, and a brain that never sleeps.
- The best Ability is the one the user forgets is running - until it does something so well-timed they think: “How did it know?”
Key insight: It knew because it was there. Listening. Learning. Waiting.
1.2 - The Three Modes of Operation
| Mode | What It Does | Key Principle |
|---|---|---|
| Listening | Captures ambient audio, transcribes speech, identifies speakers, detects sounds, extracts meaning | The user may not even be talking to the device |
| Speaking | Interjects, responds, narrates, coaches, entertains | Voice is expensive. Every word is a second the user cannot skip. Silence is often better. |
| Logging | Writes to persistent backends, companion apps, dashboards silently | Accumulates intelligence over hours, days, and weeks. The most powerful layer. |
1.3 - When Something Should Be an Ability
If the LLM can handle it with an Agent prompt alone, it does not need to be an Ability.
Abilities exist for things the LLM cannot do on its own:
- Call an external API
- Play or generate audio
- Control a physical device
- Persist data across sessions
- Run multi-step workflows with branching logic
- Access real-time data (weather, scores, stocks, calendars)
Pro tip: Ask yourself: “Does this require reaching outside the LLM?” If yes, it is an Ability.
Part 2: Design Frameworks
2.1 - The Three Ability Archetypes
| Archetype | Behavior | Examples |
|---|---|---|
| The Responder | Mostly silent. User initiates. Speaker answers, then exits. | Weather, timer, WiFi password, quick lookup |
| The Companion | Active participant in ongoing back-and-forth. Has agent and memory. | Debate coach, recipe walkthrough, brainstorm partner, bedtime story |
| The Observer | Mostly silent. Listens, transcribes, analyzes, logs, surfaces insights later. | Life logger, meeting transcriber, sleep tracker, dream decoder |
Pro tip: The Observer archetype is underused and extremely powerful. Silence is the feature.
2.2 - Ten Design Frameworks
- The Invisible Worker - Handles tedious labor users would never manually maintain.
- The Information Funnel - Compresses many dashboards and apps into one timely spoken sentence.
- Surprise Artifact Generation - Builds over time and delivers meaningful outputs unexpectedly.
- The Emotional Radar - Adapts behavior based on how users sound, not only what they say.
- The Daily Ritual Anchor - Attaches to existing habits, not net-new behaviors.
- The Compound Intelligence Loop - Gets smarter over weeks; value compounds over time.
- The Proxy Agent - Acts for the user (send, book, reorder), not only informs.
- The Social Multiplier - Designs for rooms with multiple people.
- The Context Mesh - Weaves multiple sources into contextual intelligence.
- The Graceful Silence Principle - Define silence rules first; speak less than possible.
Part 3: Voice-First Design Rules
3.1 - Keep It Short
- Keep each `speak()` to 1-2 sentences.
- Lead with the headline.
- Use progressive disclosure.
Example: “You have 3 meetings. Next is at 2 with Sarah. Want the full list?”
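The headline-first pattern above can be sketched as a small formatter. This is an illustrative sketch only: `meeting_headline` and the dictionary keys `time` and `with` are hypothetical names, and the list is assumed to be sorted by start time already.

```python
def meeting_headline(meetings):
    """Return a short spoken summary: count, next item, then an offer for more."""
    if not meetings:
        return "You have no meetings today."
    count = len(meetings)
    noun = "meeting" if count == 1 else "meetings"
    first = meetings[0]  # assumption: the caller sorted the list by start time
    line = f"You have {count} {noun}. Next is at {first['time']} with {first['with']}."
    if count > 1:
        # Progressive disclosure: offer the rest instead of reading it all.
        line += " Want the full list?"
    return line


meetings = [
    {"time": "2", "with": "Sarah"},
    {"time": "3:30", "with": "the dev team"},
    {"time": "5", "with": "Priya"},
]
print(meeting_headline(meetings))
# You have 3 meetings. Next is at 2 with Sarah. Want the full list?
```

The returned string would be passed to a single `speak()` call; the full list is read only if the user asks for it.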
3.2 - Fill the Silence
- If an API call takes over 1 second, speak first.
- Example fillers: “One sec, pulling that up.” “Hang on, checking.” “Let me look into that.”
- Dead silence feels broken.
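One way to apply the one-second rule is to start the call, wait up to the threshold, and speak a filler only if the call is still running. The sketch below uses plain `asyncio`; `call_with_filler`, `fetch`, and the `speak` callable are hypothetical stand-ins, not SDK names. If a call is known to be slow, simply speaking before calling is simpler.

```python
import asyncio

FILLER_AFTER_SECONDS = 1.0  # threshold taken from the rule above


async def call_with_filler(fetch, speak, filler="One sec, pulling that up.",
                           threshold=FILLER_AFTER_SECONDS):
    """Await fetch(); if it is still running after `threshold` seconds, speak a filler."""
    task = asyncio.create_task(fetch())
    done, _pending = await asyncio.wait({task}, timeout=threshold)
    if not done:
        await speak(filler)
    return await task


async def demo(delay, threshold):
    spoken = []

    async def speak(text):       # hypothetical stand-in for the platform's speak()
        spoken.append(text)

    async def lookup():          # simulated API call
        await asyncio.sleep(delay)
        return "72 and sunny"

    result = await call_with_filler(lookup, speak, threshold=threshold)
    return result, spoken


print(asyncio.run(demo(delay=0.05, threshold=0.01)))
# ('72 and sunny', ['One sec, pulling that up.'])
```

A fast call returns before the threshold, so nothing extra is spoken.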
3.3 - Confirm Before Acting
- Destructive or high-stakes actions need confirmation.
- Example: “Cancel Team Standup? Say yes to confirm.”
- Low-stakes lookups can run directly.
3.4 - Expect Messy Transcription
- Transcription is messy.
- Use the LLM to extract clean intent.
- If parsing fails, ask again explicitly.
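The confirm-then-parse flow can be sketched as follows. The text above recommends an LLM for intent extraction; the keyword matcher here is only a simple stand-in for the yes/no case, and `confirm_prompt`, `parse_confirmation`, and the word lists are hypothetical.

```python
import re

YES_WORDS = {"yes", "yeah", "yep", "confirm", "sure"}
NO_WORDS = {"no", "nope", "nah"}


def confirm_prompt(action, target):
    """Build the spoken confirmation question for a high-stakes action."""
    return f"{action} {target}? Say yes to confirm."


def parse_confirmation(transcript):
    """Return True for yes, False for no, None when the reply is unclear."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    said_yes = bool(words & YES_WORDS)
    said_no = bool(words & NO_WORDS)
    if said_yes == said_no:   # neither word, or both: treat as unclear
        return None
    return said_yes


print(confirm_prompt("Cancel", "Team Standup"))
# Cancel Team Standup? Say yes to confirm.
print(parse_confirmation("Uh, yeah, go ahead."))   # True
print(parse_confirmation("Wait, what?"))           # None
```

A `None` result is the cue to ask again explicitly rather than guess.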
3.5 - Handle Exits
- Looping abilities need exit words: done, stop, bye, nothing else, I'm good.
- One idle cycle: keep going.
- Two idle cycles: offer to leave.
- Call `resume_normal_flow()` on every path.
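The exit rules above can be sketched as one decision function that a looping ability calls after each listen cycle. `next_step` and its return values are hypothetical; the exit phrases are the ones listed above, and an empty transcript is assumed to mean an idle cycle.

```python
import re

EXIT_PHRASES = ("done", "stop", "bye", "nothing else", "i'm good")


def _normalize(utterance):
    """Lowercase, unify apostrophes, and drop punctuation."""
    text = utterance.lower().replace("\u2019", "'")
    return " ".join(re.findall(r"[a-z']+", text))


def next_step(utterance, idle_cycles):
    """Return (action, idle_cycles); action is 'exit', 'offer_exit', or 'continue'."""
    text = _normalize(utterance)
    if not text:
        idle_cycles += 1
        action = "offer_exit" if idle_cycles >= 2 else "continue"
        return action, idle_cycles
    # Pad with spaces so phrases match whole words only.
    if any(f" {phrase} " in f" {text} " for phrase in EXIT_PHRASES):
        return "exit", 0
    return "continue", 0


print(next_step("", 0))                     # ('continue', 1)
print(next_step("", 1))                     # ('offer_exit', 2)
print(next_step("I\u2019m good, thanks", 1))  # ('exit', 0)
```

On `'exit'` the ability would call `resume_normal_flow()` before returning. Plain phrase matching will misfire on sentences such as "don't stop the music", so an LLM check may be a better fit for ambiguous replies.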
3.6 - Spell It Out
- TTS can mangle emails, URLs, and numbers.
- Say “at” for `@`, “dot” for `.`.
- Read phone numbers digit by digit.
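A minimal sketch of the spell-it-out rules, assuming the text is rewritten before it is handed to `speak()`. The helper names `speakable_email` and `speakable_phone` are hypothetical.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def speakable_email(address):
    """Replace '@' and '.' with the words a listener expects to hear."""
    spoken = address.replace("@", " at ").replace(".", " dot ")
    return " ".join(spoken.split())  # collapse the extra spaces


def speakable_phone(number):
    """Read a phone number digit by digit, ignoring separators."""
    return " ".join(DIGIT_WORDS[int(ch)] for ch in number if ch in "0123456789")


print(speakable_email("sarah.lee@example.com"))
# sarah dot lee at example dot com
print(speakable_phone("555-0142"))
# five five five zero one four two
```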
3.7 - Silence Is a Feature
- Do not respond to every moment.
- Log interesting details silently.
- Do not read more than three items without asking.
Part 3B: Sound Design - Audio as Interface
Voice abilities are audio experiences, not only speech.
Sound Effect Types
| Type | When to Use | Example |
|---|---|---|
| Confirmation Tones | Action completes successfully (low stakes) | “Lights off” -> soft click |
| Transition Sounds | Switching modes or states | Entering ability -> short whoosh |
| Intro Music/Themes | Companion or game abilities | Trivia -> game show sting |
| Feedback Beeps | Correct/wrong, milestones, timers | Correct -> bright pip, wrong -> low tone |
| Ambient Audio | Atmosphere under speech | Focus mode -> low lo-fi; sleep -> rain |
| Alert/Interrupt | Background interruptions | Timer done -> escalating soft alarm |
Sound Design Principles
- Less is more.
- Consistency builds trust.
- Time-of-day awareness is mandatory.
- Let sounds replace words over repeated usage.
Key insight: Over time, sound can replace spoken confirmations as users learn the audio language.
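Time-of-day awareness can be as simple as choosing a playback volume from the local hour. The sketch below is illustrative: the quiet window (22:00 to 07:00) and both volume levels are assumed values, and `alert_volume` is a hypothetical helper.

```python
def alert_volume(hour, day_volume=0.8, night_volume=0.3,
                 quiet_start=22, quiet_end=7):
    """Return a playback volume in [0, 1] for the given local hour (0-23).

    Assumes the quiet window wraps past midnight (quiet_start > quiet_end).
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    is_quiet = hour >= quiet_start or hour < quiet_end
    return night_volume if is_quiet else day_volume


print(alert_volume(14))  # 0.8
print(alert_volume(23))  # 0.3
```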
Sound Anti-Patterns
- Sound effects on every `speak()`.
- Long intros that delay useful speech.
- Loud nighttime alerts.
- Alarm-like sounds that induce panic.
- Loops that clash with speech.
- Inconsistent sounds for the same action.
Part 4: Trigger Word Design
4.1 - Think in Speech, Not Text
Users say: “what’s on my calendar”, “do I have a 3pm”, “am I free Tuesday”.
4.2 - Balance Coverage vs False Positives
| Trigger Type | Risk | Strategy |
|---|---|---|
| Safe single words (calendar, weather) | Low | Use freely |
| Dangerous single words (book, free, cancel) | High | Prefer phrases |
| Phrase triggers (book a time, am I free) | Medium-Low | Strong default |
| Full sentence triggers | Low | Capture indirect phrasing |
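The trade-off in the table can be seen with a small matcher. This is a sketch of whole-word phrase matching, not the platform's actual routing logic; `matches_trigger` and the trigger lists are hypothetical.

```python
import re


def matches_trigger(utterance, triggers):
    """Return the first trigger found as a whole word or phrase, else None."""
    text = " ".join(re.findall(r"[a-z0-9']+", utterance.lower()))
    for trigger in triggers:
        if re.search(rf"\b{re.escape(trigger)}\b", text):
            return trigger
    return None


single_word = ["book"]
phrases = ["book a time", "am i free"]

# A dangerous single word fires on unrelated speech.
print(matches_trigger("I just finished a great book", single_word))  # book
# The phrase version stays quiet, but still catches the real request.
print(matches_trigger("I just finished a great book", phrases))      # None
print(matches_trigger("Am I free Tuesday?", phrases))                # am i free
```

Because matching is on whole words, `calendar` does not match "calendars", so plural forms need their own entries.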
4.3 - Trigger Checklist
- Include plural forms.
- Include regional variants.
- Include indirect phrasing.
- Include natural full sentences.
4.4 - Read Trigger Context
Use prior conversation to classify intent and route correctly:
- “What’s on my calendar today?” -> daily schedule.
- “Create a meeting with Sarah at 3” -> direct create flow.
Pattern: read history -> classify intent -> route handler.
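The read, classify, route pattern can be sketched with a dispatch table. The keyword classifier below is a stand-in for an LLM call, and every function and intent name here is hypothetical.

```python
def classify_intent(history):
    """Keyword stand-in for an LLM classifier; looks at the latest user turn."""
    latest = history[-1].lower() if history else ""
    if any(word in latest for word in ("create", "set up", "book a")):
        return "create_event"
    if "calendar" in latest or "free" in latest:
        return "daily_schedule"
    return "unknown"


def handle_daily_schedule(history):
    return "Reading today's schedule."


def handle_create_event(history):
    return "Starting the create-meeting flow."


def handle_unknown(history):
    return "What would you like to do with your calendar?"


HANDLERS = {
    "daily_schedule": handle_daily_schedule,
    "create_event": handle_create_event,
}


def route(history):
    """Read history, classify the intent, and call the matching handler."""
    intent = classify_intent(history)
    return HANDLERS.get(intent, handle_unknown)(history)


print(route(["What's on my calendar today?"]))       # Reading today's schedule.
print(route(["Create a meeting with Sarah at 3"]))   # Starting the create-meeting flow.
```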
Part 5: The Ability Lifecycle
Background daemons, defined in `background.py`, now run alongside interactive ability flows.
5.1 - Two Runtime Lifecycles
Interactive Skill / Brain Skill path
- User or brain routing activates `main.py`.
- Main flow calls `call(self, worker)`.
- Ability runs interaction logic.
- Ability exits with `resume_normal_flow()`.
Background Daemon path
- Session starts.
- Platform auto-starts `background.py` (no hotword).
- Main flow calls `call(self, worker, background_daemon_mode)`.
- Daemon runs a continuous `while True` loop for the session lifetime.
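For the interactive path, a `try`/`finally` block is one way to guarantee that `resume_normal_flow()` runs on every exit path, including errors. The sketch below runs against a fake worker: `FakeWorker` and `QuickLookupAbility` are hypothetical, and where `speak()` and `resume_normal_flow()` actually live in the SDK, and whether they are async, is an assumption not taken from this page.

```python
class FakeWorker:
    """Hypothetical stand-in for the object the platform passes to call()."""

    def __init__(self):
        self.events = []

    def speak(self, text):
        self.events.append(f"speak:{text}")

    def resume_normal_flow(self):
        self.events.append("resume_normal_flow")


class QuickLookupAbility:
    def __init__(self, lookup):
        self.lookup = lookup  # callable returning the text to say; may raise

    def call(self, worker):
        try:
            worker.speak(self.lookup())
        except Exception:
            worker.speak("Sorry, I couldn't get that.")
        finally:
            # Runs on success and on failure, so the main flow always resumes.
            worker.resume_normal_flow()


def failing_lookup():
    raise RuntimeError("API down")


worker = FakeWorker()
QuickLookupAbility(failing_lookup).call(worker)
print(worker.events)
# ["speak:Sorry, I couldn't get that.", 'resume_normal_flow']
```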
5.2 - Ability Categories
| Category | Behavior |
|---|---|
| Skill | Standard user-triggered ability. Hotword -> flow -> resume_normal_flow(). |
| Brain Skills | Triggered by the Agent brain to fill knowledge gaps or delegate actions. |
| Background Daemon | Starts automatically at session start and runs continuously, including during sleep mode. |
| Local | Runs on DevKit hardware with direct access to hardware and to sandbox-restricted Python libraries via devkit_functions.py. |
Note: Brain Skills templates are still being finalized.
5.3 - Ability File Structures
| Pattern | Files | Behavior |
|---|---|---|
| Standard Interactive | main.py | Triggered by user or brain routing, then exits to main flow. |
| Standalone Background Daemon | background.py | Runs continuously for monitoring/logging/scheduling. |
| Interactive Combined | main.py + background.py | Foreground user flow plus background daemon coordination. |
background.py must be named exactly background.py or it will not be detected.
5.4 - Critical main.py vs background.py Differences
| Aspect | main.py | background.py |
|---|---|---|
| `call()` signature | `call(self, worker)` | `call(self, worker, background_daemon_mode)` |
| Trigger | User hotword or brain routing | Automatic on session start |
| Lifecycle | Run once, then exit | Continuous loop |
| `resume_normal_flow()` | Required on exit paths | Not used in daemon loop |
| Sleep mode | Not active while asleep | Keeps running while Agent sleeps |
5.5 - Combined Pattern (main.py + background.py)
- User says “set an alarm for 3 PM Thursday.”
- `main.py` parses intent and writes schedule data to `alarms.json`.
- `main.py` confirms and exits via `resume_normal_flow()`.
- `background.py` polls `alarms.json` on an interval.
- At trigger time, background calls `send_interrupt_signal()`, then plays/speaks alert.
- Background updates alarm status to triggered.
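The data handoff in the steps above can be sketched with a shared JSON file. This sketch uses the local filesystem through Python's standard library rather than the platform's file helpers, and the record layout (`time`, `label`, `status`) is an assumed schema; only the `triggered` status comes from the text.

```python
import json
from datetime import datetime


def load_alarms(path):
    """Return the alarm list, or an empty list if the file does not exist yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return []


def save_alarms(path, alarms):
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(alarms, fh)


def add_alarm(path, when_iso, label):
    """main.py side: record a pending alarm, then the ability can exit."""
    alarms = load_alarms(path)
    alarms.append({"time": when_iso, "label": label, "status": "pending"})
    save_alarms(path, alarms)


def fire_due_alarms(path, now, alert):
    """background.py side: one poll. Alert for due alarms and mark them triggered."""
    alarms = load_alarms(path)
    fired = []
    for alarm in alarms:
        due = datetime.fromisoformat(alarm["time"]) <= now
        if alarm["status"] == "pending" and due:
            alert(alarm["label"])  # in a daemon: interrupt first, then speak
            alarm["status"] = "triggered"
            fired.append(alarm["label"])
    if fired:
        save_alarms(path, alarms)
    return fired
```

Marking the record as triggered before the next poll is what keeps the daemon from announcing the same alarm twice.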
5.6 - Background Daemon Best Practices
- Use `session_tasks.sleep()` instead of `asyncio.sleep()`.
- Keep poll intervals reasonable (typically 10-30 seconds).
- Handle missing files gracefully (`check_if_file_exists()` first).
- For JSON updates, delete then write full content.
- Log heavily with `editor_logging_handler`.
- Call `send_interrupt_signal()` before daemon `speak()` or `play_audio()`.
- Keep a never-ending `while True` loop for sleep-mode continuity.
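Several of these practices can be seen together in a daemon loop split into a testable single tick and an endless runner. Everything here runs against a fake: `FakePlatform` borrows method names from this page, but their real signatures, return types, and owning objects are assumptions, and `daemon_tick`, `run_daemon`, and `pending.txt` are hypothetical.

```python
import asyncio

POLL_SECONDS = 20  # within the 10-30 second range suggested above


class FakePlatform:
    """Hypothetical stand-in for the platform; signatures are guesses."""

    def __init__(self, files):
        self.files = files   # filename -> text content
        self.events = []

    async def check_if_file_exists(self, name):
        return name in self.files

    async def send_interrupt_signal(self):
        self.events.append("interrupt")

    async def speak(self, text):
        self.events.append(f"speak:{text}")

    async def sleep(self, seconds):
        # Plays the role of session_tasks.sleep(); the fake only yields control.
        await asyncio.sleep(0)


async def daemon_tick(platform, filename="pending.txt"):
    """One poll: skip quietly if the file is missing, else interrupt, then speak."""
    if not await platform.check_if_file_exists(filename):
        return False
    message = platform.files[filename]
    if message:
        await platform.send_interrupt_signal()  # always before speaking
        await platform.speak(message)
    return True


async def run_daemon(platform):
    while True:  # never returns; intended to live for the whole session
        await daemon_tick(platform)
        await platform.sleep(POLL_SECONDS)


platform = FakePlatform({"pending.txt": "Time to leave."})
asyncio.run(daemon_tick(platform))
print(platform.events)
# ['interrupt', 'speak:Time to leave.']
```

Keeping the per-poll work in its own function makes it possible to test the daemon without running the endless loop.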
5.7 - The ability.md Pattern
Each ability should include ability.md with YAML frontmatter and markdown body.
Critical: description is the primary trigger field for system routing.
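To make the file shape concrete, the sketch below splits a sample `ability.md` into frontmatter and body. The sample content is invented for illustration, `description` is the only field named on this page, and `split_frontmatter` is a hypothetical helper that handles flat `key: value` lines only; a real YAML parser is the safer choice for anything more complex.

```python
def split_frontmatter(text):
    """Split '---' delimited frontmatter from the markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


EXAMPLE = """---
description: Check today's calendar, find free time, and create meetings by voice.
---
# Calendar Helper

Reads the schedule aloud and books meetings on request.
"""

meta, body = split_frontmatter(EXAMPLE)
print(meta["description"])
# Check today's calendar, find free time, and create meetings by voice.
```

Since `description` drives routing, it is worth writing it in the spoken phrasing users are likely to use.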
Part 6: Ability Ideas by Location
6.1 - Nightstand (Bedroom)
- Morning Manifest
- Lights Out Debrief
- Tomorrow’s Weather Whisper
- Bedtime Story Engine
- Midnight Worry Jar
- Gratitude Fade-Out
- Morning Body Check
- Dream Catcher
- Sleep Debt Tracker
- Power Nap Coach
6.2 - Living Room (Couple)
- Settle It
- Movie Matchmaker
- Dinner Decider
- Couple’s Trivia
- The Argument Cooldown
- Weekend Planner
- Guest Mode
- Anniversary Vault
- Background Narrator
6.3 - Kitchen
- Recipe Walkthrough
- Grocery List Builder
- Cooking Timer Orchestrator
- Kitchen Radio DJ
- Sous Chef Advisor
6.4 - Conference Room
- Decision Logger
- Action Item Extractor
- Meeting Recap
- Who Talked Most
- Pre-Meeting Briefer
- Follow-Up Drafter
- Agenda Enforcer
- Cross-Meeting Intelligence
6.5 - College Dorm
- Study Pomodoro Coach
- Exam Countdown
- Cram Session Quiz Master
- Budget Buddy
- Wake Up Enforcer
- Roommate Mediator
6.6 - Home Office
- Focus Guardian
- Standup Generator
- Meeting Prep Briefer
- End-of-Day Wrap
6.7 - Car / Commute
- Commute Debrief
- Hands-Free Messenger
- Traffic Aware ETA
- Errand Optimizer
Part 7: Ability Ideas by User
7.1 - Kids (Ages 5-12)
- Homework Helper
- Would You Rather
- Animal Expert
- Story Builder
- Spelling Bee Coach
- Mystery Detective
7.2 - Kids Games (Ages 8-10)
- Boss Battle Trivia
- Monster Collector
- Speed Round
- Dungeon Crawler
- Conspiracy Board
7.3 - Parents
- Baby Sleep Tracker
- Toddler Vocabulary Tracker
- Family Calendar Sync
- Bedtime Routine Manager
7.4 - Elderly Users
- Medication Reminder
- Cognitive Wellness Check
- Family Connection
- Daily Companion
7.5 - Professionals
- Executive Brief
- Sales Call Scorer
- Client Meeting Debrief
Part 8: Ability Ideas by Use Case
8.1 - Health and Wellness
- Mood Logger
- Guided Meditation Selector
- Breathing Exercise Coach
- Symptom Tracker
- Voice Health Scanner
8.2 - Productivity
- Inbox Zero Coach
- Voice-to-Task
- Weekly Review
- Voice Notes to Structured Docs
8.3 - Finance
- Portfolio Pulse
- Spending Tracker
- Trending Stocks
- Bank Balance Reality Check
8.4 - Entertainment
- Song of the Day
- Movie/Show Recommender
- Live Sports Companion
- Spotify Time Machine
8.5 - Shopping
- Price Watcher
- Grocery Auto-Order
- Package Tracker
- Gift Idea Collector
8.6 - Smart Home and IoT
- Scene Controller
- Morning Routine
- Security Check
Part 9: 3rd-Party API Integration
| Category | APIs | What They Enable |
|---|---|---|
| Music and Audio | Suno, ElevenLabs, Spotify, Podcast APIs | Song generation, voice, playback, discovery |
| Finance | Plaid, Alpha Vantage, Polygon.io, CoinGecko | Banking, stock prices, portfolio, crypto alerts |
| Calendar and Productivity | Google Calendar, Todoist, Notion, Gmail | Events, tasks, notes, triage |
| Communication | Twilio, Slack, Telegram, SendGrid | SMS, chat, email delivery |
| Media and Content | TMDB, YouTube, NewsAPI, Goodreads | Media discovery and summaries |
| Location and Travel | FlightAware, Google Places, Uber/Lyft, Ticketmaster | Flight, local, rides, events |
| Smart Home | Philips Hue, Nest, SmartThings, IFTTT | Device control and scenes |
| Health | Apple Health, Nutritionix, Headspace, Fitbit | Steps, calories, meditation, sleep |
| AI and Generation | OpenAI, DALL-E, Whisper, ElevenLabs SFX | LLM tasks, images, transcription, SFX |
| Niche | Astrology APIs, Spoonacular, SportRadar, GitHub | Domain-specific utilities |
Pro tip: The highest-value abilities often combine 2-3 APIs into one synthesized output.
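Combining sources usually means fetching each result separately and then folding them into one spoken sentence. The sketch below shows only the folding step, with plain dictionaries standing in for API responses; `morning_brief` and the field names are hypothetical, and no real API is called.

```python
def morning_brief(event, weather, commute_minutes):
    """Fold three already-fetched results into a single spoken sentence."""
    parts = []
    if event:
        parts.append(f"{event['title']} is at {event['time']}")
    if weather:
        parts.append(f"it's {weather['temp_f']} and {weather['summary']}")
    if commute_minutes is not None:
        parts.append(f"the drive is {commute_minutes} minutes")
    if not parts:
        return "Nothing to report this morning."
    if len(parts) <= 2:
        sentence = " and ".join(parts)
    else:
        sentence = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return sentence[0].upper() + sentence[1:] + "."


print(morning_brief(
    {"title": "Standup", "time": "9"},
    {"temp_f": 58, "summary": "rainy"},
    25,
))
# Standup is at 9, it's 58 and rainy, and the drive is 25 minutes.
```

Any source that fails or returns nothing is simply left out, so the sentence degrades gracefully.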
Part 10: The OpenClaw Bridge
10.1 - What OpenClaw Is
- Locally running desktop AI agent with a large skill ecosystem.
- Can read files, run CLIs, and access local network resources.
- Has registry-based skills for smart home, finance, communication, media, code, logistics, and health.
10.2 - Why It Matters
OpenHome is sandboxed. OpenClaw can operate on-device. A bridge unlocks desktop-level agency through voice.
10.3 - What the Bridge Unlocks
| Category | Skills Available | Voice Bridge Example |
|---|---|---|
| Smart Home | Hue, IKEA, Nest, Tesla, Govee, Roborock | “Turn off the living room lights” |
| Communication | WhatsApp, Slack, Telegram, email | “Tell Mom I’ll be there at 6” |
| Media | Spotify, Plex, Jellyfin | “Play Discover Weekly on living room speaker” |
| Productivity | Google Workspace, GitHub, Notion | “What PRs need my review?” |
| Shopping | Amazon, grocery, price tracking | “Order everything on my grocery list” |
10.4 - Flagship Bridge Abilities
- Smart Home Scene Controller
- Send Message by Voice
- Voice-Triggered Email Triage
- Tesla Voice Control
- GitHub Standup
- Voice Clone Creator
- Meeting Notes to Vault
- Document Generator
10.5 - Security
- Use permission tiers (read-only, write-confirmed, financial/messaging explicit confirmation).
- Send structured requests, not raw code.
- Vet registry skills carefully before installation.
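Permission tiers and structured requests can be sketched as a typed request plus a gate that runs before anything reaches the bridge. This is one interpretation, not the bridge's actual protocol: `BridgeRequest`, `authorize`, the tier names, and the difference between a plain "yes" and an "explicit" confirmation are all assumptions.

```python
from dataclasses import dataclass, field

# Policy per tier, following the three tiers named above.
TIER_POLICY = {
    "read": "allow",                    # read-only: no confirmation
    "write": "confirm",                 # write-confirmed
    "financial": "explicit_confirm",    # explicit confirmation required
    "messaging": "explicit_confirm",
}


@dataclass(frozen=True)
class BridgeRequest:
    """A structured request: named skill and action plus data, never raw code."""
    skill: str
    action: str
    tier: str
    params: dict = field(default_factory=dict)


def authorize(request, confirmation=None):
    """Return 'run', 'needs_confirmation', or 'rejected'."""
    policy = TIER_POLICY.get(request.tier)
    if policy is None:
        return "rejected"               # unknown tier: fail closed
    if policy == "allow":
        return "run"
    if policy == "confirm":
        return "run" if confirmation in ("yes", "explicit") else "needs_confirmation"
    return "run" if confirmation == "explicit" else "needs_confirmation"


message = BridgeRequest("whatsapp", "send", "messaging",
                        {"to": "Mom", "text": "I'll be there at 6"})
print(authorize(message))               # needs_confirmation
print(authorize(message, "explicit"))   # run
```

Here "explicit" stands for a confirmation that repeats the action back to the user, for example "Send 'I'll be there at 6' to Mom?", rather than a bare yes.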
Part 11: Combining Frameworks
| Combination | What It Creates | Example |
|---|---|---|
| Observer + Surprise Artifact | Passive intelligence producing unexpected documents | Anniversary Vault, Dream Dictionary |
| Proxy Agent + OpenClaw Bridge | Voice actions in the real world | Send WhatsApp, book rides, order groceries |
| Daily Ritual + Compound Loop | Habits that improve daily | Personalized morning briefing |
| Social Multiplier + Emotional Radar | Group experiences that adapt to room energy | Adaptive party trivia |
| Information Funnel + Context Mesh | One sentence from many data streams | Calendar + weather + traffic + mood |
| Invisible Worker + Graceful Silence | Background intelligence with selective speaking | Flight delay watcher |
Pro tip: Start with one primary framework, then add one secondary framework to 10x value.
Part 12: The Sci-Fi Frontier
These ideas are technically feasible today with ambient audio, diarization, extraction, and longitudinal logging.
- Relationship Autopsy - detect communication pattern shifts before conscious recognition.
- Voice Health Scanner - detect illness signatures from vocal micro-changes.
- Cognitive Decline Watchdog - monitor repetition and word-finding over months.
- Emotional Forecast - predict daily trajectory from morning voice plus context.
- Agent Drift Monitor - track long-term language and interest changes.
- Argument Predictor - identify precursors and intervene before escalation.
- Dream Decoder Network - connect sleep-talk themes with daytime context.
Insight: Voice is a biomarker; longitudinal speech can reveal patterns users do not explicitly report.
Part 13: Quality Checklist
Run this before shipping any Ability:
Part 14: The Brainstorm Catalog (170+ Ideas)
Format: Ability Name - Speaker Location - Example User - Description
14.1 - Daily Life and Routines
- Daily Song Generator - Living Room - 20s Woman - Suno-generated hype song summarizing the day.
- Morning Motivation - Nightstand - Entrepreneur - Reads goals and asks for one priority.
- Outfit Advisor - Bedroom - Professional - Weather plus calendar formality suggestion.
- Commute Launcher - Entryway - Office Worker - Traffic, ETA, and podcast queue.
- Arrival Debrief - Living Room - Parent - Welcome recap after returning home.
- Evening Wind-Down - Living Room - Couple - Lights, ambient music, reflective prompt.
- Weekend Kickoff - Living Room - Family - Friday activity suggestions from weather and preferences.
- Bedtime Closer - Nightstand - Anyone - Lock doors, set alarm, preview tomorrow.
- Caffeine Tracker - Kitchen - Coffee Addict - Tracks intake and sleep impact.
- Habit Streak - Any Room - Self-Improver - Daily check-ins and streak announcements.
- Dog Walk Tracker - Entryway - Pet Owner - Tracks walk cadence and weather-aware nudges.
14.2 - Work and Productivity
- Standup Bot - Home Office - Developer - Reads git plus calendar and drafts standup update.
- Email Sniper - Home Office - Executive - Voice triage on top email subjects.
- Focus Lock - Home Office - Writer - Blocks interruptions with optional white noise.
- Decision Journal - Home Office - Founder - Logs decisions and 30-day outcome reviews.
- Client Prep - Home Office - Salesperson - CRM context before calls.
- Idea Capture - Any Room - Creative - Timestamped idea logging by project.
- Pitch Practice - Living Room - Startup Founder - Timing and clarity feedback.
- Code Review Reader - Home Office - Developer - Reads PR comments aloud.
- Sprint Closer - Home Office - PM - Sprint summary and retro point generation.
14.3 - Finance and Money
- Spending Alarm - Kitchen - Overspender - Alerts when spend exceeds daily budget.
- Bill Countdown - Living Room - Budgeter - Weekly due bills summary.
- Impulse Blocker - Living Room - Shopper - Defers purchases and rechecks next day.
- Side Hustle Tracker - Home Office - Gig Worker - Logs earnings and monthly P&L.
- Subscription Audit - Living Room - Anyone - Monthly recurring subscription breakdown.
- Savings Goal - Living Room - Saver - Goal progress nudges.
- Crypto Morning Brief - Home Office - Trader - Overnight movers and activity summary.
14.4 - Health and Wellness
- Stretch Break - Home Office - Desk Worker - Two-minute mobility prompts every 90 minutes.
- Breathing Coach - Bedroom - Anxious Person - Tone-guided breathing pacing.
- Calorie Estimator - Kitchen - Dieter - Meal estimation via nutrition API.
- Symptom Log - Bedroom - Chronic Illness - Daily symptom tracking and weekly report.
- Allergy Alert - Kitchen - Allergy Sufferer - Pollen-aware outdoor warnings.
- Mental Health Check - Bedroom - Anyone - Weekly check-in with monthly patterns.
14.5 - Relationships and Social
- Date Night Planner - Living Room - Couple - Budget-aware restaurant and activity suggestions.
- Love Language Tracker - Living Room - Couple - Tracks expression balance over time.
- Friend Tracker - Living Room - Social Person - Nudges for neglected relationships.
- Party DJ - Living Room - Host - Guest requests and playlist control.
- Gift Brain - Any Room - Thoughtful Person - Year-round gift idea capture.
- Anniversary Countdown - Bedroom - Partner - Contextual reminders from past activities.
14.6 - Kids and Family
- Chore Quest - Living Room - Family - Gamified chores with XP and leaderboards.
- Vocabulary Builder - Kid’s Room - Student (8) - Word of the day with reinforcement.
- Math Duel - Living Room - Siblings - Competitive adaptive mental math.
- Joke of the Day - Kitchen - Family - Daily joke plus weekly best-of.
- Talent Show Host - Living Room - Family - MC flow with applause and scoring.
14.7 - Entertainment and Games
- Murder Mystery - Living Room - Dinner Party - Role assignment and clue progression.
- Rap Battle Coach - Bedroom - Teen - Freestyle prompts and judging.
- Sports Bar Mode - Living Room - Sports Fan - Live score narratives and alerts.
- DnD Dungeon Master - Living Room - Gamers - Campaign narration and NPC voices.
- Escape Room - Living Room - Couple - Voice puzzle scenarios with timer and hints.
- Debate Tournament - Living Room - Friends - Timed topics and AI judging.
14.8 - Smart Home and Environment
- Room Mood Setter - Living Room - Anyone - Scene orchestration with lights and climate.
- Leaving House Check - Entryway - Forgetful - Lock, lights, thermostat verification.
- Energy Coach - Living Room - Homeowner - Efficiency nudges from usage context.
- Guest Welcome - Entryway - Host - Door-aware welcome and privacy mode.
- Thermostat Negotiator - Living Room - Couple - Fair compromise between preferences.
14.9 - Creative and Maker
- Song of the Day - Living Room - 20s Woman - Personalized Suno track.
- Beat Maker - Bedroom - Teen - Vibe-to-beat generation.
- Sound Effect Studio - Any Room - Creator - On-demand SFX generation.
- Writing Prompt - Home Office - Writer - Genre-aware prompt creation.
- Remix My Day - Bedroom - Producer - Ambient track from day transcript.
- Mood Playlist - Living Room - Anyone - Mood-aware playlist generation.
14.10 - Background / Always-On
- Life Logger - Any Room - Reflective - Always-on ambient capture with daily summaries.
- Baby Monitor Plus - Nursery - New Parent - Cry detection and breathing/silence alerts.
- Meeting Scribe - Conference - Team - Auto-start with 3+ voices and real-time notes.
- Daily To-Do Compiler - Any Room - Busy Person - Captures “I need to” moments.
- Gratitude Harvester - Any Room - Anyone - Collects positive statements for weekly review.
- Dream Recorder - Bedroom - Dreamer - Sleep-talking capture into journal.
- Profanity Jar - Living Room - Family - Running tally with playful fines.
14.11 - Niche and Weird
- Wine Pairing - Kitchen - Foodie - Meal-to-wine recommendation.
- Dad Joke Engine - Kitchen - Dad - Endless joke generator with groan scoring.
- Plant Care - Any Room - Plant Parent - Species-specific care reminders.
- Hot Take Generator - Living Room - Friends - Debate-fueling spicy prompts.
- Life Narrator - Any Room - Anyone - Stylized narration mode.
- Compliment Machine - Bathroom - Anyone - Daily compliment on detection.
- Random Fact Cannon - Kitchen - Family - Timed obscure fact drops.
Closing Thought
The best Ability does something the LLM cannot, at a moment the user did not expect, using context accumulated over time, delivered in fewer words than the user would use.
Build for the room. Build for the moment. Build for the silence between words.
Then let the speaker do what it does best: be there.