Introduction
Custom Abilities are the cornerstone of extending OpenHome’s functionality. They allow developers to:
- Add personalized features to AI agents.
- Integrate third-party APIs for dynamic interactions.
- Customize logic for enhanced user engagement.
This guide covers:
- Structuring and registering an Ability.
- Using CapabilityWorker for seamless I/O management.
- Examples showcasing how to create powerful custom Abilities.
Adding an Ability
File Structure
Each Ability resides in its own folder and requires a main.py file to define the logic.
Example File: main.py
Here’s a basic template for building a new Ability:
Key Components
- #{{register_capability}}: This marker is essential.
- call: Executes the Ability’s logic when triggered.
Custom API Keys (Third-Party Services)
If your Ability needs credentials for external services (for example OpenAI, SendGrid, or Twilio), configure those keys in the dashboard and read them at runtime using get_api_keys("key_name").
Setup Flow
For Developers
Step 1: Declare keys
You can declare a custom API key in either of two ways:
- While creating/editing the Ability: Under Ability Behavior → API Keys, add each key by name and include a provider URL (required). The key value is not set here; values are always managed from Settings → API Keys.
- From Settings: Go to Settings → API Keys → Third-party Keys and create the key directly, then link it to an Ability by creating or editing one in the Ability editor.
Important: Untagged keys will not trigger an install-time prompt for users.
Step 3: Read values at runtime
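As a rough illustration of reading a key at runtime, the sketch below wraps get_api_keys in a small helper that fails with a clear message when the user has not set a value yet. FakeCapabilityWorker, load_required_key, and the key name weather_api_key are hypothetical and exist only so the example runs outside OpenHome. The sketch also assumes that get_api_keys returns the stored value directly (and nothing when the key is unset); check the real return type, and whether the call must be awaited, before relying on this shape.

```python
class FakeCapabilityWorker:
    """Hypothetical stand-in for CapabilityWorker, used only in this sketch."""

    def __init__(self, stored_keys):
        self._stored_keys = dict(stored_keys)

    def get_api_keys(self, key_name):
        # Assumed behavior: return the stored value, or None when unset.
        return self._stored_keys.get(key_name)


def load_required_key(capability_worker, key_name):
    """Read a key at runtime and fail clearly if no value has been set."""
    value = capability_worker.get_api_keys(key_name)
    if not value:
        raise RuntimeError(
            f"API key '{key_name}' has no value; set it under Settings > API Keys"
        )
    return value


# Usage: the secret never appears in the Ability source, only its name does.
worker = FakeCapabilityWorker({"weather_api_key": "example-value"})
api_key = load_required_key(worker, "weather_api_key")
```

Failing early with a readable message fits the install flow described here, where an Ability does not work until every required key has a value.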
For Users
When installing an Ability from the marketplace, a pop-up lists all required keys with direct links to each provider. Users can enter values immediately or skip and add them later from Settings → API Keys. The Ability will not work until all required keys have values set.
Security note: Never hardcode secrets in your Ability code. Always read keys at runtime.
Understanding CapabilityWorker
The CapabilityWorker class simplifies I/O interactions, enabling:
- Speech synthesis: Using text-to-speech (TTS).
- Listening for user input: Capturing and processing responses.
- Running interaction loops: Supporting conversational flows.
CapabilityWorker Quick Reference
Use these functions directly on self.capability_worker.
Conversation
Speak text to the user using the configured TTS.
True or False.
Text Generation
Return plain text from the model (no speech).
File Helpers
Read a file.
Note: Storage Scope Usage
- Use in_ability_directory=False for persistent user-level storage shared across abilities.
- Use in_ability_directory=True for ability-scoped data that should remain isolated within the ability session.
Context Storage (Key-Value)
Audio and Streaming
WebSocket / Device Actions
Using Specific Voice IDs for Text-to-Speech
The CapabilityWorker class supports the use of specific Voice IDs for text-to-speech (TTS) functionality. This allows you to customize the voice used for speech synthesis by specifying a Voice ID from the provided list.
Available Voice IDs
You can use any of the following Voice IDs for TTS:
text_to_speech Function
The text_to_speech function converts the provided text into speech using the specified Voice ID and streams it to the user via WebSocket.
Parameters
- text (str): The text to be converted into speech.
- voice_id (str): The Voice ID to be used for speech synthesis.
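The two parameters above suggest a call of the form text_to_speech(text, voice_id). The sketch below shows one way an Ability might pick a voice per role before calling it. FakeCapabilityWorker, VOICE_BY_ROLE, speak_as, and the placeholder Voice IDs are all hypothetical; real IDs come from the Voice ID list, and treating text_to_speech as an awaitable coroutine is an assumption.

```python
import asyncio


class FakeCapabilityWorker:
    """Hypothetical stand-in that records TTS requests instead of streaming audio."""

    def __init__(self):
        self.sent = []

    async def text_to_speech(self, text: str, voice_id: str) -> None:
        self.sent.append((voice_id, text))


# Placeholder IDs for illustration only; substitute IDs from the Voice ID list.
VOICE_BY_ROLE = {
    "narrator": "VOICE_ID_NARRATOR",
    "alert": "VOICE_ID_ALERT",
}


async def speak_as(capability_worker, role: str, text: str) -> str:
    """Speak text with the voice mapped to a role, defaulting to the narrator."""
    voice_id = VOICE_BY_ROLE.get(role, VOICE_BY_ROLE["narrator"])
    await capability_worker.text_to_speech(text, voice_id)
    return voice_id
```

Keeping the role-to-voice mapping in one dictionary means a voice can be swapped without touching the call sites.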
Advanced CapabilityWorker Functions
Audio Processing Functions
The CapabilityWorker provides comprehensive audio handling capabilities:
- play_audio: Play audio content directly or from file objects
- play_from_audio_file: Play audio files stored in the capability directory
- send_audio_data_in_stream: Stream processed audio data over WebSocket
Text Generation Functions
Multiple options for text generation:
- text_to_text_response: Standard text generation with history and system prompts
- generate_ttt_using_openrouter: Alternate text generation using OpenRouter
- llm_search: Web-search-backed short answer
- llm_tools: Tool-calling with the model
Streaming and Communication
Advanced communication features:
- stream_init and stream_end: Manage audio streaming sessions
- send_data_over_websocket: Send custom data over WebSocket
- send_interrupt_signal: Interrupt ongoing output and hand control back to user input
- send_agent_message_without_audio: Send a text reply without TTS
- send_devkit_action: Trigger a Devkit action
- send_devkit_mqtt_action: Trigger a Devkit MQTT action
Context and Session Helpers
- get_timezone: Read the current user’s timezone for local-time-aware behavior
- get_token: Read linked account access token for Google ("google"), Slack ("slack"), or Discord ("discord")
- get_api_keys: Read custom API key values from Settings → API Keys by key name
- get_full_message_history: Read full session message history for context-aware responses
- update_personality_agent_prompt: Append context/instructions to the Agent personality prompt
- create_key / update_key / delete_key: Manage structured key-value context storage
- get_single_key / get_all_keys: Read one or all stored context entries
Recording and Local Audio
- get_audio_recording: Load the latest user recording bytes
- get_audio_recording_length: Duration in seconds for the latest recording
- flush_audio_recording: Clear the current recording before a new capture
- play_from_audio_file: Play an audio file stored in the Ability directory
Session Task Utilities (replace raw asyncio usage)
To ensure Abilities run within the agent’s managed lifecycle, avoid using raw asyncio helpers directly.
- Use self.worker.session_tasks.sleep(seconds: float) instead of asyncio.sleep(...).
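A minimal sketch of the sleep replacement, assuming only the session_tasks.sleep(seconds) call named above. FakeSessionTasks, FakeWorker, and poll_until_ready are hypothetical names. The fake uses asyncio internally purely so the example runs outside OpenHome; in Ability code the waiting goes through worker.session_tasks.

```python
import asyncio


class FakeSessionTasks:
    """Hypothetical stand-in for worker.session_tasks; records requested delays."""

    def __init__(self):
        self.sleeps = []

    async def sleep(self, seconds: float) -> None:
        self.sleeps.append(seconds)
        await asyncio.sleep(0)  # stand-in only: yields without really waiting


class FakeWorker:
    """Hypothetical stand-in for the agent worker."""

    def __init__(self):
        self.session_tasks = FakeSessionTasks()


async def poll_until_ready(worker, is_ready, interval=0.5, max_attempts=5):
    """Check a condition repeatedly, pausing via session_tasks between checks.

    Returns the attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return attempt
        # Managed sleep rather than asyncio.sleep(...), as recommended above.
        await worker.session_tasks.sleep(interval)
    return None
```

Passing the worker in explicitly keeps the helper testable and makes it obvious which sleep implementation is in use.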
Background Daemon Entry Point (background.py)
Background daemons run automatically when a session starts. Use a separate background.py file with this entry signature:
- Keep daemon logic inside a continuous while True loop.
- Use await self.worker.session_tasks.sleep(...) between cycles.
- Do not call resume_normal_flow() inside daemon loops.
- Call await self.capability_worker.send_interrupt_signal() before daemon speech/audio.
Example 1: Basic Capability
This Ability creates a daily life advisor that:
- Asks the user for a problem: Initiates a conversation to gather user input.
- Provides advice: Offers a solution based on user input.
- Collects feedback: Captures user satisfaction with the advice.
Code
Key Functions
- speak: Introduces the advisor and provides the solution.
- user_response: Captures user input (e.g., their problem).
- run_io_loop: Combines speaking the solution and listening for feedback.
- resume_normal_flow: Resumes the agent’s default workflow after interaction.
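The flow described by these four functions can be sketched as follows. This is an illustration under assumptions, not the Ability's actual code: FakeCapabilityWorker, give_advice, and the advise callable are hypothetical, the coroutine signatures are assumed, and the real Ability would obtain its advice from a text generation call rather than an injected function.

```python
import asyncio


class FakeCapabilityWorker:
    """Hypothetical stand-in: records speech and replays scripted user replies."""

    def __init__(self, scripted_replies):
        self._replies = list(scripted_replies)
        self.spoken = []
        self.resumed = False

    async def speak(self, text: str) -> None:
        self.spoken.append(text)

    async def user_response(self) -> str:
        return self._replies.pop(0)

    async def run_io_loop(self, text: str) -> str:
        # Speak, then listen: the combination described in the list above.
        await self.speak(text)
        return await self.user_response()

    def resume_normal_flow(self) -> None:
        self.resumed = True


async def give_advice(capability_worker, advise):
    """Ask for a problem, offer a solution, collect feedback, then hand back control."""
    try:
        await capability_worker.speak("Hi, I'm your daily life advisor. What's on your mind?")
        problem = await capability_worker.user_response()
        solution = advise(problem)  # hypothetical hook for text generation
        feedback = await capability_worker.run_io_loop(f"{solution} Was that helpful?")
        return problem, feedback
    finally:
        # Always return control to the agent, even if a step above fails.
        capability_worker.resume_normal_flow()
```

Putting resume_normal_flow in a finally block is a design choice of this sketch: it keeps the agent from getting stuck inside the Ability when an earlier step raises.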
Example 2: Weather Capability
This Ability integrates a weather API to fetch and share weather updates based on user-provided locations.
Code
Key Features
- External API Call: Fetches real-time weather data.
- Geolocation: Validates and processes user-provided locations.
- Error Handling: Provides meaningful feedback for invalid inputs.
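The three features above can be sketched as a single function that validates the location, resolves it to coordinates, fetches a report, and turns failures into spoken feedback. Everything here is hypothetical: describe_weather, the geocode and fetch_weather callables, and the report fields temperature_c and summary are illustrative and do not come from a particular weather API. In an Ability, the two callables would wrap HTTP calls made with the requests module.

```python
def describe_weather(location, geocode, fetch_weather):
    """Build a short spoken sentence for a user-provided location.

    geocode(name) is expected to return (latitude, longitude) or None.
    fetch_weather(latitude, longitude) is expected to return a dict with
    'temperature_c' and 'summary'. Both are hypothetical callables.
    """
    place = (location or "").strip()
    if not place:
        return "I didn't catch a location. Which city should I check?"
    try:
        coords = geocode(place)
        if coords is None:
            return f"I couldn't find a place called {place}."
        latitude, longitude = coords
        report = fetch_weather(latitude, longitude)
    except OSError:
        # requests.RequestException derives from OSError, so network
        # failures raised by requests would be handled here as well.
        return "The weather service is not reachable right now. Please try again later."
    return (
        f"It is {report['temperature_c']:.0f} degrees Celsius "
        f"and {report['summary']} in {place}."
    )
```

Injecting the two lookups keeps the sentence-building logic testable without network access, and catching only OSError avoids swallowing unrelated exceptions.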
Allowed/Disallowed Libraries and Patterns
The following imports, keywords, and patterns are not allowed in Abilities. Use the safe alternatives.
Blocked Imports and Keywords
| Name | Why not allowed |
|---|---|
| redis | Direct datastore coupling and security concerns; not portable across deployments. |
| user_config | Raw config access can leak or mutate global state; use provided APIs on CapabilityWorker/worker. |
| print | Bypasses structured logging; noisy and untraceable in production; use editor_logging_handler. |
| open (raw) | Unmanaged filesystem access; security and portability risks; prefer approved helpers/per-user storage. |
- Avoid direct storage/infra access. Use platform-provided helpers within CapabilityWorker/worker or request an API if needed.
- Use the provided logging (editor_logging_handler) instead of prints.
- For files, prefer platform abstractions and per-user capability folders; ask for an approved helper if you need persistent storage.
Security Guidance
Avoid insecure or unsafe patterns such as runtime assert checks, exec() of dynamic code, binding servers to all interfaces, hardcoded secrets, swallowing exceptions, insecure deserialization (pickle/dill/shelve/marshal), weak hashes like MD5, or weak cipher modes (e.g., ECB). If you have a special case, request approval and an approved wrapper/utility.
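A few of these patterns have direct standard-library alternatives. The sketch below uses only Python's hashlib and json modules; fingerprint and load_settings are hypothetical helper names, and the sketch covers three of the listed items rather than the whole list.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Hash with SHA-256 instead of a weak hash such as MD5."""
    return hashlib.sha256(payload).hexdigest()


def load_settings(raw: str) -> dict:
    """Parse untrusted text with json instead of pickle or marshal."""
    data = json.loads(raw)
    # Explicit validation instead of assert, which is skipped under python -O.
    if not isinstance(data, dict):
        raise ValueError("settings must be a JSON object")
    return data
```

Raising a specific exception also lets callers handle the failure deliberately rather than swallowing it.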
Conclusion
Building Abilities in OpenHome empowers developers to create custom functionality for AI agents. The Basic Advisor and Weather Capability examples draw on the following building blocks:
- Core Communication: Use speak, run_io_loop, and user_response for basic interactions.
- Advanced Audio: Play custom audio files and stream audio data.
- Text Generation: Leverage multiple text-to-text options with history and system prompts.
- Voice Customization: Use specific voice IDs for varied and engaging responses.
- External APIs: Integrate third-party services for dynamic functionality.
CapabilityWorker provides all the tools needed to create sophisticated, interactive Abilities.
Start creating innovative Abilities and push the boundaries of voice AI with OpenHome! 🎉
Note: Use the requests module to call third-party APIs and avoid using other libraries. If you need another library for a special case, you can ask us to add it.

