
Voice-Activation Tutorial

Learn how to build a MentraOS App that:
  1. Listens for live speech transcriptions provided by the system.
  2. Detects a custom activation phrase (for example “computer”).
  3. Executes an action—in this guide we’ll simply display a text overlay.
Looking for a broader introduction? Start with the Quickstart guide. This page focuses specifically on the app code that handles transcriptions.

Prerequisites

  1. MentraOS SDK ≥ 0.13.0 installed in your project.
  2. A local development environment configured as described in Getting Started.
  3. MICROPHONE permission added to your App in the Developer Console so the transcription stream is available. See Permissions.

1 - Set up the Project

Create a new project—or reuse an existing one—and install the SDK:
mkdir voice-activation-app
cd voice-activation-app
bun init -y           # or npm init -y / pnpm init -y
bun add @mentra/sdk
bun add -d typescript tsx @types/node
Copy the basic project structure from the Quickstart if you haven’t already. We’ll focus on the contents of src/index.ts.

2 - Write the App Code

The full source code is shown first, followed by a step-by-step explanation.
src/index.ts
import { AppServer, AppSession } from "@mentra/sdk";

/**
 * A custom keyword that triggers our action once detected in speech
 */
const ACTIVATION_PHRASE = "computer";

/**
 * VoiceActivationServer – an App that listens for final transcriptions and
 * reacts when the user utters the ACTIVATION_PHRASE.
 */
class VoiceActivationServer extends AppServer {
  /**
   * onSession is called automatically whenever a user connects.
   *
   * @param session   – Connection-scoped helper APIs and event emitters
   * @param sessionId – Unique identifier for this connection
   * @param userId    – MentraOS user identifier
   */
  protected async onSession(
    session: AppSession,
    sessionId: string,
    userId: string,
  ): Promise<void> {
    session.logger.info(`🔊  Session ${sessionId} started for ${userId}`);

    // 1️⃣  Subscribe to speech transcriptions
    const unsubscribe = session.events.onTranscription((data) => {
      // 2️⃣  Ignore interim results – we only care about the final text
      if (!data.isFinal) return;

      // 3️⃣  Normalize casing & whitespace for a simple comparison
      const spokenText = data.text.toLowerCase().trim();
      session.logger.debug(`Heard: "${spokenText}"`);

      // 4️⃣  Check for the activation phrase
      if (spokenText.includes(ACTIVATION_PHRASE)) {
        session.logger.info("✨ Activation phrase detected!");

        // 5️⃣  Do something useful – here we show a text overlay
        session.layouts.showTextWall("👋 How can I help?");
      }
    });

    // 6️⃣  Clean up the listener when the session ends
    this.addCleanupHandler(unsubscribe);
  }
}

// Bootstrap the server using environment variables for configuration
new VoiceActivationServer({
  packageName: process.env.PACKAGE_NAME ?? "com.example.voiceactivation",
  apiKey: process.env.MENTRAOS_API_KEY!,
  port: Number(process.env.PORT ?? "3000"),
}).start();

What Does Each Part Do?

| # | Code | Purpose |
| --- | --- | --- |
| 1️⃣ | session.events.onTranscription | Subscribes to real-time speech data. The callback fires many times per utterance, for both interim and final chunks. |
| 2️⃣ | if (!data.isFinal) return; | Filters out interim chunks so we only process finalized text. |
| 3️⃣ | data.text.toLowerCase().trim() | Normalizes casing and whitespace to make keyword matching more reliable. |
| 4️⃣ | if (spokenText.includes(...)) | Simple substring check for the activation phrase. |
| 5️⃣ | session.layouts.showTextWall(...) | Shows a full-screen text overlay on the glasses. Replace with your own logic. |
| 6️⃣ | this.addCleanupHandler(unsubscribe) | Ensures the transcription listener is removed when the session disconnects, preventing memory leaks. |
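
The substring check in step 4️⃣ also fires on words that merely contain the phrase, such as "computers" or "supercomputer". The following is a minimal sketch of a stricter whole-word match; `containsPhrase` and `escapeRegExp` are hypothetical helpers written for this example and are not part of the MentraOS SDK.

```typescript
/** Escape characters that have special meaning in a regular expression. */
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/**
 * Hypothetical helper (not an SDK API): returns true when `phrase` occurs in
 * `transcript` as a whole word or word sequence, ignoring case.
 */
function containsPhrase(transcript: string, phrase: string): boolean {
  const normalized = phrase.trim().toLowerCase();
  if (normalized === "") return false;

  // Allow any run of whitespace between the words of a multi-word phrase.
  const pattern = normalized.split(/\s+/).map(escapeRegExp).join("\\s+");

  // The lookbehind and lookahead reject matches that are directly adjacent to
  // a letter or digit, so "computers" does not match "computer".
  const matcher = new RegExp(
    `(?<![\\p{L}\\p{N}])${pattern}(?![\\p{L}\\p{N}])`,
    "iu",
  );
  return matcher.test(transcript);
}
```

Inside the transcription callback, `containsPhrase(data.text, ACTIVATION_PHRASE)` could stand in for `spokenText.includes(ACTIVATION_PHRASE)`. Whether the stricter match is preferable depends on how the transcription service punctuates and splits words in practice.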

3 - Run the App

  1. Add the required environment variables in .env:
    PORT=3000
    PACKAGE_NAME=com.example.voiceactivation
    MENTRAOS_API_KEY=your_api_key_here
    
  2. Start the development server:
    bun --watch src/index.ts      # auto-reload on change
    # or build & run (requires build and start scripts in package.json)
    bun run build && bun run start
    
  3. Expose the port with ngrok (or your tunnel of choice) so MentraOS on your phone can reach it, then restart the App inside MentraOS.
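
The bootstrap code reads MENTRAOS_API_KEY with a non-null assertion (`!`), which only silences the type checker and does not catch a missing value at runtime. One possible way to fail fast is to validate the variables from step 1 before starting the server. This is a sketch only; `loadConfig` and `ServerConfig` are hypothetical names introduced here, not SDK exports.

```typescript
/** The three options passed to the server in this tutorial. */
interface ServerConfig {
  packageName: string;
  apiKey: string;
  port: number;
}

/**
 * Hypothetical helper: builds the server options from an environment-like
 * object and throws a descriptive error when a value is missing or invalid.
 */
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env.MENTRAOS_API_KEY?.trim();
  if (!apiKey) {
    throw new Error("MENTRAOS_API_KEY is not set; add it to .env");
  }

  // Same default as the bootstrap code in this tutorial.
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(
      `PORT must be an integer between 1 and 65535, got "${env.PORT}"`,
    );
  }

  return {
    packageName: env.PACKAGE_NAME ?? "com.example.voiceactivation",
    apiKey,
    port,
  };
}
```

With this in place, the bootstrap could read `new VoiceActivationServer(loadConfig(process.env)).start();`, since the returned object has the same three fields used earlier.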

Best Practices

  • Keep the activation phrase natural – Short, memorable words work best.
  • Provide user feedback – After detecting the phrase, give immediate visual or auditory confirmation.
  • Avoid hard-coding – Store configurable keywords in Settings so users can change them.
  • Review permissions – Request only the data your App genuinely needs. See Permissions.
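
As an illustration of the "avoid hard-coding" point, the sketch below turns a user-supplied, comma-separated setting value into a clean list of phrases, falling back to a default when the value is missing or empty. How the raw value is read from Settings is deliberately left out; `parseActivationPhrases` is a hypothetical helper, and the comma-separated format is an assumption made for this example.

```typescript
/**
 * Hypothetical helper: normalizes a raw setting value into a list of
 * lower-cased, de-duplicated activation phrases.
 *
 * @param raw      Value read from the App's settings (may be missing or malformed)
 * @param fallback Phrases to use when `raw` yields nothing usable
 */
function parseActivationPhrases(raw: unknown, fallback: string[]): string[] {
  if (typeof raw !== "string") return fallback;

  const phrases = raw
    .split(",")
    .map((phrase) => phrase.trim().toLowerCase())
    .filter((phrase) => phrase.length > 0);

  // Remove duplicates while keeping the order in which the user listed them.
  const unique = Array.from(new Set(phrases));
  return unique.length > 0 ? unique : fallback;
}
```

The transcription callback could then check each returned phrase instead of the single ACTIVATION_PHRASE constant.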

Next Steps

  • Explore more event types in the Events reference.
  • Combine voice activation with AI Tools to let users control your App via natural language.
  • Add context-aware responses by fetching user location or calendar data—just remember to declare the corresponding permissions.
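
When voice activation is combined with natural-language control, the words spoken after the activation phrase are usually what needs to be acted on. The sketch below pulls that trailing text out of a final transcription. `extractCommand` is a hypothetical helper, and passing its result on to any AI tooling is outside the scope of this example.

```typescript
/**
 * Hypothetical helper: returns the text that follows the first occurrence of
 * `phrase` in `transcript`, or null when the phrase is absent.
 * Leading separators and trailing sentence punctuation are stripped.
 */
function extractCommand(transcript: string, phrase: string): string | null {
  const trimmedPhrase = phrase.trim();
  if (trimmedPhrase === "") return null;

  // Escape regex metacharacters so the phrase is matched literally.
  const escaped = trimmedPhrase.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const match = new RegExp(escaped, "i").exec(transcript);
  if (match === null) return null;

  const rest = transcript.slice(match.index + match[0].length);
  return rest.replace(/^[\s,.!?:;-]+/, "").replace(/[\s.!?]+$/, "");
}
```

An empty string result means the user said only the activation phrase, which is a reasonable moment to show a prompt such as the text overlay used in this guide.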