Listen to user voice input with session.events.onTranscription() to get real-time speech-to-text transcription from the user's microphone.

Basic Usage

session.events.onTranscription((data) => {
  if (data.isFinal) {
    session.logger.info('User said:', data.text);
  }
});

How It Works

  1. User speaks into microphone
  2. Audio streams to MentraOS Cloud
  3. Speech recognition processes audio
  4. Transcription events sent to your app
  5. You receive interim and final results

Transcription Data

interface TranscriptionData {
  text: string;           // Transcribed text
  isFinal: boolean;       // True when transcription is complete
  language: string;       // Language code (e.g., 'en-US')
  confidence: number;     // Confidence score (0-1)
  timestamp: Date;        // When transcription was generated
}
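For logging or debugging, a hypothetical helper (the function name is ours, not part of the SDK) can format an event that mirrors the fields above:

```typescript
interface TranscriptionData {
  text: string;
  isFinal: boolean;
  language: string;
  confidence: number;
  timestamp: Date;
}

// Formats a transcription event as a single readable log line.
function describeTranscription(data: TranscriptionData): string {
  const kind = data.isFinal ? 'final' : 'interim';
  const pct = Math.round(data.confidence * 100);
  return `[${kind}] (${data.language}, ${pct}%) ${data.text}`;
}
```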

Interim vs Final Results

Interim results - Partial transcription while user is speaking:
session.events.onTranscription((data) => {
  if (!data.isFinal) {
    // Show real-time preview
    session.layouts.showTextWall(`${data.text}...`);
  }
});
Final results - Complete transcription when user finishes:
session.events.onTranscription(async (data) => {
  if (data.isFinal) {
    // Process complete command
    session.layouts.showTextWall(data.text);
    await this.handleCommand(data.text);
  }
});

Common Patterns

Voice Commands

session.events.onTranscription((data) => {
  if (!data.isFinal) return;

  const command = data.text.toLowerCase();

  if (command.includes('help')) {
    this.showHelp(session);
  } else if (command.includes('weather')) {
    this.showWeather(session);
  } else if (command.includes('time')) {
    this.showTime(session);
  } else {
    session.layouts.showTextWall('Unknown command');
  }
});
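The if/else chain above can be extracted into a reusable matcher (a sketch of our own, not an SDK utility) that returns the first keyword contained in the lowercased transcription, or null when nothing matches:

```typescript
// Returns the first keyword found in the transcribed text, or null.
function matchCommand(text: string, keywords: string[]): string | null {
  const command = text.toLowerCase();
  for (const keyword of keywords) {
    if (command.includes(keyword)) return keyword;
  }
  return null;
}
```

Inside the handler, `matchCommand(data.text, ['help', 'weather', 'time'])` then drives a single switch instead of a growing chain of conditionals.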
Voice Search

session.events.onTranscription(async (data) => {
  if (!data.isFinal) {
    // Show what user is saying
    session.layouts.showTextWall(`Searching: ${data.text}...`);
    return;
  }

  // Perform search with final text
  session.layouts.showTextWall('Searching...');
  const results = await this.search(data.text);
  session.layouts.showReferenceCard('Results', results);
});

Voice Notes

session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  // Save note
  const note = {
    text: data.text,
    timestamp: new Date(),
    confidence: data.confidence
  };

  await session.simpleStorage.set(
    `note_${Date.now()}`,
    JSON.stringify(note)
  );

  await session.audio.speak('Note saved');
});

Conversation

session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  // Show what user said
  session.layouts.showDoubleTextWall({
    topText: 'You:',
    bottomText: data.text
  });

  // Generate response
  const response = await this.generateResponse(data.text);

  // Show and speak response
  session.layouts.showDoubleTextWall({
    topText: 'App:',
    bottomText: response
  });

  await session.audio.speak(response);
});

Confidence Checking

session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  if (data.confidence < 0.5) {
    // Low confidence - ask for clarification
    session.layouts.showTextWall(`Did you say: ${data.text}?`);
    await session.audio.speak('Please repeat that');
  } else {
    // High confidence - process command
    this.processCommand(data.text);
  }
});
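A single threshold works, but you can also bucket the score into an action. The thresholds below are illustrative, not SDK-defined; tune them for your app:

```typescript
type ConfidenceAction = 'accept' | 'confirm' | 'reject';

// Buckets a confidence score into an action for your handler.
function confidenceAction(confidence: number): ConfidenceAction {
  if (confidence >= 0.8) return 'accept';   // process immediately
  if (confidence >= 0.5) return 'confirm';  // echo back: "Did you say ...?"
  return 'reject';                          // ask the user to repeat
}
```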

Language Support

Default language:
// Uses default language from user's device settings
session.events.onTranscription((data) => {
  session.logger.info('Language:', data.language);
  session.logger.info('Text:', data.text);
});
Multiple languages - Transcription automatically detects the spoken language based on device settings.
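Language codes arrive as BCP 47 tags such as 'en-US'. If you want to branch on the language independent of region, a small sketch (our own helper, not part of the SDK) can split the tag:

```typescript
// Splits a BCP 47 tag like 'en-US' into language and optional region.
function parseLanguageTag(tag: string): { language: string; region?: string } {
  const [language, region] = tag.split('-');
  return region ? { language, region } : { language };
}
```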

Best Practices

Only process commands on final results:
// ✅ Good
session.events.onTranscription((data) => {
  if (!data.isFinal) return;
  this.processCommand(data.text);
});

// ❌ Avoid - processes interim results
session.events.onTranscription((data) => {
  this.processCommand(data.text); // Called too often
});
Display what the user said:
session.events.onTranscription((data) => {
  if (data.isFinal) {
    session.layouts.showTextWall(`You: ${data.text}`);
  }
});
Acknowledge user input:
session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  await session.audio.speak('Got it');
  await this.processCommand(data.text);
});
Handle errors gracefully - the user might say something unexpected:
session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  try {
    await this.processCommand(data.text);
  } catch (error) {
    session.logger.error('Command failed:', error);
    await session.audio.speak('Sorry, I could not do that');
  }
});

Permissions Required

Transcription requires the MICROPHONE permission. Set this in the Developer Console.
// In Developer Console, add permission:
{
  "type": "MICROPHONE",
  "description": "To listen to your voice commands"
}

Unsubscribing

// Store unsubscribe function
const unsubscribe = session.events.onTranscription((data) => {
  session.logger.info(data.text);
});

// Later, stop listening
unsubscribe();
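If you register several handlers, a hypothetical helper (not an SDK function) can bundle their unsubscribe functions so session cleanup is a single call:

```typescript
// Combines several unsubscribe functions into one cleanup function.
function combineUnsubscribers(...fns: Array<() => void>): () => void {
  return () => {
    for (const fn of fns) fn();
  };
}
```

Usage: `const cleanup = combineUnsubscribers(unsubTranscription, unsubButton);` then call `cleanup()` when the session ends.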

Example: Voice Assistant

class VoiceAssistant extends AppServer {
  protected async onSession(session: AppSession, sessionId: string, userId: string) {
    session.layouts.showTextWall('Voice Assistant Ready\nSay "help" for commands');

    session.events.onTranscription(async (data) => {
      if (!data.isFinal) {
        // Show interim results
        session.layouts.showTextWall(`Listening: ${data.text}...`);
        return;
      }

      // Process final command
      const command = data.text.toLowerCase().trim();

      session.logger.info('Command:', command, 'Confidence:', data.confidence);

      if (command.includes('help')) {
        await this.showHelp(session);
      } else if (command.includes('weather')) {
        await this.showWeather(session);
      } else if (command.includes('time')) {
        await this.showTime(session);
      } else if (command.includes('reminder')) {
        await this.setReminder(session, command);
      } else {
        session.layouts.showTextWall(`Unknown: ${data.text}`);
        await session.audio.speak('I did not understand that. Say help for commands.');
      }
    });
  }

  private async showHelp(session: AppSession) {
    const helpText = 'Commands:\n- Weather\n- Time\n- Set reminder';
    session.layouts.showReferenceCard('Help', helpText);
    await session.audio.speak('You can ask about weather, time, or set a reminder');
  }

  private async showWeather(session: AppSession) {
    // Fetch weather data
    const weather = await this.fetchWeather();
    session.layouts.showReferenceCard('Weather', `${weather.condition}, ${weather.temp}°F`);
    await session.audio.speak(`The weather is ${weather.condition} and ${weather.temp} degrees`);
  }

  private async showTime(session: AppSession) {
    const time = new Date().toLocaleTimeString();
    session.layouts.showTextWall(`Time: ${time}`);
    await session.audio.speak(`The time is ${time}`);
  }

  private async setReminder(session: AppSession, command: string) {
    // Parse reminder from command
    // "set reminder to call mom at 3pm"
    session.layouts.showTextWall('Reminder set!');
    await session.audio.speak('Reminder set');
  }
}

Troubleshooting

No transcription received - check permission:
  • Ensure MICROPHONE permission is set in Developer Console
  • User must approve permission when installing app
  • Check logs for permission errors
Poor accuracy - possible causes:
  • Background noise
  • User speaking too quietly
  • Microphone quality
  • Non-standard accent or pronunciation
Check data.confidence to detect low-quality transcriptions.
Slow transcription - network latency:
  • Transcription requires internet connection
  • Processing happens in cloud
  • Some delay is normal (typically < 1 second)
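To get a rough sense of that delay, you can compare the event's timestamp to the current time. This assumes data.timestamp (from TranscriptionData above) marks when the transcription was generated; clock skew between device and cloud can distort the number:

```typescript
// Rough round-trip estimate: time elapsed since the transcription was generated.
function transcriptionLatencyMs(timestamp: Date, now: Date = new Date()): number {
  return now.getTime() - timestamp.getTime();
}
```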

Performance Tips

Avoid processing every interim result:
// Processes only when complete
if (data.isFinal) {
  await heavyProcessing(data.text);
}
If you must react to interim results, debounce the updates:
let timeoutId: NodeJS.Timeout;

session.events.onTranscription((data) => {
  clearTimeout(timeoutId);

  timeoutId = setTimeout(() => {
    this.updateSearch(data.text);
  }, 300); // Wait 300ms after user stops speaking
});
Cache responses for frequently used commands:
const responses = new Map<string, string>();

// Inside your final-result handler:
if (responses.has(command)) {
  await session.audio.speak(responses.get(command)!);
} else {
  const response = await this.generateResponse(command);
  responses.set(command, response);
  await session.audio.speak(response);
}

Next Steps