Convert text to natural speech on smart glasses using session.audio.speak(). Powered by ElevenLabs TTS for high-quality voice synthesis.

Basic Usage

// Simple TTS
await session.audio.speak('Hello from your app!');

How It Works

  1. Your app calls session.audio.speak(text)
  2. MentraOS Cloud generates audio using ElevenLabs TTS
  3. Audio streams to the user’s device
  4. Audio plays through the glasses' speakers, or through the phone if the glasses have none

Voice Customization

Custom Voice

await session.audio.speak('Welcome back!', {
  voice_id: 'your_voice_id'
});

Voice Settings

await session.audio.speak('Custom voice settings', {
  voice_id: 'adam',
  voice_settings: {
    stability: 0.5,      // 0-1: Lower = more expressive
    similarity_boost: 0.75, // 0-1: Voice similarity
    style: 0.5,          // 0-1: Style exaggeration
    speed: 1.2           // 0.5-2.0: Playback speed
  }
});

Volume Control

await session.audio.speak('Quiet message', {
  volume: 0.5  // 0.0 - 1.0
});

TTS Options

| Option | Type | Default | Description |
|---|---|---|---|
| voice_id | string | Server default | ElevenLabs voice ID |
| model_id | string | eleven_flash_v2_5 | TTS model to use |
| voice_settings | object | - | Voice customization |
| volume | number | 1.0 | Volume level (0.0-1.0) |

Voice Settings Options

| Setting | Type | Range | Description |
|---|---|---|---|
| stability | number | 0-1 | Lower = more expressive |
| similarity_boost | number | 0-1 | Voice similarity to original |
| style | number | 0-1 | Style exaggeration |
| speed | number | 0.5-2.0 | Playback speed |
| use_speaker_boost | boolean | - | Enhance speaker clarity |
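
The documented ranges can be enforced on the app side before calling speak(). The following is a minimal sketch, not part of the SDK: clampVoiceSettings and the VoiceSettings interface are hypothetical names, and the ranges are copied from the table above. How the server treats out-of-range values is not stated here, so clamping locally is only a defensive assumption.

```typescript
// Hypothetical helper, not an SDK API. Field names and ranges mirror the
// Voice Settings Options table.
interface VoiceSettings {
  stability?: number;
  similarity_boost?: number;
  style?: number;
  speed?: number;
  use_speaker_boost?: boolean;
}

// Allowed [min, max] for each numeric setting.
const RANGES: Record<'stability' | 'similarity_boost' | 'style' | 'speed', [number, number]> = {
  stability: [0, 1],
  similarity_boost: [0, 1],
  style: [0, 1],
  speed: [0.5, 2.0],
};

// Returns a copy with every numeric setting that is present pulled into range.
function clampVoiceSettings(settings: VoiceSettings): VoiceSettings {
  const result: VoiceSettings = { ...settings };
  for (const key of Object.keys(RANGES) as Array<keyof typeof RANGES>) {
    const value = result[key];
    if (typeof value === 'number') {
      const [min, max] = RANGES[key];
      result[key] = Math.min(max, Math.max(min, value));
    }
  }
  return result;
}
```

The clamped object can then be passed as voice_settings, leaving settings that were not supplied untouched.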

Common Patterns

Voice Confirmation

session.events.onTranscription(async (data) => {
  if (data.isFinal) {
    // Acknowledge user input
    await session.audio.speak('Got it!');

    // Process command
    const result = await processCommand(data.text);

    // Speak result
    await session.audio.speak(result);
  }
});

Notifications

session.events.onPhoneNotification(async (data) => {
  if (data.app === 'Messages') {
    await session.audio.speak(`New message from ${data.title}`);
  }
});

Multi-Step Instructions

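// Small helper: resolves after the given number of milliseconds
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function guideUser(session: AppSession) {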
  await session.audio.speak('Step 1: Look at the target');
  await sleep(2000);

  await session.audio.speak('Step 2: Press the button');
  await sleep(2000);

  await session.audio.speak('Step 3: Wait for confirmation');
}

Contextual Responses

session.events.onTranscription(async (data) => {
  if (!data.isFinal) return;

  const text = data.text.toLowerCase();

  if (text.includes('weather')) {
    await session.audio.speak('The weather is sunny and 72 degrees');
  } else if (text.includes('time')) {
    const time = new Date().toLocaleTimeString();
    await session.audio.speak(`The time is ${time}`);
  } else if (text.includes('help')) {
    await session.audio.speak('You can ask about weather, time, or directions');
  }
});
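
As the number of keywords grows, the if/else chain above can be replaced with a lookup table. This is a sketch only: Intent, intents, and pickResponse are hypothetical names introduced for illustration, and the responses are the same placeholder strings used above.

```typescript
// Hypothetical table-driven variant of the keyword matching above.
interface Intent {
  keyword: string;
  respond: () => string;
}

// Order matters: the first keyword found in the transcript wins.
const intents: Intent[] = [
  { keyword: 'weather', respond: () => 'The weather is sunny and 72 degrees' },
  { keyword: 'time', respond: () => `The time is ${new Date().toLocaleTimeString()}` },
  { keyword: 'help', respond: () => 'You can ask about weather, time, or directions' },
];

// Returns the response text, or undefined when no keyword matches.
function pickResponse(transcript: string): string | undefined {
  const text = transcript.toLowerCase();
  const match = intents.find((intent) => text.includes(intent.keyword));
  return match?.respond();
}
```

Inside the transcription handler, the result would be spoken only when it is defined. Note that substring matching is loose ("sometimes" contains "time"), so a real app may want word-boundary matching instead.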

Error Handling

try {
  const result = await session.audio.speak('Hello!');

  if (result.success) {
    session.logger.info('TTS played successfully');
  } else {
    session.logger.error('TTS failed:', result.error);
  }
} catch (error) {
  session.logger.error('TTS error:', error);
  // Fallback to text display
  session.layouts.showTextWall('Hello!');
}

Best Practices

Short messages are easier to understand:
// ✅ Good
await session.audio.speak('Message sent');

// ❌ Avoid - too long
await session.audio.speak(
  'Your message has been successfully sent to the recipient and they will receive it shortly'
);
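
When longer text cannot be avoided (for example, content generated at runtime), one option is to break it into sentences and speak them one at a time. The helper below is a rough sketch under that assumption; splitIntoSentences is a hypothetical name, not an SDK function.

```typescript
// Hypothetical helper: splits text on sentence-ending punctuation (. ! ?).
function splitIntoSentences(text: string): string[] {
  // Each match is a run of non-terminator characters plus any trailing terminators.
  const matches = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches.map((part) => part.trim()).filter((part) => part.length > 0);
}
```

Each returned sentence can be passed to speak() in turn with await. The split is naive: decimals such as "72.5" and abbreviations are also treated as sentence breaks.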
Don’t overlap audio:
// ✅ Good - sequential
await session.audio.speak('First message');
await session.audio.speak('Second message');

// ❌ Avoid - messages overlap
session.audio.speak('First message');  // No await
session.audio.speak('Second message'); // Starts before the first has finished
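
When speak() is called from several independent handlers, awaiting at each call site does not stop two handlers from talking over each other. A small promise-chain queue is one way to serialize them. This is a sketch, not an SDK feature: createSpeechQueue and SpeakFn are hypothetical names, and it assumes the promise returned by the wrapped function settles only once playback has finished.

```typescript
// Hypothetical wrapper that runs speak calls strictly one after another.
type SpeakFn = (text: string) => Promise<unknown>;

function createSpeechQueue(speak: SpeakFn) {
  // Tail of the chain: the next utterance starts once this settles.
  let tail: Promise<unknown> = Promise.resolve();

  return function enqueue(text: string): Promise<unknown> {
    const next = tail.then(() => speak(text));
    // Swallow failures on the chain so one failed utterance does not block later ones.
    tail = next.catch(() => undefined);
    // The caller still sees the real result or error of its own utterance.
    return next;
  };
}
```

In an app this could be created once per session, for example as createSpeechQueue((text) => session.audio.speak(text)), and the returned function used everywhere instead of calling speak() directly.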
Combine audio with visual cues:
// ✅ Good
session.layouts.showTextWall('Processing...');
await session.audio.speak('Processing your request');

// User gets both visual and audio feedback
Not all devices have displays:
if (session.capabilities?.hasDisplay) {
  session.layouts.showTextWall('Hello!');
}

// Always provide audio for accessibility
await session.audio.speak('Hello!');

Voice Selection

Finding Voice IDs:
  1. Visit ElevenLabs Voice Library
  2. Choose a voice
  3. Copy the voice ID
  4. Pass it as voice_id in your app
// Popular voices
await session.audio.speak('Professional voice', {
  voice_id: 'adam'  // Male, professional
});

await session.audio.speak('Friendly voice', {
  voice_id: 'rachel'  // Female, friendly
});
To set a default voice, define the ELEVENLABS_DEFAULT_VOICE_ID environment variable.

Troubleshooting

Audio Not Playing

Check device capabilities:
if (!session.capabilities) {
  session.logger.warn('Capabilities not loaded');
  return;
}

// Note: Audio should work on all devices
// It routes through phone if glasses don't have speakers

TTS Timing Out

TTS requests time out after 60 seconds:
try {
  await session.audio.speak('Your message');
} catch (error) {
  // error may be an Error object or a string, so convert before matching
  if (String(error).includes('timeout')) {
    session.logger.error('TTS timed out');
    // Retry or fallback to text
  }
}
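
The "retry or fallback" comment above could be implemented along these lines. This is a sketch under assumptions: speakWithRetry and SpeakResult are hypothetical names, the speak function is injected so the helper does not depend on the SDK, and both a thrown error and a result with success: false are treated as a failed attempt.

```typescript
// Hypothetical retry helper, not an SDK API.
interface SpeakResult {
  success: boolean;
  error?: string;
}

async function speakWithRetry(
  speak: (text: string) => Promise<SpeakResult>,
  text: string,
  maxAttempts = 2,
  onGiveUp: (text: string) => void = () => {},
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = await speak(text);
      if (result.success) return true;
    } catch {
      // A thrown error (for example a timeout) counts as a failed attempt.
    }
  }
  // All attempts failed: let the caller fall back, e.g. to a text layout.
  onGiveUp(text);
  return false;
}
```

A call might look like speakWithRetry((t) => session.audio.speak(t), 'Your message', 2, (t) => session.layouts.showTextWall(t)). Because each attempt can take up to the full timeout, keeping maxAttempts small avoids long silent waits.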

Voice Quality Issues

Adjust voice settings:
// For clearer speech
await session.audio.speak('Clear message', {
  voice_settings: {
    stability: 0.7,       // More stable
    similarity_boost: 0.5, // Less similarity boost
    speed: 0.9            // Slightly slower
  }
});

Performance Tips

Pre-generate audio for frequently used phrases:
// Consider using pre-recorded audio files
// for very common phrases
await session.audio.playAudio({
  audioUrl: 'https://your-cdn.com/hello.mp3'
});
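
One way to organize this is a lookup that decides, per phrase, whether to play a pre-recorded file or fall back to TTS. The sketch below is illustrative only: AudioPlan, pregenerated, and planAudio are hypothetical names, and the URL is the same placeholder used above.

```typescript
// Hypothetical lookup: choose a pre-recorded file when one exists, else TTS.
type AudioPlan =
  | { kind: 'url'; audioUrl: string }
  | { kind: 'tts'; text: string };

// Placeholder mapping from normalized phrase to hosted audio file.
const pregenerated = new Map<string, string>([
  ['hello', 'https://your-cdn.com/hello.mp3'],
]);

function planAudio(text: string): AudioPlan {
  // Normalize so that "Hello" and " hello " hit the same entry.
  const key = text.trim().toLowerCase();
  const audioUrl = pregenerated.get(key);
  return audioUrl ? { kind: 'url', audioUrl } : { kind: 'tts', text };
}
```

The caller would then branch on kind: pass audioUrl to session.audio.playAudio() for 'url', or pass text to session.audio.speak() for 'tts'.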
Raise the speed for short confirmations so they finish sooner:
// For quick confirmations
await session.audio.speak('Done', {
  voice_settings: { speed: 1.3 }
});

Return Value

The speak() method returns a promise that resolves with:
interface AudioPlayResult {
  success: boolean;
  error?: string;
  duration?: number;  // Audio duration in seconds
}
Example:
const result = await session.audio.speak('Hello!');

if (result.success) {
  session.logger.info(`Played ${result.duration}s of audio`);
} else {
  session.logger.error('Failed:', result.error);
}
