# Realtime voice assistant

Stream audio in, synthesize audio out, and handle turn taking.

- Date: Mar 27, 2025
- Reading time: 16 min
- Level: Intermediate
- Tags: Audio, Realtime, Streaming

## Takeaways
- Stream audio in small, consistent chunks.
- Maintain session state so interruptions can be handled.
- Configure speech synthesis for low latency.

## Audio ingestion

Use WebSocket streams for audio input, and keep buffers small: a buffer cannot be sent until it fills, so each one adds its own duration to end-to-end latency.
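
As a rough illustration of small, consistent chunks, the sketch below re-slices incoming audio of arbitrary size into fixed-size frames before they are sent. The audio format (16 kHz, 16-bit mono PCM), the 20 ms frame duration, and the names `frame_size_bytes` and `iter_frames` are assumptions made for this example, not values prescribed by this guide.

```python
from typing import Iterable, Iterator

# Assumed audio format for this sketch; adjust to match your capture device.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM
FRAME_MS = 20


def frame_size_bytes(
    sample_rate_hz: int = SAMPLE_RATE_HZ,
    bytes_per_sample: int = BYTES_PER_SAMPLE,
    frame_ms: int = FRAME_MS,
) -> int:
    """Number of bytes in one frame of the given duration."""
    return sample_rate_hz * bytes_per_sample * frame_ms // 1000


def iter_frames(chunks: Iterable[bytes], frame_bytes: int) -> Iterator[bytes]:
    """Re-slice arbitrarily sized chunks into frames of exactly frame_bytes."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every complete frame currently held in the buffer.
        while len(buffer) >= frame_bytes:
            yield bytes(buffer[:frame_bytes])
            del buffer[:frame_bytes]
    if buffer:
        # Pad the final partial frame with silence (zero bytes).
        yield bytes(buffer) + b"\x00" * (frame_bytes - len(buffer))


if __name__ == "__main__":
    size = frame_size_bytes()  # 16000 * 2 * 20 // 1000 = 640 bytes
    for frame in iter_frames([b"\x01" * 1000, b"\x02" * 500], size):
        print(len(frame))
```

Each frame would then go out as one binary WebSocket message through whichever client library you use. Fixed-size frames make the per-message latency contribution predictable: with the assumed settings, each message carries exactly 20 ms of audio.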

## Turn taking

Detect interruptions (the user starting to speak while the assistant is talking) and decide whether to cancel speech output or resume it.
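
One possible way to structure the cancel-or-resume decision is a small state machine fed by per-frame voice activity. The sketch below is illustrative only: the `Session` class, the `Turn` and `Action` enums, and the 200 ms barge-in threshold are hypothetical choices, and the `user_is_speaking` flag is assumed to come from a voice activity detector that is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class Turn(Enum):
    LISTENING = "listening"  # assistant is silent, user has the floor
    SPEAKING = "speaking"    # assistant audio is playing
    PAUSED = "paused"        # playback held while we check for an interruption


class Action(Enum):
    NONE = "none"
    PAUSE = "pause"
    RESUME = "resume"
    CANCEL = "cancel"


@dataclass
class Session:
    barge_in_ms: int = 200  # assumed threshold; tune for your UX
    turn: Turn = Turn.LISTENING
    voiced_ms: int = 0

    def start_speaking(self) -> None:
        """Call when the assistant begins playing synthesized audio."""
        self.turn = Turn.SPEAKING
        self.voiced_ms = 0

    def on_frame(self, user_is_speaking: bool, frame_ms: int = 20) -> Action:
        """Update state for one input frame and return what playback should do."""
        if self.turn is Turn.LISTENING:
            return Action.NONE
        if not user_is_speaking:
            self.voiced_ms = 0
            if self.turn is Turn.PAUSED:
                # The sound was too short to count as an interruption.
                self.turn = Turn.SPEAKING
                return Action.RESUME
            return Action.NONE
        self.voiced_ms += frame_ms
        if self.voiced_ms >= self.barge_in_ms:
            # Sustained user speech: give up the turn.
            self.turn = Turn.LISTENING
            self.voiced_ms = 0
            return Action.CANCEL
        if self.turn is Turn.SPEAKING:
            self.turn = Turn.PAUSED
            return Action.PAUSE
        return Action.NONE
```

The design choice here is to pause on the first voiced frame and only cancel once speech has persisted past the threshold, so a cough or background noise leads to a resume rather than a lost response. Whether pausing is preferable to simply continuing playback until the threshold is reached depends on your product.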

## Latency budget

Measure every hop in the pipeline and keep total latency under your UX target.
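
A minimal sketch of per-hop measurement follows, using only the Python standard library. The `LatencyBudget` class, the hop names, and the 800 ms target in the usage example are hypothetical; substitute the stages and target of your own pipeline.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class LatencyBudget:
    """Accumulates per-hop timings and compares the total to a target."""

    def __init__(self, target_ms: float) -> None:
        self.target_ms = target_ms
        self.hops_ms: Dict[str, float] = {}

    def record(self, name: str, elapsed_ms: float) -> None:
        """Add a measured duration (in milliseconds) to the named hop."""
        self.hops_ms[name] = self.hops_ms.get(name, 0.0) + elapsed_ms

    @contextmanager
    def hop(self, name: str) -> Iterator[None]:
        """Time the enclosed block and record it under the given hop name."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(name, (time.perf_counter() - start) * 1000.0)

    def total_ms(self) -> float:
        return sum(self.hops_ms.values())

    def within_budget(self) -> bool:
        return self.total_ms() <= self.target_ms

    def slowest_hop(self) -> str:
        """Name of the hop with the largest accumulated time."""
        return max(self.hops_ms, key=lambda name: self.hops_ms[name])


if __name__ == "__main__":
    budget = LatencyBudget(target_ms=800)  # assumed UX target
    with budget.hop("speech_to_text"):
        time.sleep(0.01)  # stand-in for real work
    with budget.hop("synthesis"):
        time.sleep(0.02)  # stand-in for real work
    print(budget.hops_ms, budget.within_budget(), budget.slowest_hop())
```

Timing each hop separately, rather than only the end-to-end total, shows which stage to optimize first when the total exceeds the target.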