Hey HN! I built Para-speak to fit my workflow - I wanted something simple: press shortcut → talk → press shortcut → text appears at cursor.

Existing tooling wasn't sufficient for this, and NVIDIA's Parakeet model was showing excellent results online at the same time, so I specifically wanted to try it.

Para-speak is a Rust CLI that uses NVIDIA's Parakeet TDT model running locally via MLX on Apple Silicon.

Demo & details: https://elvin.engineering/blog/2025/09-10-para-speak-cli/

GitHub: https://github.com/elv1n/para-speak

Key features:

  - Configurable global shortcuts (double-tap, combinations, sequences)

  - Audio feedback

  - Text insertion at cursor position (emulating cmd+v)

  - Extensible controller system (e.g., manipulate Spotify volume during recording)

  - Minimal resource use when idle: about 10-15 MB of RAM on average on my MacBook Pro M1
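
For anyone curious how a double-tap shortcut can be detected, here is a minimal sketch in Rust using only the standard library. It is not taken from the Para-speak source: `DoubleTap`, `on_tap`, and the idea of a fixed time window are assumptions made for illustration, and the real implementation may work differently.

```rust
use std::time::{Duration, Instant};

/// Hypothetical double-tap detector (illustrative only, not Para-speak's code).
pub struct DoubleTap {
    /// Maximum gap between two taps for them to count as a double-tap.
    window: Duration,
    /// Time of the previous unpaired tap, if any.
    last_tap: Option<Instant>,
}

impl DoubleTap {
    pub fn new(window: Duration) -> Self {
        Self { window, last_tap: None }
    }

    /// Record a key press at `now`. Returns true when this press
    /// completes a double-tap, false otherwise.
    pub fn on_tap(&mut self, now: Instant) -> bool {
        match self.last_tap {
            // Second tap arrived inside the window: fire and reset,
            // so a third quick tap starts a new pair.
            Some(prev) if now.saturating_duration_since(prev) <= self.window => {
                self.last_tap = None;
                true
            }
            // First tap, or the previous one was too long ago.
            _ => {
                self.last_tap = Some(now);
                false
            }
        }
    }
}
```

Passing the timestamp in, instead of calling `Instant::now()` inside `on_tap`, keeps the logic easy to test without sleeping. In a tool like this, a `true` result would presumably toggle recording on or off.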

Current limitations:

  - macOS only for now

  - Shortcuts pass through to other apps (not consumed)

I've been using it daily for AI-assisted coding. The accuracy is surprisingly good for a small model.

It's still at a very early stage. It would be nice to have a daemon mode, proper installation (Homebrew, etc.), and some UI feedback.

Would love your thoughts!
