SayToType Documentation
Complete guide to using SayToType for voice-to-text conversion
Getting Started
Download and Installation
Download the SayToType desktop application for Windows or macOS from our download page.
The app supports:
- Windows: Windows 10 or newer
- macOS: macOS 11 or newer (both Intel and Apple Silicon)
First Launch and Onboarding
When you first launch SayToType, you'll go through a guided 5-step onboarding process:
Step 1: Language Selection
- Choose your preferred interface language
- This sets the language for the app's menus and interface
- Your initial transcription language will also be set based on this choice
Step 2: Account Sign-In
- Sign in with your SayToType account using OAuth authentication
- This connects your license and enables cloud transcription features
Step 3: Permissions Setup (macOS only)
- Microphone Permission: Required for audio recording
- Accessibility Permission: Required for global hotkeys and text insertion
- Windows users can skip this step as permissions are handled automatically
Step 4: Setup Configuration
- Audio Device: Select your preferred microphone from the dropdown
- Hotkey Selection: Choose from platform-specific presets (see Hotkeys section)
- Language Preferences: Confirm or adjust your transcription languages
Step 5: Try It Out
- Test your setup with a live recording
- Press your selected hotkey to start recording
- Speak a few words to test the transcription
- Verify that audio is being captured and transcribed correctly
Platform-Specific Setup
macOS Setup
- Grant microphone and accessibility permissions when prompted
- System Preferences (System Settings on macOS 13 and later) may open automatically; follow the prompts
- Default hotkey: Fn+Control
- Apple Silicon Macs can use local transcription for offline/private processing
Windows Setup
- No special permissions required—the app works out of the box
- Default hotkey: Ctrl+` (backtick key)
- Local transcription is not currently available on Windows
Note: After onboarding, the app will run in your system tray. Look for the microphone icon to access settings and start recording.
Basic Usage
Recording Your Voice
There are two ways to start recording:
- Keyboard Hotkey: Press your configured hotkey (default: Fn+Control on macOS, Ctrl+` on Windows)
- Tray Menu: Right-click the tray icon → "Start Recording"
During Recording
- The recording window will appear with real-time audio visualization
- Speak clearly into your microphone
- Watch the audio waveform to confirm your voice is being captured
- Press the ESC key to cancel the recording at any time
- Press your hotkey again to stop recording and process the transcription
After Recording
- The transcribed and processed text is automatically pasted at your cursor position
- The recording window behavior depends on your settings:
  - Auto-detect (default): Closes if text was pasted, stays open if not
  - Always close: Window closes immediately after processing
  - Always open: Window stays open for review
Understanding Transcription Modes
SayToType supports three types of transcription:
1. Cloud Transcription
- Default transcription method using SayToType's cloud service
- Supports all languages and features
- Requires internet connection
- Available on all platforms (Windows, macOS)
- Included in free tier
2. Local Transcription (macOS, Apple Silicon only)
- On-device processing using Whisper models
- Complete privacy—no data sent to cloud
- Works offline without internet connection (after models are downloaded)
- 6 model options with different speed/accuracy trade-offs
  - Free models: Standard, Standard English
  - Premium models: Pro, Pro Light, Ultra, Ultra Light
- Models must be downloaded before use
- Requires macOS with Apple Silicon (M1 or newer)
3. Custom Provider (Premium)
- Use your own API keys with supported providers
- Supported providers:
  - OpenAI: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe
  - Mistral: voxtral-small-latest, voxtral-mini-latest
  - Deepgram: nova-3, nova-2
  - Custom: Any OpenAI-compatible API endpoint
- You control costs and data privacy
- Requires premium subscription
Mode Selection and Switching
- Tray Menu: Right-click tray icon → "Select Mode" to choose from your saved modes
- Recording Window: Click the mode selector in the left corner to switch modes
- The currently active mode is highlighted
- You can switch modes between recordings, but must finish or cancel the current recording first
Hotkeys and Shortcuts
SayToType uses customizable hotkeys to trigger recordings. The available hotkeys differ between macOS and Windows.
macOS Hotkeys
Choose from one of the following preset hotkey combinations:
| Hotkey | Description |
|---|---|
| Fn+Control | Default hotkey (most common choice) |
| Control+Option | Alternative modifier combination |
| Control+Escape | Alternative using ESC key |
| Fn+Shift | Uses function key with Shift |
| Fn+Escape | Uses function key with ESC |
Windows Hotkeys
Choose from one of the following preset hotkey combinations:
| Hotkey | Description |
|---|---|
| Ctrl+` | Default hotkey (backtick key, left of 1) |
| Shift+` | Alternative using Shift |
| Alt+Z | Alternative using Alt key |
| Alt+X | Alternative using Alt key |
| Win+Alt+Space | Three-key combination |
Universal Shortcuts
| Shortcut | Action | Platform |
|---|---|---|
| ESC | Cancel current recording | All platforms |
| Your Hotkey | Start/stop recording | All platforms |
Configuring Your Hotkey
- Open Settings from the tray menu
- Navigate to the "General" or "Configuration" tab
- Find the "Hotkey" section with visual keyboard configurator
- Select your preferred hotkey from the available presets
- Test the hotkey to ensure it works correctly
Tip: Choose a hotkey that doesn't conflict with other applications you use frequently. The default options are designed to minimize conflicts.
Creating and Managing Modes
Understanding Modes
Modes are customizable presets that define how your voice recordings are transcribed and processed. Each mode can specify:
- Transcription method (cloud, local, or custom provider)
- Input and output languages
- Custom instructions and formatting
- Provider and model selection (for custom modes)
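As a rough illustration of the fields listed above, a mode can be pictured as a small record. The sketch below is hypothetical: the `Mode` class, its field names, and `validate_mode` are illustrative assumptions, not SayToType's actual storage format or API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical model of a mode; names and values are illustrative only.
VALID_TYPES = ("cloud", "local", "custom")


@dataclass
class Mode:
    name: str
    transcription_type: str                # "cloud", "local", or "custom"
    input_language: Optional[str] = None
    output_language: Optional[str] = None  # None means no translation
    custom_prompt: str = ""
    provider: Optional[str] = None         # custom provider modes only
    model: Optional[str] = None            # local and custom provider modes


def validate_mode(mode: Mode) -> List[str]:
    """Return a list of problems; an empty list means the mode looks complete."""
    problems = []
    if not mode.name.strip():
        problems.append("name is required")
    if mode.transcription_type not in VALID_TYPES:
        problems.append("unknown transcription type")
    if mode.transcription_type == "local" and not mode.model:
        problems.append("local modes need a model")
    if mode.transcription_type == "custom" and not (mode.provider and mode.model):
        problems.append("custom provider modes need a provider and a model")
    return problems


email = Mode(
    name="Email Draft",
    transcription_type="cloud",
    input_language="English",
    custom_prompt="Format as professional email with proper greeting, body, and closing",
)
print(validate_mode(email))  # prints []
```

The validation rules mirror the guide: local modes need a downloaded model selected, and custom provider modes need both a provider and a model.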
Accessing Mode Settings
- Right-click the tray icon → "Settings"
- Navigate to the "Modes" tab
- Click "Add Mode" to create a new mode, or click an existing mode to edit it
Cloud Modes
Cloud modes use SayToType's transcription service and are available on all platforms.
Configuration Options
- Name: Give your mode a descriptive name (e.g., "Email Draft", "Quick Notes")
- Transcription Type: Select "Cloud"
- Input Language: The language you'll be speaking
  Tip: Setting the correct input language significantly improves accuracy
- Output Language: Choose a target language for automatic translation (optional)
  Tip: Leave as "Auto" or same as input for no translation
- Custom Prompt: Add specific instructions for formatting or processing
  Examples: "Format as bullet points", "Write in formal tone", "Summarize key points"
Local Modes (macOS, Apple Silicon only)
Local modes use on-device Whisper models for completely private, offline transcription.
Available Models
Six model options with different speed and accuracy trade-offs:
- Standard (Free): Multilingual, balanced performance
- Standard English (Free): English-only, optimized for English
- Pro (Premium): Highest accuracy, best quality
- Pro Light (Premium): High accuracy, faster than Pro
- Ultra (Premium): Maximum speed, good for quick notes
- Ultra Light (Premium): Fastest processing, smallest model
Configuration Options
- Name: Descriptive name for the mode
- Transcription Type: Select "Local"
- Model: Choose from the 6 available models
  Important: You must download the model before using it (see Model Management below)
- Language: Select input language (if supported by model)
- Custom Prompt: Add formatting instructions (processed after transcription)
Model Management
- Download: Models must be downloaded before use
  - Select a model in mode settings
  - Click the download button and wait for the model to finish downloading
  - Download times vary depending on model size and internet speed
- Unload Timeout: Configure how long models stay in memory after use (Settings → General)
- Storage: Models are stored locally and require disk space (varies by model size, ~100MB to ~1.5GB)
Privacy Note: Local modes never send audio to the cloud. All processing happens on your device.
Custom Provider Modes (Premium)
Use your own API keys with third-party transcription providers.
Supported Providers
OpenAI
- Models: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe
- Setup: Enter your OpenAI API key in Settings → Providers
- Use Case: High-quality transcription with OpenAI's latest models
Mistral
- Models: voxtral-small-latest, voxtral-mini-latest
- Setup: Enter your Mistral API key in Settings → Providers
- Use Case: European alternative with competitive pricing
Deepgram
- Models: nova-3, nova-2
- Setup: Enter your Deepgram API key in Settings → Providers
- Use Case: Fast, accurate transcription with real-time capabilities
Custom OpenAI-Compatible
- Setup: Enter custom API endpoint URL and API key
- Use Case: Self-hosted or other OpenAI-compatible services
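For context on what "OpenAI-compatible" usually means: OpenAI's transcription API accepts a multipart POST to `/audio/transcriptions` under the API base URL, with a bearer token and a `model` form field alongside the audio `file`. The sketch below only assembles those request parts and sends nothing. The helper name `build_transcription_request`, the placeholder key, and the example self-hosted URL are hypothetical, and how SayToType itself constructs its requests is not documented here.

```python
from typing import Dict, Optional, Tuple


def build_transcription_request(
    base_url: str, api_key: str, model: str, language: Optional[str] = None
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Assemble URL, headers, and form fields for an OpenAI-style transcription call.

    Assumes base_url already includes the version prefix (for example ".../v1").
    The audio would be attached as a multipart "file" field by the HTTP client.
    """
    url = base_url.rstrip("/") + "/audio/transcriptions"
    headers = {"Authorization": f"Bearer {api_key}"}
    fields = {"model": model}
    if language:
        fields["language"] = language  # optional language hint, e.g. "en"
    return url, headers, fields


# Hypothetical self-hosted endpoint and placeholder key.
url, headers, fields = build_transcription_request(
    "http://localhost:8000/v1/", "sk-example", "whisper-1", language="en"
)
print(url)  # prints http://localhost:8000/v1/audio/transcriptions
```

If a self-hosted service does not answer at a path of this shape, it is probably not a drop-in match for the custom endpoint option.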
Configuration Options
- Name: Descriptive name for the mode
- Transcription Type: Select "Custom Provider"
- Provider: Choose from configured providers (OpenAI, Mistral, Deepgram, Custom)
- Model: Select the model for the chosen provider
- Language: Input language (if supported by provider)
- Custom Prompt: Formatting instructions
Mode Management
Creating Modes
- Open Settings → Modes tab
- Click "Add Mode"
- Choose transcription type (Cloud, Local, or Custom Provider)
- Configure mode settings
- Save the mode
Editing Modes
- Click on any existing mode in the Modes tab to edit its settings
- Changes are saved automatically or when you click "Save"
Reordering Modes
- Drag and drop modes in the Modes tab to reorder them
- The order affects how modes appear in the mode selector
Deleting Modes
- Click the delete/trash icon next to a mode
- Confirm deletion when prompted
- Note: You cannot delete the last remaining mode
Example Mode Configurations
Email Draft (Cloud)
- Type: Cloud
- Name: "Email Draft"
- Input Language: English
- Custom Prompt: "Format as professional email with proper greeting, body, and closing"
Quick Notes (Cloud)
- Type: Cloud
- Name: "Quick Notes"
- Input Language: English
- Output Language: English
- Custom Prompt: "Format as bullet points with key ideas"
Translation (Cloud)
- Type: Cloud
- Name: "English to Spanish"
- Input Language: English
- Output Language: Spanish
- Custom Prompt: "Translate accurately while maintaining natural flow"
Private Notes (Local, macOS)
- Type: Local
- Name: "Private Notes"
- Model: Standard English (Free)
- Custom Prompt: "Format as concise notes"
- Use Case: Offline, private note-taking
Custom API Transcription (Premium)
- Type: Custom Provider
- Name: "OpenAI Whisper"
- Provider: OpenAI
- Model: whisper-1
- Custom Prompt: "Clean transcription only"
Configuration and Settings
Access all settings by right-clicking the tray icon → "Settings"
General Configuration
App Language
- Change the interface language for menus and settings
- Located in Settings → General tab
- Requires app restart to take effect
Hotkey Configuration
- Select from platform-specific preset hotkeys
- Visual keyboard configurator shows your selection
- See Hotkeys section for all available options
- Test your hotkey immediately after selection
Recording Window Behavior
Choose how the recording window behaves after processing:
- Auto-detect (recommended): Closes if text was successfully pasted to your application, stays open if not
- Always close: Window closes immediately after processing completes
- Always open: Window remains open for you to review the transcription
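The three behaviors above reduce to a small decision rule. The following is a minimal sketch of that rule as described in this guide; the `WindowBehavior` enum and `should_close_window` function are hypothetical names, not part of the app.

```python
from enum import Enum


class WindowBehavior(Enum):
    AUTO_DETECT = "auto-detect"
    ALWAYS_CLOSE = "always close"
    ALWAYS_OPEN = "always open"


def should_close_window(behavior: WindowBehavior, text_was_pasted: bool) -> bool:
    """Decide whether the recording window closes once processing has finished."""
    if behavior is WindowBehavior.ALWAYS_CLOSE:
        return True
    if behavior is WindowBehavior.ALWAYS_OPEN:
        return False
    # Auto-detect: close only when the text reached the target application.
    return text_was_pasted


print(should_close_window(WindowBehavior.AUTO_DETECT, text_was_pasted=False))  # prints False
```

With auto-detect, a failed paste leaves the window open, so the transcription is still available to copy manually.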
Model Unload Timeout (macOS only)
- Configure how long local models stay loaded in memory
- Options: 5 minutes, 15 minutes, 30 minutes, 1 hour, Never unload
- Shorter timeouts save memory, longer timeouts improve performance for frequent use
- Only applies to local transcription models
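To make the memory/performance trade-off concrete, here is a minimal sketch of an idle-timeout policy of the kind described above. It is an assumption about how such a timeout could work, not the app's implementation; `IdleUnloader` and its methods are hypothetical names.

```python
import time
from typing import Callable, Optional


class IdleUnloader:
    """Tracks when a loaded model was last used and reports when it may be unloaded."""

    def __init__(
        self,
        timeout_seconds: Optional[float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_seconds = timeout_seconds   # None models "Never unload"
        self._clock = clock
        self._last_used: Optional[float] = None  # None means no model is loaded

    def mark_used(self) -> None:
        """Record that the model was just loaded or used for a transcription."""
        self._last_used = self._clock()

    def mark_unloaded(self) -> None:
        """Record that the model has been released from memory."""
        self._last_used = None

    def should_unload(self) -> bool:
        """True once the model has been idle for at least the configured timeout."""
        if self._last_used is None or self.timeout_seconds is None:
            return False
        return self._clock() - self._last_used >= self.timeout_seconds


five_minutes = IdleUnloader(timeout_seconds=5 * 60)
five_minutes.mark_used()
print(five_minutes.should_unload())  # prints False
```

A shorter timeout frees memory sooner. A longer one, or "Never unload", avoids reloading the model before the next recording.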
Audio Settings
Input Device Selection
- Choose your preferred microphone from the dropdown menu
- Click "Refresh" to detect newly connected devices
- Test with the real-time audio visualization to confirm selection
Auto-Maximize Input Volume
- Toggle to automatically maximize microphone input level
- Helps ensure consistent recording volume
- Platform-specific implementation:
  - macOS: Uses native volume control
  - Windows: Uses Windows audio API
- May not work with all audio devices
Account Settings
Subscription Status
- View your current subscription tier (Free or Premium)
- Check premium trial status and remaining time
- See feature access (local paid models, custom providers)
Premium Trial
- New users get a 25-minute trial of premium features
- Trial countdown visible in account settings
- Access to paid local models and custom providers during trial
Sign Out
- Click "Sign Out" to disconnect your account
- Useful for switching accounts or troubleshooting
- Will require re-authentication on next launch
Platform-Specific Features
macOS Features
Permissions
- Microphone Permission: Required for audio recording
  Grant in: System Preferences → Security & Privacy → Privacy → Microphone
- Accessibility Permission: Required for global hotkeys and text insertion
  Grant in: System Preferences → Security & Privacy → Privacy → Accessibility
- On macOS 13 and later, both settings are under System Settings → Privacy & Security
- The app will prompt you to grant these permissions during onboarding
Local Transcription (Apple Silicon Only)
- Available on Macs with M1, M2, M3, or newer Apple Silicon chips
- Uses on-device Whisper models for complete privacy
- 6 model options: 2 free, 4 premium
- Works completely offline—no internet required (after downloading models)
- Models must be downloaded before first use and are stored locally
- Configurable memory management with unload timeout
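If you are unsure whether a Mac has Apple Silicon, "About This Mac" shows the chip. The same check can be sketched in Python with the standard `platform` module. The function name `is_apple_silicon` is illustrative, and a Python build running under Rosetta translation may report `x86_64` even on Apple Silicon hardware, so treat a negative result with caution.

```python
import platform
from typing import Optional


def is_apple_silicon(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Best-effort check for an Apple Silicon Mac.

    Both arguments default to the values reported for the current machine;
    passing them explicitly makes the function easy to test.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


print(is_apple_silicon("Darwin", "arm64"))   # prints True
print(is_apple_silicon("Darwin", "x86_64"))  # prints False
```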
Function Key Support
- Hotkeys can use the Fn (Function) key
- Three Fn-based presets available: Fn+Control, Fn+Shift, Fn+Escape
- Useful if other modifier combinations conflict with system shortcuts
Windows Features
No Permission Prompts
- Windows normally handles microphone permissions automatically; if the microphone appears blocked, see Windows Troubleshooting
- No accessibility permissions needed
- App works out of the box after installation
- No system preferences configuration required
Platform Comparison
| Feature | macOS | Windows |
|---|---|---|
| Cloud Transcription | ✓ Available | ✓ Available |
| Local Transcription | ✓ Apple Silicon only | ✗ Not available |
| Custom Providers | ✓ Premium | ✓ Premium |
| Permissions Required | ✓ Microphone + Accessibility | ✗ None (automatic) |
| Default Hotkey | Fn+Control | Ctrl+` |
| Function Key Support | ✓ Fn key available | ✗ Not available |
Subscription and Premium Features
Free Tier
Included at no cost:
- Cloud Transcription: Unlimited use of SayToType's cloud transcription service
- Cloud Modes: Create unlimited cloud-based modes with custom languages and prompts
- Free Local Models (macOS, Apple Silicon): Access to 2 free on-device models
  - Standard (multilingual)
  - Standard English
- All Core Features: Hotkey customization, audio settings, mode management
Premium Features
Unlock with a premium subscription:
- Premium Local Models (macOS, Apple Silicon): Access to 4 premium on-device models
  - Pro (highest accuracy, best quality)
  - Pro Light (high accuracy, faster)
  - Ultra (maximum speed)
  - Ultra Light (fastest processing)
- Custom API Providers: Use your own API keys with:
  - OpenAI (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe)
  - Mistral (voxtral-small-latest, voxtral-mini-latest)
  - Deepgram (nova-3, nova-2)
  - Custom OpenAI-compatible endpoints
- Provider Management: Full control over API keys and custom endpoints
Premium Trial
- Duration: 25 minutes of premium feature access
- Included: Try premium local models and custom providers before subscribing
- One-time offer: Trial is available once per account
Managing Your Subscription
- View subscription status in Settings → Account
- Upgrade to Premium from Settings → Account
- Cancel anytime—premium features remain active until end of billing period
Note: Premium subscription unlocks features on all your devices where you're signed in.
Tips for Best Results
Audio Quality
- Use a quality microphone: External or headset mics generally perform better than built-in laptop microphones
- Speak clearly: Enunciate at a normal, conversational pace—neither too fast nor too slow
- Minimize background noise: Record in quiet environments when possible
- Check audio levels: Watch the real-time visualization to ensure your voice is being captured
- Enable auto-maximize volume: This feature helps maintain consistent recording levels
- Test different devices: Try different microphones to find which works best for your setup
Language Settings
- Set correct input language: This is critical for accuracy—always match your spoken language
- Use translation feature wisely: Set different input/output languages for automatic translation
- Create language-specific modes: Make separate modes for each language you use regularly
- English-only models (macOS): Use "Standard English" local model for better English-only accuracy
Custom Prompts
- Be specific: "Format as bullet points with action items" is better than "make it nice"
- Include formatting details: Specify structure, tone, style (e.g., "professional email format", "casual tone")
- Test and iterate: Try different prompts to see what produces the best results for your use case
- Keep it concise: Long prompts can dilute effectiveness—aim for 1-2 sentences
- Provide examples: If needed, include a brief example of desired output format in your prompt
Mode Organization
- Create specialized modes: Different modes for emails, notes, documentation, etc.
- Use descriptive names: "Project Notes" is more helpful than "Mode 3"
- Reorder frequently used modes: Drag your most-used modes to the top of the list
- Test before important use: Try new modes with test recordings before relying on them
- Use local modes for privacy (macOS): Use local transcription for sensitive or confidential content
Choosing Transcription Type
- Cloud (default): Best for general use, all languages, requires internet
- Local (macOS): Best for privacy, offline use, or slow internet connections
  Tip: Download models in advance if you plan to use them offline
- Custom Provider (Premium): Best when you need specific provider features or want to control costs
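The guidance above can be condensed into a simple rule of thumb. This sketch is only one reasonable reading of it; the function `suggest_transcription_type` and its parameters are hypothetical, and personal preference may well override the result.

```python
def suggest_transcription_type(
    needs_offline_or_private: bool,
    apple_silicon_mac: bool,
    has_premium: bool,
    wants_own_provider: bool,
) -> str:
    """Suggest "local", "custom", or "cloud" from the criteria in this guide."""
    # Local transcription is only available on Apple Silicon Macs.
    if needs_offline_or_private and apple_silicon_mac:
        return "local"
    # Custom providers require a premium subscription.
    if wants_own_provider and has_premium:
        return "custom"
    # Cloud is the default and works on every supported platform.
    return "cloud"


print(suggest_transcription_type(True, False, False, False))  # prints cloud
```

On Windows or Intel Macs the rule always falls through to cloud or a custom provider, since local models are not available there.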
Troubleshooting
Common Issues
No Audio Detected
- Verify microphone is selected in Settings → Audio Settings
- Check microphone permissions (macOS: System Preferences → Security & Privacy → Privacy → Microphone)
- Test microphone in another application to confirm it's working
- Click "Refresh" in audio device selection to detect newly connected devices
- Try enabling "Auto-maximize input volume" in audio settings
- Watch the audio visualization—if no waveform appears, your mic isn't capturing audio
Poor Transcription Quality
- Check language setting: Ensure input language matches the language you're speaking
- Improve audio quality: Reduce background noise, speak more clearly, check microphone positioning
- Try different models (macOS): Standard English may work better for English speech than multilingual Standard
- Adjust custom prompts: Overly complex prompts can sometimes reduce accuracy
- Consider cloud vs local: Cloud transcription may be more accurate for some languages/accents
Hotkey Not Working
- macOS: Verify Accessibility permission is granted in System Preferences → Security & Privacy → Privacy → Accessibility
- Check for conflicts: Another app might be using the same hotkey—try a different preset
- Restart the app: Sometimes hotkey registration requires an app restart
- Test alternative hotkeys: Try different presets from the Hotkeys section
Authentication Problems
- Try signing out and signing back in via Settings → Account
- Check your internet connection
- Clear browser cookies if using OAuth sign-in
- Verify your account is active and in good standing
Permission Issues (macOS)
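- Microphone Permission: System Preferences → Security & Privacy → Privacy → Microphone → Enable SayToType (on macOS 13 and later: System Settings → Privacy & Security → Microphone)
- Accessibility Permission: System Preferences → Security & Privacy → Privacy → Accessibility → Enable SayToType (on macOS 13 and later: System Settings → Privacy & Security → Accessibility)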
- Restart the app after granting permissions
- Some macOS versions require unlocking the padlock to make changes
Model Download Failures (macOS)
- Check your internet connection before starting download
- Ensure sufficient disk space for model files (models vary from ~100MB to ~1.5GB)
- If download fails, try again—temporary network issues can interrupt downloads
- Check firewall settings aren't blocking the download
- Remember: Models must be fully downloaded before you can use them for transcription
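To rule out disk space as the cause, free space can be checked before retrying a download. This is a small sketch using Python's standard `shutil.disk_usage`. The helper name `has_room_for_model` is illustrative, the default threshold is simply the ~1.5GB upper bound quoted above, and the path is an assumption because this guide does not state where models are stored.

```python
import shutil

LARGEST_MODEL_BYTES = 1_500_000_000  # ~1.5GB, the upper bound quoted in this guide


def has_room_for_model(path: str, required_bytes: int = LARGEST_MODEL_BYTES) -> bool:
    """Return True if the volume containing `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


# "." checks the volume of the current directory; substitute the volume
# where the models are actually stored.
print(has_room_for_model("."))
```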
Platform-Specific Issues
macOS Troubleshooting
- Local models not available: Requires Apple Silicon (M1 or newer); not available on Intel Macs
- App not in accessibility list: Manually add SayToType by clicking the "+" button in Accessibility preferences
- Hotkey conflicts with system shortcuts: Try Fn-based presets (Fn+Control, Fn+Shift, Fn+Escape)
Windows Troubleshooting
- Text not pasting: Ensure cursor is in a text-editable field when recording completes
- Hotkey not registering: Check if another app is using the same global hotkey
- Microphone access: If the microphone is blocked, allow access in Settings → Privacy → Microphone (Windows 10) or Settings → Privacy & security → Microphone (Windows 11)
Getting Help
- Verify all required permissions are granted for your platform
- Restart the application after changing permissions or settings
- Check Settings → Account to ensure you're signed in and subscription is active
- Visit our Help & Support page for additional assistance
- Contact support with specific error messages or behavior descriptions
Advanced Features
Auto-Translation Workflows
- Set different input and output languages for instant translation
- Perfect for multilingual communication, language learning, or international collaboration
- Create dedicated translation modes for common language pairs
- Example: Speak in English, get output in Spanish/French/Japanese
- Works with cloud transcription (all platforms) and some custom providers
Custom Formatting with Prompts
- Use detailed prompts to format output for specific purposes
- Examples:
  - Documentation: "Create markdown documentation with headings and sections"
  - Structured notes: "Organize as bullet points with main ideas and key takeaways"
  - Social media: "Write as engaging tweet under 280 characters"
- Combine language translation with custom formatting for powerful workflows
Provider Management (Premium)
- Configure multiple custom providers simultaneously
- Store API keys securely in the app
- Create different modes using different providers for different use cases
- Switch between providers based on cost, accuracy, or feature requirements
- Set up custom OpenAI-compatible endpoints for self-hosted solutions
Offline Operation (macOS, Apple Silicon)
- Download local models in advance for completely offline transcription
- Models must be pre-downloaded while internet is available
- Once downloaded, no internet required for transcription
- Perfect for travel, secure environments, or unreliable connectivity
- All processing happens on-device—complete data privacy
Workflow Integration Tips
- Email drafting: Create mode with "professional email" prompt, speak naturally, get formatted email
- Note taking: Use "structured notes" prompt to auto-organize your thoughts into bullet points
- Multilingual content: Record in native language, auto-translate to target audience language
Note: Transcribed and processed text automatically appears at your cursor position in any application. The app pastes text only when transcription succeeds.
