Tutorial: How to Install Whisper AI

Whisper AI is an open-source speech-to-text model from OpenAI that transcribes audio and can translate non-English speech into English. This guide walks you through the installation step by step.


Step 1: Install Python

Whisper AI requires Python to run.

  • Download Python from python.org.
  • Ensure you install Python 3.8 or later (Whisper supports up to Python 3.11).
  • During installation, check the box to add Python to PATH.
  • Verify installation by running: python --version (on macOS and Linux the command may be python3 --version)
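
The version requirement above can also be checked from inside Python. This is a minimal sketch: the helper name is_supported_python and the hard-coded 3.8 to 3.11 bounds are assumptions taken from the range stated in this step, not something Whisper provides.

```python
import sys

# Bounds taken from this tutorial (Python 3.8 through 3.11); adjust them
# if the Whisper README lists a different range for your release.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the range this guide covers"
    print(f"Python {found}: {status}")
```

Passing a tuple such as is_supported_python((3, 12)) lets you test a version other than the one currently running.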

Step 2: Install PyTorch

Whisper AI uses PyTorch to load and run its neural network.

  • Visit pytorch.org and follow the instructions for your system.
  • Example installation command: pip install torch torchvision torchaudio
  • Verify installation: python -c "import torch; print(torch.__version__)"
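
For a slightly more informative check than the one-liner above, the following sketch reports the PyTorch version and whether a CUDA device is visible. The function name describe_torch is hypothetical; torch.__version__ and torch.cuda.is_available() are the only PyTorch calls it relies on.

```python
import importlib.util


def describe_torch():
    """Return a dict describing the PyTorch install, or None if it is missing."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without PyTorch

    return {
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = describe_torch()
    if info is None:
        print("PyTorch is not installed in this environment.")
    else:
        print(f"PyTorch {info['version']}, CUDA available: {info['cuda_available']}")
```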

Step 3: Install Chocolatey (Windows Users Only)

Chocolatey is a package manager for Windows. It is used in Step 4 to install FFmpeg, so macOS and Linux users can skip this step.

  • Open PowerShell as an administrator and run: Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
  • Verify installation: choco --version

Step 4: Install FFmpeg

FFmpeg is required for handling audio files.

  • Windows (via Chocolatey): choco install ffmpeg
  • macOS: brew install ffmpeg
  • Linux (Ubuntu/Debian): sudo apt update && sudo apt install ffmpeg
  • Verify installation: ffmpeg -version
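
Whisper calls FFmpeg as an external program, so it has to be reachable on PATH. The sketch below checks that from Python using only the standard library; the helper name tool_version_line is an assumption made for illustration.

```python
import shutil
import subprocess


def tool_version_line(tool="ffmpeg", flag="-version"):
    """Return the first line of the tool's version output, or None if it is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    completed = subprocess.run([path, flag], capture_output=True, text=True, check=False)
    output = completed.stdout or completed.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    line = tool_version_line()
    if line is None:
        print("ffmpeg was not found on PATH; repeat Step 4 or reopen your terminal.")
    else:
        print(line)
```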

Step 5: Install Whisper AI

  • Run the installation command: pip install -U openai-whisper
  • Alternatively, install directly from GitHub for the latest version: pip install git+https://github.com/openai/whisper.git
  • To update Whisper AI: pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
  • Verify installation: whisper --help
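
If whisper --help is not found, it can help to confirm that the package was installed into the Python environment you are actually using. A minimal sketch with the standard library's importlib.metadata follows; installed_version is a hypothetical helper name, and openai-whisper is the package name from the install command above.

```python
from importlib import metadata


def installed_version(package="openai-whisper"):
    """Return the installed version string for a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("openai-whisper is not installed in this environment.")
    else:
        print(f"openai-whisper {version} is installed.")
```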

Step 6: Run a Test Transcription

To check if Whisper AI is working, run:

whisper example.mp3 --model small

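This prints the transcription of example.mp3 to the terminal and, by default, also saves it to the current directory in several formats (such as .txt, .srt, and .vtt). The small model is downloaded automatically the first time it is used.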
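
The same test can be run from Python instead of the command line. This sketch uses Whisper's Python API (whisper.load_model and model.transcribe); the wrapper name transcribe_file and the file name example.mp3 are placeholders, and it assumes Step 5 has been completed.

```python
import importlib.util
import os


def transcribe_file(audio_path, model_name="small"):
    """Transcribe one audio file with Whisper and return the recognized text."""
    import whisper  # requires the openai-whisper package from Step 5

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(audio_path)
    return result["text"]


if __name__ == "__main__":
    audio = "example.mp3"  # placeholder file name
    if importlib.util.find_spec("whisper") is None:
        print("Whisper is not installed; complete Step 5 first.")
    elif not os.path.exists(audio):
        print(f"{audio} was not found in the current directory.")
    else:
        print(transcribe_file(audio))
```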


Additional Features

  • Use Different Models: Whisper supports multiple model sizes (tiny, base, small, medium, large). Larger models are more accurate but slower and need more memory. Example: whisper example.mp3 --model medium
  • Transcribe Multiple Files: whisper file1.mp3 file2.mp3
  • Specify Language: whisper example.mp3 --language English
  • Translate Non-English Speech to English: whisper example.mp3 --task translate
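
The options above can be combined in a single command. When scripting batch jobs, one approach is to assemble the argument list in Python and hand it to subprocess. The helper build_whisper_command below is hypothetical and only uses the flags shown in this guide (--model, --language, --task, --device).

```python
import shlex


def build_whisper_command(files, model=None, language=None, task=None, device=None):
    """Assemble a whisper CLI invocation as a list of arguments."""
    if not files:
        raise ValueError("at least one audio file is required")
    command = ["whisper", *files]
    options = (
        ("--model", model),
        ("--language", language),
        ("--task", task),
        ("--device", device),
    )
    for flag, value in options:
        if value is not None:  # skip options the caller did not set
            command += [flag, value]
    return command


if __name__ == "__main__":
    args = build_whisper_command(["file1.mp3", "file2.mp3"], model="medium", task="translate")
    # Prints: whisper file1.mp3 file2.mp3 --model medium --task translate
    print(shlex.join(args))
```

To actually run the command, pass the list to subprocess.run(args, check=True). Building a list rather than a single string avoids quoting problems with file names that contain spaces.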

CUDA Compatibility for GPU Acceleration

If you have an NVIDIA GPU, you can speed up transcription with CUDA:

  • Install an up-to-date NVIDIA driver (and the CUDA Toolkit, if your PyTorch build requires it) from NVIDIA.
  • Install a CUDA-enabled PyTorch build by choosing the matching CUDA version in the install selector on pytorch.org.
  • Run Whisper using CUDA: whisper example.mp3 --model large --device cuda
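
Before passing --device cuda, it is worth confirming that PyTorch can actually see the GPU. A minimal sketch follows, assuming the name pick_device for the helper; it falls back to the CPU when PyTorch is missing or no CUDA device is available.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch can see an NVIDIA GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # without PyTorch there is nothing to query
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Suggested --device value: {pick_device()}")
```

If this prints cpu on a machine with an NVIDIA GPU, the installed PyTorch is probably a CPU-only build or the driver is out of date.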

Congratulations! 🎉 You have successfully installed and set up Whisper AI for transcription. For further details, visit the Whisper GitHub repository (github.com/openai/whisper).
