Tutorial: How to Install Whisper AI

Whisper AI is a powerful speech-to-text model by OpenAI that allows for high-quality transcription. This guide walks you through the step-by-step installation process.


Step 1: Install Python

Whisper AI requires Python to run.

  • Download Python from python.org.
  • Ensure you install Python 3.8 or later (Whisper supports up to Python 3.11).
  • During installation, check the box to add Python to PATH.
  • Verify installation by running: python --version
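
The version check can also be done from inside Python. The sketch below is a minimal example that assumes the 3.8 to 3.11 range stated above; the function name python_version_supported is made up for illustration.

```python
import sys

# Supported range taken from the text above: Python 3.8 through 3.11.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def python_version_supported(version=None):
    """Return True if the (major, minor) version lies within the range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


current = "{}.{}".format(*sys.version_info[:2])
if python_version_supported():
    print(f"Python {current} is within the supported range.")
else:
    print(f"Python {current} is outside 3.8-3.11; Whisper may not install cleanly.")
```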

Step 2: Install PyTorch

Whisper AI depends on PyTorch for deep learning functionalities.

  • Visit pytorch.org and follow the instructions for your system.
  • Example installation command: pip install torch torchvision torchaudio
  • Verify installation: python -c "import torch; print(torch.__version__)"

Step 3: Install Chocolatey (Windows Users Only)

Chocolatey is a package manager for Windows.

  • Open PowerShell as an administrator and run: Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))
  • Verify installation: choco --version

Step 4: Install FFmpeg

FFmpeg is required for handling audio files.

  • Windows (via Chocolatey): choco install ffmpeg
  • macOS: brew install ffmpeg
  • Linux (Ubuntu/Debian): sudo apt update && sudo apt install ffmpeg
  • Verify installation: ffmpeg -version

Step 5: Install Whisper AI

  • Run the installation command: pip install -U openai-whisper
  • Alternatively, install directly from GitHub for the latest version: pip install git+https://github.com/openai/whisper.git
  • To update Whisper AI: pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
  • Verify installation: whisper --help
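
If you prefer a single check for Steps 4 and 5, the following sketch looks for both commands on PATH using Python's standard library. The helper name find_missing_tools is hypothetical, and the check only confirms that the commands can be found, not that they work.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "whisper")):
    """Return the names of command-line tools that are not on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and whisper are both available.")
```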

Step 6: Run a Test Transcription

To check if Whisper AI is working, run:

whisper example.mp3 --model small

This will generate a transcription of example.mp3.
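
The same test can be run from a Python script instead of the command line. This is a minimal sketch using the load_model and transcribe calls from the Whisper package; the wrapper function transcribe_file and the file name example.mp3 are assumptions for illustration.

```python
from pathlib import Path


def transcribe_file(audio_path, model_name="small"):
    """Transcribe one audio file and return the recognized text."""
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Imported here so the file check above runs even if Whisper is missing.
    import whisper

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(str(path))
    return result["text"]
```

To use it, call print(transcribe_file("example.mp3")) from a script in the same folder as the audio file.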


Additional Features

  • Use Different Models: Whisper supports multiple model sizes (tiny, base, small, medium, large). Larger models are more accurate but slower. Example: whisper example.mp3 --model medium
  • Transcribe Multiple Files: whisper file1.mp3 file2.mp3
  • Specify Language: whisper example.mp3 --language English
  • Translate Non-English Speech to English: whisper example.mp3 --task translate
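
These options can be combined in one command. The sketch below assembles such a command in Python using only the flags shown above (--model, --language, --task); the function names build_whisper_command and run_whisper are hypothetical helpers, not part of Whisper.

```python
import subprocess


def build_whisper_command(files, model="small", language=None, task=None):
    """Build the argument list for the whisper CLI from the options above."""
    command = ["whisper", *files, "--model", model]
    if language:
        command += ["--language", language]
    if task:
        command += ["--task", task]
    return command


def run_whisper(files, **options):
    """Run the whisper CLI; raises CalledProcessError if it fails."""
    subprocess.run(build_whisper_command(files, **options), check=True)


print(build_whisper_command(["file1.mp3", "file2.mp3"], model="medium", task="translate"))
```

Printing the command first makes it easy to check before calling run_whisper with the same arguments.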

CUDA Compatibility for GPU Acceleration

If you have an NVIDIA GPU, you can speed up transcription with CUDA:

  • Install CUDA-compatible PyTorch by selecting the correct version from pytorch.org.
  • Install NVIDIA drivers and CUDA from NVIDIA.
  • Run Whisper using CUDA: whisper example.mp3 --model large --device cuda
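
Before passing --device cuda, it can help to confirm that PyTorch actually sees the GPU. A minimal sketch, assuming PyTorch was installed as in Step 2; the helper name pick_device is made up for illustration, and it falls back to "cpu" when no CUDA GPU is visible.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so CUDA cannot be used
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"whisper example.mp3 --model large --device {device}")
```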

Congratulations! 🎉 You have successfully installed and set up Whisper AI for transcription. For further details, visit the Whisper GitHub repository.

FFmpeg Command Breakdown for Extracting Slides from a PowerPoint Video

Command:

ffmpeg -i "PresentationVideo.mp4" -filter_complex "select=gt(scene\,0.2)" -vsync vfr "slides/%04d.jpg"

Explanation of Each Component:

1. ffmpeg

  • The command-line tool used for video and audio processing.

2. -i "PresentationVideo.mp4"

  • Specifies the input video file: PresentationVideo.mp4 (the recorded PowerPoint or slide presentation).

3. -filter_complex "select=gt(scene\,0.2)"

  • -filter_complex: Passes a filtergraph to FFmpeg; here it holds a single select filter.
  • select=gt(scene,0.2):
    • Keeps only the frames where a significant change, such as a slide transition, is detected.
    • scene is a built-in value between 0 and 1 that measures how much a frame differs from the previous one.
    • gt(scene,0.2):
      • gt() (greater than) selects frames where the scene change value exceeds 0.2 (20%).
      • This ensures that only major slide transitions are captured, avoiding minor visual changes.
    • In the command, the backslash before the comma (scene\,0.2) escapes it so that FFmpeg does not read the comma as a separator between filters.

4. "slides/%04d.jpg"

  • Saves the extracted slides as .jpg images in the slides/ directory. You must create this directory before running the command; FFmpeg does not create it, so the command fails if it is missing.
  • %04d ensures images are numbered sequentially (e.g., 0001.jpg, 0002.jpg).

5. -vsync vfr

  • Uses variable frame rate (VFR) output so that only the frames passed by the scene filter are written. Without it, FFmpeg duplicates frames to keep a constant frame rate, producing unnecessary duplicate images.

Effect of Changing the Scene Threshold (scene value)

Scene Change Threshold | Effect
0.05 (5%)              | Many frames extracted, including minor slide changes (animations, small transitions).
0.1 (10%)              | Fewer frames, capturing most major slide changes.
0.2 (20%)              | Extracts only clear slide transitions, ignoring minor visual changes.
0.3 (30%)              | Very few frames, capturing only the most drastic slide transitions.

Example with a Lower Scene Change Value

ffmpeg -i "PresentationVideo.mp4" -filter_complex "select=gt(scene\,0.1)" -vsync vfr "slides/%04d.jpg"

  • This command captures more frequent slide changes, useful if the slides have animations or frequent minor transitions.

Key Takeaways:

  • Lower scene values (e.g., 0.05–0.1) → Extract more frames, including small slide changes.
  • Higher scene values (e.g., 0.2–0.3) → Extract fewer frames, focusing on major slide transitions.
  • Adjust the value based on the type of PowerPoint presentation (animated vs. static slides).
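
To try several thresholds without retyping the command, the call can be wrapped in a small Python script. This is a sketch under stated assumptions: the function names build_slide_command and extract_slides are hypothetical, ffmpeg is assumed to be on PATH, and the script creates the output directory first because FFmpeg will not.

```python
import subprocess
from pathlib import Path


def build_slide_command(video, out_dir="slides", threshold=0.2):
    """Build the ffmpeg argument list for scene-based frame extraction."""
    return [
        "ffmpeg",
        "-i", str(video),
        # The backslash escapes the comma inside the filter expression.
        "-filter_complex", f"select=gt(scene\\,{threshold})",
        "-vsync", "vfr",  # output options go before the output file
        str(Path(out_dir) / "%04d.jpg"),
    ]


def extract_slides(video, out_dir="slides", threshold=0.2):
    """Create the output directory, then run ffmpeg."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_slide_command(video, out_dir, threshold), check=True)
```

For example, extract_slides("PresentationVideo.mp4", "slides_0.1", threshold=0.1) writes the frames for a lower threshold into a separate folder so the results can be compared.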

Alt Class – Introduction to narrative algorithms using a full example

See the example below.

Problem: Create a solution to find the area of a circle.

Problem Definition: Create a program which prompts the user to enter the radius of a circle, and outputs its area.

Algorithm in narrative form:

1. Welcome the user and prompt them to enter the radius of the circle.

2. Store the entered radius in a variable.

3. Use the formula for the area of a circle: Area = π × radius² (where π is approximately 3.14159).

4. Calculate the area by squaring the radius and multiplying the result by π.

5. Output the calculated area of the circle to the user.

6. End the program with a thank you message.

Algorithm in Pseudocode:

START
    R ← 0
    AREA ← 0
    pi ← 3.14159    
    OUTPUT "Welcome to the Area of a Circle Calculator"    
    OUTPUT "Please enter the radius of the circle:"
    INPUT R    
    AREA ← pi * (R * R)    
    OUTPUT "The area of the circle is: ", AREA    
    OUTPUT "Thank you for using the calculator!"
STOP
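
For comparison, the pseudocode above can be translated into Python roughly as follows. This is one possible translation, not the only one; it keeps the same approximation of π as the pseudocode, and the names circle_area and main are chosen for illustration.

```python
PI = 3.14159  # same approximation as the pseudocode; math.pi is more precise


def circle_area(radius):
    """Return the area of a circle: pi multiplied by the radius squared."""
    return PI * (radius * radius)


def main():
    """Prompt for a radius, then display the area (mirrors the pseudocode)."""
    print("Welcome to the Area of a Circle Calculator")
    radius = float(input("Please enter the radius of the circle: "))
    print("The area of the circle is:", circle_area(radius))
    print("Thank you for using the calculator!")
```

Call main() to run the calculator interactively. Keeping the calculation in its own function makes it easy to test separately from the input and output steps.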

See the Flowgorithm file and flowchart graphic here: https://drive.google.com/drive/folders/1Lbf_A-SdV5F4P1YEFui1scGLc3FGVFG1?usp=sharing

© 2024  Vedesh Kungebeharry. All rights reserved. 

Assignment – Creating a 4 Scene Scratch Animation Using A Storyboard

  1. Storyboard Creation:
    • Create a storyboard with at least 4 scenes or backgrounds.
    • Include two characters who speak a total of at least 4 sentences per scene.
    • Incorporate at least 2 animations across the four scenes.
  2. Storyboard Development:
    • Begin with rough sketches in your notebook during class with guidance from your teacher.
    • Finalize the storyboard in your notebook with:
      • One column for the visual description of the scene (drawing).
      • Another column for the text narration, including character positioning and animation.
  3. Scratch Project Implementation:
    • Create a shared Scratch project based on your storyboard.

Submission Requirements:

  1. Your completed storyboard must be drawn in your ICT notebook. Submit visible screenshots of your notebook either as a series of images or within a PDF document.
  2. Include a link to your shared Scratch project. (See here for sharing a Scratch project: How to Share a Scratch Project)
  3. If working in a group, only the group leader needs to submit the above. Include the names of all other group members.

Rubric

Criteria | Marks | Description
Storyboard Visuals | out of 5 |
  | 5 | All four scenes are clearly sketched with detailed visual descriptions. Each scene includes backgrounds, characters, and important objects.
  | 4 | Three scenes are clearly sketched with detailed visual descriptions. One scene may lack minor details.
  | 3 | Two scenes are clearly sketched with detailed visual descriptions. Two scenes may lack minor details.
  | 2 | One scene is clearly sketched with detailed visual descriptions. Three scenes lack details.
  | 1 | Minimal effort in sketching. Most scenes lack details.
  | 0 | No visual descriptions provided.
Character Positioning & Text Narration | out of 5 |
  | 5 | Each scene includes clear positioning of characters with accurate text narration of at least 4 sentences per scene. Dialogues are coherent and contribute to the story.
  | 4 | Each scene includes clear positioning of characters with accurate text narration of at least 3 sentences per scene. Dialogues are mostly coherent.
  | 3 | Each scene includes clear positioning of characters with accurate text narration of at least 2 sentences per scene. Dialogues are partially coherent.
  | 2 | Scenes have characters and text narration, but positioning is unclear or text is less than 2 sentences per scene.
  | 1 | Minimal effort in character positioning and text narration.
  | 0 | No character positioning or text narration provided.
Animation Description | out of 3 |
  | 3 | Two animations are clearly described across the four scenes, including details on how and when they occur.
  | 2 | Two animations are mentioned but lack detail on how and when they occur.
  | 1 | One animation is described with some detail.
  | 0 | No animations described.
Implementation in Scratch | out of 2 |
  | 2 | The Scratch project closely follows the storyboard, with all scenes, dialogues, and animations accurately implemented.
  | 1 | The Scratch project follows the storyboard with minor deviations. Most scenes, dialogues, and animations are accurately implemented.
  | 0 | The Scratch project does not follow the storyboard or is not implemented.
Creativity & Coherence | out of 1 |
  | 1 | The story is creative, coherent, and engaging. It shows originality and thoughtful integration of elements.
  | 0 | The story lacks creativity, coherence, or engagement. It appears rushed or poorly thought out.
Total | 15 |


A Slanted Rectangle – Revision Exercise

Create a Scratch script/program which uses the pen tool to draw a rectangle slanted at a 45-degree angle. Ensure that the rectangle occupies a large portion of the stage.

Submit a link to your Scratch project for marking.

1 – Starting the script when the green flag is clicked
1 – Immediately erasing all pen activity before drawing the rectangle
2 – Orienting the pen to 45 degrees
2 – Correct use of a loop
2 – Correctly drawing the length
2 – Correctly drawing the width
2 – Correct use of the turn block
3 – Good positioning of the rectangle within the stage


Algorithm Solution: Using Computational Thinking to solve a Math problem


This is the solution and feedback for the problem blogged at https://islandclass.wordpress.com/2018/05/25/using-computational-thinking-to-solve-a-math-problem/

Solution

Algorithm:
1. Start
2. Take the N numbers to be added.
3. Determine the number of pairs: p = N/2
4. Find the sum of the first number (Fn) and the last number (Ln): AL = Fn + Ln
5. Calculate the sum: Sum = p * AL
6. Present the Sum.
7. Stop
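
The algorithm above can be sketched in Python as follows. This assumes the N numbers are evenly spaced (for example 1, 2, 3, ..., 100), which is what makes every pair add up to the same total; the function name pair_sum is made up for illustration.

```python
def pair_sum(first, last, count):
    """Sum `count` evenly spaced numbers running from `first` to `last`."""
    pairs = count / 2            # step 3: p = N/2
    pair_total = first + last    # step 4: AL = Fn + Ln
    return pairs * pair_total    # step 5: Sum = p * AL


# Compare the pairing method with a direct sum of the numbers 1 to 100.
print(pair_sum(1, 100, 100), sum(range(1, 101)))
```

When N is odd, p is not a whole number (for example 2.5 for five numbers), but the formula still gives the correct sum because the middle number is exactly half of a pair total.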
