Whisper
Introduction
This notebook has been automatically translated to make it accessible to more people. Please let me know if you see any typos.
Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of supervised multilingual and multitask data collected from the web. The use of such a large and diverse data set leads to greater robustness to accents, background noise and technical language. In addition, it allows for transcription in multiple languages, as well as translation from those languages into English.
Installation
In order to install this tool, it is best to create a new anaconda environment.
!conda create -n whisper
We enter the environment
!conda activate whisper
We install all the necessary packages
!conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
Finally, we install Whisper.
!pip install git+https://github.com/openai/whisper.git
And we update ffmpeg.
!sudo apt update && sudo apt install ffmpeg
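Before moving on, it can help to confirm that the installation steps above actually took effect. The following is a minimal sketch, assuming a standard Python environment; `check_environment` is a hypothetical helper name, not part of Whisper. It only checks that the `ffmpeg` executable is on the PATH and that the `torch` and `whisper` packages can be found, without importing them.

```python
import importlib.util
import shutil


def check_environment() -> dict:
    """Report whether the tools installed above can be found (hypothetical helper)."""
    return {
        # Whisper relies on the ffmpeg executable to decode audio files.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec only looks the package up; it does not import it.
        "torch": importlib.util.find_spec("torch") is not None,
        "whisper": importlib.util.find_spec("whisper") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Any entry reported as missing points back to the corresponding installation step.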
Use
We import whisper.
import whisper
We select the model. Larger models generally transcribe more accurately, but they are slower and need more memory.
# model = "tiny"
# model = "base"
# model = "small"
# model = "medium"
model = "large"
model = whisper.load_model(model)
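Choosing a size is mostly a trade-off between accuracy and available memory. As a rough sketch, the helper below picks the largest model that fits a given amount of GPU memory. The figures are the approximate VRAM requirements listed in the Whisper README and should be treated as guidance only; `pick_model_name` is a hypothetical helper, not part of the library.

```python
# Approximate VRAM needs per model size (in GB), as listed in the Whisper README.
# Treat them as rough guidance, not guarantees.
APPROX_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}


def pick_model_name(available_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits (hypothetical helper)."""
    fitting = [name for name, need in APPROX_VRAM_GB.items() if need <= available_vram_gb]
    # Dicts keep insertion order, so the last fitting entry is the largest model.
    # Fall back to "tiny" when nothing fits.
    return fitting[-1] if fitting else "tiny"


print(pick_model_name(6))   # medium
print(pick_model_name(12))  # large
```

The returned name can be passed straight to `whisper.load_model`.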
We load the audio of this old (1987) Micro Machines advert and pad or trim it to the 30 seconds that Whisper expects.
audio_path = "MicroMachines.mp3"
audio = whisper.load_audio(audio_path)
audio = whisper.pad_or_trim(audio)
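To make the padding and trimming step concrete, here is a pure-Python illustration of the idea. It is a sketch on plain lists, not the library's implementation. The constants mirror Whisper's 16 kHz sample rate and 30-second window; `pad_or_trim_list` is a hypothetical name.

```python
SAMPLE_RATE = 16_000   # Hz; whisper.load_audio resamples to this rate
CHUNK_SECONDS = 30     # length of the window Whisper processes at once
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # 480000 samples


def pad_or_trim_list(samples: list, length: int = N_SAMPLES) -> list:
    """Pure-Python illustration: cut to `length` or right-pad with zeros."""
    if len(samples) >= length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))


short_clip = [0.1] * (SAMPLE_RATE * 10)   # 10 s of dummy audio
long_clip = [0.1] * (SAMPLE_RATE * 45)    # 45 s of dummy audio

print(len(pad_or_trim_list(short_clip)))  # 480000
print(len(pad_or_trim_list(long_clip)))   # 480000
```

Note that anything after the first 30 seconds is dropped by this step. For longer recordings, `model.transcribe(audio_path)` processes the file in successive 30-second windows instead.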
# Read n_mels from the model so the spectrogram matches the input size it expects
mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")
Detected language: en
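The `probs` value used above is a dictionary that maps language codes to probabilities, and `max(probs, key=probs.get)` simply returns the key with the highest value. The sketch below shows the same idea on made-up numbers and also lists the runners-up, which can be useful when the detection is not clear-cut. `top_languages` and the example values are hypothetical.

```python
def top_languages(probs: dict, k: int = 3) -> list:
    """Return the k most likely (language, probability) pairs, best first."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical values for illustration only; real ones come from model.detect_language(mel).
example_probs = {"en": 0.97, "es": 0.01, "de": 0.015, "fr": 0.005}

print(max(example_probs, key=example_probs.get))  # en
print(top_languages(example_probs, k=2))          # [('en', 0.97), ('de', 0.015)]
```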
options = whisper.DecodingOptions()
result = whisper.decode(model, mel, options)
result.text
"This is the Micro Machine Man presenting the most midget miniature motorcade of micro machines. Each one has dramatic details, terrific trim, precision paint jobs, plus incredible micro machine pocket play sets. There's a police station, fire station, restaurant, service station, and more. Perfect pocket portables to take any place. And there are many miniature play sets to play with and each one comes with its own special edition micro machine vehicle and fun fantastic features that miraculously move. Raise the boat lift at the airport, marina, man the gun turret at the army base, clean your car at the car wash, raise the toll bridge. And these play sets fit together to form a micro machine world. Micro machine pocket play sets so tremendously tiny, so perfectly precise, so dazzlingly detailed, you'll want to pocket them all. Micro machines and micro machine pocket play sets sold separately from Galoob. The smaller they are, the better they are."
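The default `DecodingOptions()` used above transcribes in the detected language. As a hedged sketch of how the options could be varied, the helper below builds keyword arguments for the `task`, `language` and `fp16` fields of `whisper.DecodingOptions`. `decoding_kwargs` is a hypothetical helper, and the assumption that `fp16=False` is needed on CPU-only machines reflects common experience rather than anything stated in this notebook.

```python
def decoding_kwargs(translate: bool = False, language=None, use_fp16: bool = True) -> dict:
    """Build keyword arguments for whisper.DecodingOptions (hypothetical helper)."""
    kwargs = {
        # "transcribe" keeps the spoken language; "translate" outputs English.
        "task": "translate" if translate else "transcribe",
        # Half precision generally needs a GPU; pass use_fp16=False on CPU.
        "fp16": use_fp16,
    }
    if language is not None:
        # Fixing the language code (e.g. "en") skips automatic detection.
        kwargs["language"] = language
    return kwargs


print(decoding_kwargs())
print(decoding_kwargs(translate=True, language="es", use_fp16=False))

# With Whisper installed, and `model` and `mel` defined as in the cells above:
# options = whisper.DecodingOptions(**decoding_kwargs(language="en"))
# result = whisper.decode(model, mel, options)
```

Setting `translate=True` corresponds to the translation into English mentioned in the introduction.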