Whisper
Introduction
Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web. Using such a large and diverse dataset makes it more robust to accents, background noise, and technical language. It also supports transcription in several languages, as well as translation from those languages into English.
Installation
The easiest way to install this tool is to create a new Anaconda environment
!conda create -n whisper
Activate the environment
!conda activate whisper
Install the required packages
!conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
Finally, install Whisper
!pip install git+https://github.com/openai/whisper.git
And install (or update) ffmpeg
!sudo apt update && sudo apt install ffmpeg
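Before moving on, it can help to confirm that the pieces installed above are visible from Python. The following is a minimal sketch that uses only the standard library; the helper name `check_whisper_setup` is hypothetical and not part of Whisper.

```python
import importlib.util
import shutil


def check_whisper_setup():
    """Report whether the tools installed above can be found.

    Hypothetical helper for illustration; not part of the whisper package.
    """
    return {
        # Whisper calls the ffmpeg executable to decode audio files.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec looks a package up without importing it.
        "torch": importlib.util.find_spec("torch") is not None,
        "whisper": importlib.util.find_spec("whisper") is not None,
    }


for name, found in check_whisper_setup().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If any entry is reported as missing, the most likely cause is that the notebook kernel is not running inside the `whisper` environment.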
Usage
Import Whisper
import whisper
Select the model; larger models are generally more accurate, but they are slower and need more memory
# model_name = "tiny"
# model_name = "base"
# model_name = "small"
# model_name = "medium"
model_name = "large"
model = whisper.load_model(model_name)
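Choosing a model is mostly a trade-off between accuracy and available GPU memory. As a rough sketch, the helper below picks the largest model that fits a given amount of VRAM. The figures are the approximate requirements listed in the Whisper README and should be treated as guidance only; `pick_model_name` is a hypothetical helper, not part of the library.

```python
# Approximate VRAM needs in GB, as listed in the Whisper README;
# treat them as rough guidance, not guarantees.
APPROX_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}


def pick_model_name(available_vram_gb):
    """Return the largest model whose approximate VRAM need fits.

    Hypothetical helper; falls back to "tiny" when nothing fits.
    """
    best = "tiny"
    # The dict is ordered from smallest to largest model.
    for name, needed in APPROX_VRAM_GB.items():
        if needed <= available_vram_gb:
            best = name
    return best


print(pick_model_name(6))  # medium
```

The returned name can be passed straight to `whisper.load_model`.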
Load the audio from this old (1987) Micro Machines commercial
audio_path = "MicroMachines.mp3"
audio = whisper.load_audio(audio_path)
audio = whisper.pad_or_trim(audio)
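`load_audio` returns the waveform resampled to 16 kHz, and `pad_or_trim` forces it to exactly 30 seconds, which is the window size Whisper works on. The plain-Python sketch below only illustrates that behaviour; the function names are made up for this example and the real implementation operates on arrays.

```python
import math

SAMPLE_RATE = 16_000   # load_audio resamples to 16 kHz mono
CHUNK_SECONDS = 30     # Whisper works on 30-second windows
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # 480,000 samples per window


def pad_or_trim_sketch(samples, length=N_SAMPLES):
    """Illustrate what pad_or_trim does to a 1-D signal."""
    if len(samples) >= length:
        # Longer clips: keep only the first 30 seconds.
        return list(samples[:length])
    # Shorter clips: append zeros (silence) up to 30 seconds.
    return list(samples) + [0.0] * (length - len(samples))


def windows_needed(duration_seconds):
    """How many 30-second windows a clip of this duration spans."""
    return max(1, math.ceil(duration_seconds / CHUNK_SECONDS))


short_clip = [0.1] * (SAMPLE_RATE * 10)      # 10 s of dummy audio
print(len(pad_or_trim_sketch(short_clip)))   # 480000
print(windows_needed(95))                    # 4
```

Anything beyond the first window is dropped by `pad_or_trim`, so a longer recording has to be processed window by window.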
# n_mels must match the loaded model (the newest large checkpoints use 128 mel bins instead of the default 80)
mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")
Detected language: en
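`detect_language` returns a dictionary that maps each language code to a probability, so it is easy to look beyond the single best guess. The sketch below ranks the candidates; the probabilities are made up for illustration, and `top_languages` is a hypothetical helper.

```python
def top_languages(probs, k=3):
    """Return the k most likely (language, probability) pairs, best first."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up probabilities for illustration; real values come from
# model.detect_language(mel).
example_probs = {"en": 0.97, "es": 0.01, "de": 0.005, "fr": 0.004}

for code, p in top_languages(example_probs):
    print(f"{code}: {p:.3f}")
```

A small gap between the first two entries would suggest that the detected language is uncertain, for example on very short or noisy clips.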
options = whisper.DecodingOptions()
result = whisper.decode(model, mel, options)
result.text
"This is the Micro Machine Man presenting the most midget miniature motorcade of micro machines. Each one has dramatic details, terrific trim, precision paint jobs, plus incredible micro machine pocket play sets. There's a police station, fire station, restaurant, service station, and more. Perfect pocket portables to take any place. And there are many miniature play sets to play with and each one comes with its own special edition micro machine vehicle and fun fantastic features that miraculously move. Raise the boat lift at the airport, marina, man the gun turret at the army base, clean your car at the car wash, raise the toll bridge. And these play sets fit together to form a micro machine world. Micro machine pocket play sets so tremendously tiny, so perfectly precise, so dazzlingly detailed, you'll want to pocket them all. Micro machines and micro machine pocket play sets sold separately from Galoob. The smaller they are, the better they are."
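The `decode` call used here works on a single 30-second window. For longer recordings, or for the translation into English mentioned in the introduction, the model object also offers a `transcribe` method that walks through the whole file. The wrapper below is a minimal sketch under that assumption; `transcribe_file` and the second file name are hypothetical.

```python
def transcribe_file(model, audio_path, to_english=False):
    """Transcribe a whole file; optionally translate it into English.

    `model` is expected to be the object returned by whisper.load_model.
    Hypothetical wrapper for illustration.
    """
    task = "translate" if to_english else "transcribe"
    result = model.transcribe(audio_path, task=task)
    # The result is a dictionary; the full transcript is under "text".
    return result["text"].strip()


# Example use, assuming the model loaded earlier and local audio files
# (the second file name is a placeholder):
# text = transcribe_file(model, "MicroMachines.mp3")
# english = transcribe_file(model, "some_spanish_audio.mp3", to_english=True)
```

Passing the model in as an argument keeps the wrapper independent of which model size was loaded.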