@Deep-unlearning
Last active January 21, 2025 12:34
Use Moonshine with Transformers
from transformers import AutoProcessor, MoonshineForConditionalGeneration
from datasets import load_dataset

# Load the Moonshine tiny checkpoint and its processor
processor = AutoProcessor.from_pretrained("UsefulSensors/moonshine-tiny")
model = MoonshineForConditionalGeneration.from_pretrained("UsefulSensors/moonshine-tiny")

# Grab a sample clip (duplicated here to demonstrate batched inference)
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
inputs = processor([ds[0]["audio"]["array"], ds[0]["audio"]["array"]], return_tensors="pt")
input_values = inputs.input_values

# Moonshine emits at most ~6.5 tokens per second of audio, so cap the
# generation length based on the longest clip in the batch
token_limit_factor = 6.5 / processor.feature_extractor.sampling_rate
seq_lens = inputs.attention_mask.sum(dim=-1)
max_length = int((seq_lens * token_limit_factor).max().item())

generated_ids = model.generate(input_values, max_length=max_length)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
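The `max_length` heuristic above can be sketched in isolation: at Moonshine's default 16 kHz sampling rate, a budget of 6.5 tokens per second of audio translates to `num_samples * 6.5 / 16000` tokens. A minimal sketch (the constant and function names here are illustrative, not part of the transformers API):

```python
# Sketch of the token-budget heuristic, assuming 16 kHz audio
# (Moonshine's default sampling rate).
SAMPLING_RATE = 16_000
TOKEN_LIMIT_FACTOR = 6.5 / SAMPLING_RATE  # at most ~6.5 generated tokens per second

def max_tokens(num_samples: int) -> int:
    """Generation-length budget for a clip of `num_samples` audio samples."""
    return int(num_samples * TOKEN_LIMIT_FACTOR)

# A 10-second clip gets a budget of 65 tokens.
print(max_tokens(10 * SAMPLING_RATE))  # 65
```

In the batched snippet above, the same factor is applied to each clip's attention-mask length and the maximum across the batch is taken, so the longest clip sets the shared generation cap.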