Commit

docs: defines relative speed in README
philippefutureboy authored Nov 2, 2023
1 parent b38a1f2 commit ca40b1a
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -68,6 +68,8 @@ There are five model sizes, four with English-only versions, offering speed and
| medium | 769 M | `medium.en` | `medium` | ~5 GB | ~2x |
| large | 1550 M | N/A | `large` | ~10 GB | 1x |

Note that "relative speed" here refers to the speed of transcription of each model relative to each other, rather than relative to the duration of the sample to transcribe. Speed of transcription will vary based on the available hardware resources.

The `.en` models for English-only applications tend to perform better, especially for the `tiny.en` and `base.en` models. We observed that the difference becomes less significant for the `small.en` and `medium.en` models.

Whisper's performance varies widely depending on the language. The figure below shows a WER (Word Error Rate) breakdown by languages of the Fleurs dataset using the `large-v2` model (The smaller the numbers, the better the performance). Additional WER scores corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4. Meanwhile, more BLEU (Bilingual Evaluation Understudy) scores can be found in Appendix D.3. Both are found in [the paper](https://arxiv.org/abs/2212.04356).
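
The distinction drawn by the added note, speed relative to other models versus speed relative to the audio duration, can be illustrated with a minimal sketch. The timings below are hypothetical placeholders, not measurements, and the helper names `relative_speeds` and `real_time_factors` are assumptions introduced only for this illustration; they are not part of Whisper.

```python
def relative_speeds(seconds_by_model, baseline="large"):
    """Speed of each model relative to the baseline model (baseline = 1.0)."""
    base = seconds_by_model[baseline]
    return {name: base / secs for name, secs in seconds_by_model.items()}


def real_time_factors(seconds_by_model, audio_seconds):
    """Processing time divided by audio duration (below 1.0 is faster than real time)."""
    return {name: secs / audio_seconds for name, secs in seconds_by_model.items()}


# Hypothetical wall-clock transcription times in seconds (NOT real measurements),
# for an assumed 120-second audio sample on some unspecified hardware.
timings = {"medium": 30.0, "large": 60.0}
audio_seconds = 120.0

print(relative_speeds(timings))                   # model-to-model comparison
print(real_time_factors(timings, audio_seconds))  # comparison against audio length
```

With these made-up numbers, `large` has a relative speed of 1x yet still runs faster than real time on the assumed hardware, which is why the two quantities should not be conflated.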
