docs: defines relative speed in README

openai · Nov 2, 2023 · ca40b1a · ca40b1a
1 parent b38a1f2
commit ca40b1a
Showing 1 changed file with 2 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -68,6 +68,8 @@ There are five model sizes, four with English-only versions, offering speed and
 | medium |   769 M    |    `medium.en`     |      `medium`      |     ~5 GB     |      ~2x       |
 | large  |   1550 M   |        N/A         |      `large`       |    ~10 GB     |       1x       |
 
+Note that "relative speed" here refers to the speed of transcription of each model relative to each other, rather than relative to the duration of the sample to transcribe. Speed of transcription will vary based on the available hardware resources.
+
 The `.en` models for English-only applications tend to perform better, especially for the `tiny.en` and `base.en` models. We observed that the difference becomes less significant for the `small.en` and `medium.en` models.
 
 Whisper's performance varies widely depending on the language. The figure below shows a WER (Word Error Rate) breakdown by languages of the Fleurs dataset using the `large-v2` model (The smaller the numbers, the better the performance). Additional WER scores corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4. Meanwhile, more BLEU (Bilingual Evaluation Understudy) scores can be found in Appendix D.3. Both are found in [the paper](https://arxiv.org/abs/2212.04356).