Support MLX in WhisperAX #200

ZachNagengast · 2024-09-07T09:23:24Z

Add support for the WhisperAX example app, as well as various refactors and cleanup.

Note: this is using an unreleased version of MLX, pending merge of ml-explore/mlx-swift#130

There are still some memory issues to address; I will look into them soon.

CoreML:

MLX:

Important note: the first run will likely throw an error about PrepareMetalShaders. To resolve it, select the error and choose Trust & Enable in the popup that appears.

…lity (#192) * Make additional initializers, functions, members public, for WKPro * Allows use of default internal functions & member accesses which have increased protections when imported * Initializers were Xcode generated: right click class name -> refactor -> generate memberwise initializers * memberwise initializer defaults to internal, mark as public. * Formatting --------- Co-authored-by: ZachNagengast <[email protected]>

… models (#193) * Add initial mlpackage loading (if .mlmodelc not present) -- Does not modify model loading in OS WK. This is a hook to modify load path URLs. * Always load audio encoder last * Adjust timings to account for decoder<>encoder order swap * Add helper for mlpackage detection --------- Co-authored-by: ZachNagengast <[email protected]>

* Fix start time logic for file loading and resampling * Add test file

As far as I can tell, these stored properties are not meant to be changed. Therefore, change them to be immutable. This change also makes these static properties concurrency-safe.
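
The point about immutability and concurrency safety can be illustrated with a small sketch. The type and property names below are hypothetical and are not taken from the WhisperKit sources; the sketch only shows the general pattern of replacing a mutable static stored property with an immutable one.

```swift
// Hypothetical constants type, for illustration only.
enum AudioDefaults {
    // Before: mutable global state that any thread could overwrite.
    // static var sampleRate = 16_000

    // After: initialized once on first access, then read-only.
    static let sampleRate = 16_000
    static let windowSamples = AudioDefaults.sampleRate * 30
}

print(AudioDefaults.windowSamples)
```

Because a `static let` of a value type can never be written after initialization, concurrent readers cannot race on it, which is why stricter concurrency checking tends to accept it where it would flag a `static var`.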

…nto mlx-support

* Add VoiceActivityDetector base class Add base class to allow different VAD implementations * fix spaces

jkrukowski

LGTM, added some comments, let me know what you think

jkrukowski · 2024-09-09T12:06:17Z

.swiftpm/configuration/Package.resolved

+=======
+>>>>>>> main


git conflict markers

ZachNagengast · 2024-09-09

Ah, good catch 👍

jkrukowski · 2024-09-09T12:08:28Z

Package.resolved

-      "location" : "https://github.com/ml-explore/mlx-swift",
+      "location" : "https://github.com/davidkoski/mlx-swift.git",
      "state" : {
-        "revision" : "597aaa5f465b4b9a17c8646b751053f84e37925b",
-        "version" : "0.16.0"
+        "revision" : "3314bc684f0ccab1793be54acddaea16c0501d3c"


curious, why this change?

ok, now I can see why, nvm

davidkoski · 2024-09-10

This has been merged into mlx-swift and tagged 0.16.2.

jkrukowski · 2024-09-09T12:11:14Z

Sources/WhisperKit/Core/Models.swift

@@ -171,6 +137,74 @@ public struct ModelComputeOptions {
    }
 }

+public struct ModelInfo: Identifiable, Hashable {
+    public let id = UUID()


Curious, why is the property public let id = UUID() needed? Can the model be uniquely identified by name?

ZachNagengast · 2024-09-09

It was needed to make the struct Identifiable, although I was using this for the picker at one point and it may be vestigial. Will check.
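
To make the trade-off in this exchange concrete, here is a minimal sketch comparing the two approaches. Both struct names are hypothetical simplifications of the `ModelInfo` in the diff, and the second variant assumes model names are unique.

```swift
import Foundation

// Variant A: a fresh UUID per instance, as in the diff above.
struct ModelInfoWithUUID: Identifiable, Hashable {
    let id = UUID()
    let name: String
}

// Variant B: identity derived from the name (assumes names are unique).
struct ModelInfoByName: Identifiable, Hashable {
    let name: String
    var id: String { name }
}

let a1 = ModelInfoWithUUID(name: "tiny")
let a2 = ModelInfoWithUUID(name: "tiny")
let b1 = ModelInfoByName(name: "tiny")
let b2 = ModelInfoByName(name: "tiny")

// Synthesized Equatable/Hashable compare every stored property,
// so the per-instance UUID makes a1 and a2 distinct values.
print(a1 == a2)
// Only `name` is stored in variant B, so equal names compare equal.
print(b1 == b2)
```

One side effect worth noting: with a stored UUID, two values describing the same model are never equal and never collapse in a `Set`, which may or may not be what a picker needs.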

jkrukowski · 2024-09-09T12:15:48Z

Sources/WhisperKit/MLX/MLXTextDecoder.swift

+        let keyCache = try? MLX.stacked(keyCacheResult).asMLMultiArray()
+        let valueCache = try? MLX.stacked(valueCacheResult).asMLMultiArray()
+        let decodingCache = DecodingCache(
+            keyCache: keyCache,
+            valueCache: valueCache,
+            alignmentWeights: nil
+        )
+
+        let logits = try? result.logits?.asMLMultiArray()


Seems like you've changed it to try? in a couple of places. Is that intended?

ZachNagengast · 2024-09-09

Good point, we want this to pass the throw up through the call stack. Reviewing build warnings as well.
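
The difference between `try?` and `try` discussed here can be shown with a small self-contained sketch. The `convert` function is a hypothetical stand-in for a throwing conversion such as `asMLMultiArray()`; it is not the real implementation.

```swift
enum ConversionError: Error {
    case emptyInput
}

// Hypothetical stand-in for a throwing conversion.
func convert(_ values: [Float]) throws -> [Double] {
    guard !values.isEmpty else { throw ConversionError.emptyInput }
    return values.map { Double($0) }
}

// try? turns any thrown error into nil, so the caller never learns the cause.
func decodeSwallowingErrors(_ values: [Float]) -> [Double]? {
    return try? convert(values)
}

// try inside a throwing function passes the error up the call stack.
func decodePropagatingErrors(_ values: [Float]) throws -> [Double] {
    return try convert(values)
}

let swallowed = decodeSwallowingErrors([])
print(swallowed == nil)

do {
    _ = try decodePropagatingErrors([])
} catch {
    print("decode failed: \(error)")
}
```

With `try?`, a failed cache or logits conversion would surface later as an unexplained nil; with `try`, the original error reaches whoever called the decoder.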

iandundas · 2024-09-18T08:44:44Z

Hey all, just a heads-up, the example project gets this SPM error:

davidkoski · 2024-09-18T13:56:03Z

> Hey all, just a heads-up, the example project gets this SPM error:

that fork was merged and is now the 0.16.2 tag on https://github.com/ml-explore/mlx-swift
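
For anyone hitting the resolution error, a manifest pinned to the upstream tag might look like the sketch below. The package and target names are hypothetical, the platform versions are assumptions, and `exact:` is just one way to pin; only the repository URL and the 0.16.2 tag come from this thread.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "ExampleApp", // hypothetical name
    platforms: [.macOS(.v14), .iOS(.v17)], // assumed deployment targets
    dependencies: [
        // Upstream repository at the tag that includes the merged fork.
        .package(url: "https://github.com/ml-explore/mlx-swift", exact: "0.16.2"),
    ],
    targets: [
        .executableTarget(
            name: "ExampleApp",
            dependencies: [.product(name: "MLX", package: "mlx-swift")]
        ),
    ]
)
```

After changing the dependency, the stale pin in Package.resolved usually needs to be refreshed, for example by resolving packages again in Xcode or with `swift package resolve`.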

ZachNagengast · 2024-09-18T14:18:01Z

Awesome, thanks for the update @davidkoski!

latenitefilms · 2024-09-28T22:42:46Z

@ZachNagengast - This looks awesome! Apologies - rookie question... It looks like currently the MLX repo only has the base and tiny models. How hard is it for mere mortals to "build" some of the larger models for testing this out with larger models?

latenitefilms · 2024-10-04T01:03:28Z

Or... can you use existing models with MLX?

ZachNagengast · 2024-10-04T01:08:34Z

Sorry missed your original message! We will fill in the mlx repo with the remaining models as part of this release, we just made these copies for consistency with our swift package. Any MLX whisper model currently existing with the same naming scheme will work in theory 👍 @jkrukowski may be able to confirm or deny.

latenitefilms · 2024-10-04T01:41:20Z

Legend, thanks @ZachNagengast! So basically, we do need new models for MLX, we can use the existing WhisperKit models? They need to be optimised or something?

ZachNagengast · 2024-10-04T01:58:40Z

Yep the existing WhisperKit models are optimized for CoreML, the ones in this repo we will fill out with the equivalent weights that are compatible with this MLX PR

latenitefilms · 2024-10-04T02:18:04Z

Sorry for all the rookie questions, but when you say "optimised for CoreML" - does this mean they ONLY work on CoreML, or can you use these CoreML models in MLX and they're just not as fast/accurate?

Apologies - this whole Whisper world is very new to me, so I very much appreciate all your wisdom and support!

maxlund · 2024-10-04T05:56:28Z

> Yep the existing WhisperKit models are optimized for CoreML, the ones in this repo we will fill out with the equivalent weights that are compatible with this MLX PR

@ZachNagengast Is there any model conversion script we can run, or any other source we can use, in order to create/obtain more MLX compatible model versions?

ZachNagengast · 2024-10-04T07:15:54Z

@latenitefilms Yes the .mlmodelc models only work with CoreML at the moment.
@maxlund There is a script made by @jkrukowski to do the conversion here #169, we'll integrate this into https://github.com/argmaxinc/whisperkittools in the future.

latenitefilms · 2024-10-04T22:14:24Z

Legend, thanks so much @ZachNagengast! Do you have a rough/ballpark ETA of when you're hoping to finish and merge in MLX support? No rush or pressure - just wondering if it's worth trying to convert our own models or not.

Let me know if there's anything I can do to help with MLX testing/release! Would love to see this in action ASAP!

Thanks for EVERYTHING you do! Appreciate it!

ZachNagengast · 2024-10-05T21:53:33Z

There are just a few optimizations to fix up to make it ready for release, specifically memory usage. Current issues are:

* MLX does not require a prewarm stage, so it should skip that. Currently it's loading the model twice without freeing up the memory. This can also be solved by setting a cache limit or clearing the cache after load.
* KV cache should use this instead of MLMultiArrays.
* Sampling can be compiled for some easy speedups.
* Attention should use SDPA (scaled dot-product attention) instead of the current logic.

These are paraphrased from @davidkoski and @awni
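
The first item above (skipping prewarm for backends that do not need it) could be sketched roughly as follows. Every type and member name here is hypothetical and exists only for illustration; WhisperKit's actual loading code is structured differently.

```swift
// Hypothetical abstraction over a model backend.
protocol ModelBackend {
    var needsPrewarm: Bool { get }
    mutating func load()
}

// Test double that only counts how many times load() runs.
struct CountingBackend: ModelBackend {
    let needsPrewarm: Bool
    var loadCount = 0
    mutating func load() { loadCount += 1 }
}

// Backends that need prewarming get an extra load pass first;
// all others load exactly once, so no second copy is kept in memory.
func prepare<B: ModelBackend>(_ backend: inout B) {
    if backend.needsPrewarm {
        backend.load() // prewarm pass
    }
    backend.load() // actual load
}

var prewarmed = CountingBackend(needsPrewarm: true)
var direct = CountingBackend(needsPrewarm: false)
prepare(&prewarmed)
prepare(&direct)
print(prewarmed.loadCount, direct.loadCount)
```

Putting the decision behind a per-backend flag keeps the shared loading path unchanged for backends that still rely on prewarming.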

We will revisit this after the upcoming release, but feel free to test with the current branch in case you spot any other potential speedups besides these. All the interfaces should be the same in the final form, just faster and more memory efficient with these changes.

latenitefilms · 2024-10-05T22:38:49Z

Amazing! Thanks so much! Will test out and let you know if I break anything.

anishjain123 · 2024-11-22T23:40:19Z

Any update on this? Is it usable? @latenitefilms @ZachNagengast? Can someone please point me in the direction of how I can get this set up?

ZachNagengast · 2024-11-23T00:00:41Z

Hi @anishjain123, the issues listed in #200 (comment) are still pending, but this branch is technically usable. We'd like to resolve the perf and memory issues before merging, which is still a high priority for us! We are working on a refactor right now to allow various model input and output types, including MLXArray, which should help with the issues converting between MLMultiArray and MLXArray.

ZachNagengast and others added 15 commits July 15, 2024 08:24

Use fixed mlx-swift version

d7cf7a6

Fix language logits filter

c93d613

Fix start time logic for file loading (#195)

c268c8d

* Fix start time logic for file loading and resampling * Add test file

Change static var stored properties to static let. (#190)

59aaa4e

As far as I can tell, these stored properties are not meant to be changed. Therefore, change them to be immutable. This change also makes these static properties concurrency-safe.

Merge branch 'mlx-support' of ssh://github.com/argmaxinc/WhisperKit i…

4c495d2

…nto mlx-support

Refactor protocols for app support

6023f74

WIP perf improvements

cc10a23

Add VoiceActivityDetector base class (#199)

c03017f

* Add VoiceActivityDetector base class Add base class to allow different VAD implementations * fix spaces

Restructure package.swift

76d14db

Complete MLXTokenSampling impl

292cb16

Fix tests

649d139

Formatting

e09cc32

Merge branch 'main' into mlx-app-support

f7a6c09

jkrukowski reviewed Sep 9, 2024


Code review

9fff900

Update to latest mlx version

ce11d9b

ZachNagengast mentioned this pull request Nov 4, 2024

Updating swift-transformer #247

Open

BrandonWeng mentioned this pull request Nov 8, 2024

Bump swift-transformers to 0.1.12 #249

Closed


