[Feature] Plotting TensorDictSequential graphs #1144

Open: wants to merge 2 commits into base gh/vmoens/38/base

vmoens (Contributor) commented Dec 18, 2024

[ghstack-poisoned]
facebook-github-bot added the CLA Signed label Dec 18, 2024
vmoens added a commit that referenced this pull request Dec 18, 2024
ghstack-source-id: ff93fb45f6d64b3ab960cc801631923305b879ca
Pull Request resolved: #1144

github-actions bot commented Dec 18, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}29$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 39.8540μs 21.2707μs 47.0129 KOps/s 51.3039 KOps/s $\textbf{\color{#d91a1a}-8.36\%}$
test_plain_set_stack_nested 78.1460μs 21.4907μs 46.5318 KOps/s 50.1711 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_plain_set_nested_inplace 64.8010μs 23.3747μs 42.7813 KOps/s 45.9256 KOps/s $\textbf{\color{#d91a1a}-6.85\%}$
test_plain_set_stack_nested_inplace 80.8110μs 23.2030μs 43.0979 KOps/s 46.3982 KOps/s $\textbf{\color{#d91a1a}-7.11\%}$
test_items 42.1480μs 4.1195μs 242.7505 KOps/s 237.0291 KOps/s $\color{#35bf28}+2.41\%$
test_items_nested 0.7660ms 0.4118ms 2.4285 KOps/s 2.4528 KOps/s $\color{#d91a1a}-0.99\%$
test_items_nested_locked 0.5981ms 0.4128ms 2.4226 KOps/s 2.4644 KOps/s $\color{#d91a1a}-1.69\%$
test_items_nested_leaf 0.1463ms 76.4647μs 13.0779 KOps/s 12.8727 KOps/s $\color{#35bf28}+1.59\%$
test_items_stack_nested 0.6066ms 0.4138ms 2.4168 KOps/s 2.4293 KOps/s $\color{#d91a1a}-0.51\%$
test_items_stack_nested_leaf 0.1508ms 79.3773μs 12.5981 KOps/s 12.4031 KOps/s $\color{#35bf28}+1.57\%$
test_items_stack_nested_locked 0.5594ms 0.4144ms 2.4132 KOps/s 2.4338 KOps/s $\color{#d91a1a}-0.84\%$
test_keys 41.2070μs 3.5498μs 281.7097 KOps/s 284.6820 KOps/s $\color{#d91a1a}-1.04\%$
test_keys_nested 0.2766ms 0.1691ms 5.9121 KOps/s 6.0780 KOps/s $\color{#d91a1a}-2.73\%$
test_keys_nested_locked 1.8430ms 0.1749ms 5.7177 KOps/s 5.8308 KOps/s $\color{#d91a1a}-1.94\%$
test_keys_nested_leaf 0.2659ms 0.1469ms 6.8066 KOps/s 7.0099 KOps/s $\color{#d91a1a}-2.90\%$
test_keys_stack_nested 0.2666ms 0.1658ms 6.0331 KOps/s 6.1684 KOps/s $\color{#d91a1a}-2.19\%$
test_keys_stack_nested_leaf 0.2097ms 0.1452ms 6.8848 KOps/s 7.1286 KOps/s $\color{#d91a1a}-3.42\%$
test_keys_stack_nested_locked 0.2854ms 0.1713ms 5.8377 KOps/s 5.9567 KOps/s $\color{#d91a1a}-2.00\%$
test_values 9.1710μs 1.0528μs 949.8610 KOps/s 970.4251 KOps/s $\color{#d91a1a}-2.12\%$
test_values_nested 0.1244ms 63.6554μs 15.7096 KOps/s 16.1737 KOps/s $\color{#d91a1a}-2.87\%$
test_values_nested_locked 0.1215ms 63.3492μs 15.7855 KOps/s 16.0319 KOps/s $\color{#d91a1a}-1.54\%$
test_values_nested_leaf 0.1288ms 72.5652μs 13.7807 KOps/s 13.0599 KOps/s $\textbf{\color{#35bf28}+5.52\%}$
test_values_stack_nested 0.1248ms 64.1270μs 15.5940 KOps/s 15.9301 KOps/s $\color{#d91a1a}-2.11\%$
test_values_stack_nested_leaf 0.1312ms 72.5077μs 13.7916 KOps/s 13.9169 KOps/s $\color{#d91a1a}-0.90\%$
test_values_stack_nested_locked 0.1124ms 63.4441μs 15.7619 KOps/s 15.9349 KOps/s $\color{#d91a1a}-1.09\%$
test_membership 26.2590μs 0.8926μs 1.1203 MOps/s 1.1707 MOps/s $\color{#d91a1a}-4.31\%$
test_membership_nested 22.6620μs 2.9026μs 344.5241 KOps/s 343.8021 KOps/s $\color{#35bf28}+0.21\%$
test_membership_nested_leaf 27.7520μs 2.9322μs 341.0425 KOps/s 343.3421 KOps/s $\color{#d91a1a}-0.67\%$
test_membership_stacked_nested 41.2370μs 2.8913μs 345.8708 KOps/s 346.2548 KOps/s $\color{#d91a1a}-0.11\%$
test_membership_stacked_nested_leaf 27.3510μs 2.8870μs 346.3806 KOps/s 340.8560 KOps/s $\color{#35bf28}+1.62\%$
test_membership_nested_last 48.7610μs 4.3118μs 231.9240 KOps/s 231.3038 KOps/s $\color{#35bf28}+0.27\%$
test_membership_nested_leaf_last 24.5250μs 4.3299μs 230.9534 KOps/s 227.2274 KOps/s $\color{#35bf28}+1.64\%$
test_membership_stacked_nested_last 25.1970μs 4.2979μs 232.6722 KOps/s 194.5354 KOps/s $\textbf{\color{#35bf28}+19.60\%}$
test_membership_stacked_nested_leaf_last 48.7510μs 4.3214μs 231.4073 KOps/s 196.2936 KOps/s $\textbf{\color{#35bf28}+17.89\%}$
test_nested_getleaf 55.2540μs 10.6543μs 93.8592 KOps/s 91.6515 KOps/s $\color{#35bf28}+2.41\%$
test_nested_get 35.3460μs 10.1510μs 98.5129 KOps/s 96.8730 KOps/s $\color{#35bf28}+1.69\%$
test_stacked_getleaf 60.8830μs 10.5540μs 94.7505 KOps/s 92.9374 KOps/s $\color{#35bf28}+1.95\%$
test_stacked_get 52.3170μs 10.0078μs 99.9219 KOps/s 97.3594 KOps/s $\color{#35bf28}+2.63\%$
test_nested_getitemleaf 40.4460μs 11.2901μs 88.5733 KOps/s 88.5344 KOps/s $\color{#35bf28}+0.04\%$
test_nested_getitem 59.2200μs 10.4409μs 95.7770 KOps/s 94.8550 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_getitemleaf 56.2250μs 11.3751μs 87.9113 KOps/s 88.3242 KOps/s $\color{#d91a1a}-0.47\%$
test_stacked_getitem 31.9190μs 10.4718μs 95.4947 KOps/s 95.1541 KOps/s $\color{#35bf28}+0.36\%$
test_lock_nested 4.6632ms 0.4664ms 2.1443 KOps/s 2.1824 KOps/s $\color{#d91a1a}-1.74\%$
test_lock_stack_nested 0.8984ms 0.4286ms 2.3329 KOps/s 2.3686 KOps/s $\color{#d91a1a}-1.50\%$
test_unlock_nested 0.6858ms 0.3752ms 2.6649 KOps/s 2.6657 KOps/s $\color{#d91a1a}-0.03\%$
test_unlock_stack_nested 0.6557ms 0.3457ms 2.8930 KOps/s 2.9357 KOps/s $\color{#d91a1a}-1.45\%$
test_flatten_speed 0.1963ms 0.1006ms 9.9395 KOps/s 9.8732 KOps/s $\color{#35bf28}+0.67\%$
test_unflatten_speed 0.9261ms 0.5155ms 1.9398 KOps/s 1.9073 KOps/s $\color{#35bf28}+1.70\%$
test_common_ops 1.6131ms 0.8140ms 1.2285 KOps/s 1.3105 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_creation 22.5530μs 2.5047μs 399.2526 KOps/s 404.4421 KOps/s $\color{#d91a1a}-1.28\%$
test_creation_empty 57.5280μs 12.8022μs 78.1114 KOps/s 105.0911 KOps/s $\textbf{\color{#d91a1a}-25.67\%}$
test_creation_nested_1 56.4150μs 15.6849μs 63.7557 KOps/s 80.9666 KOps/s $\textbf{\color{#d91a1a}-21.26\%}$
test_creation_nested_2 45.2440μs 20.5454μs 48.6728 KOps/s 58.7193 KOps/s $\textbf{\color{#d91a1a}-17.11\%}$
test_clone 76.0620μs 13.7746μs 72.5973 KOps/s 74.0621 KOps/s $\color{#d91a1a}-1.98\%$
test_getitem[int] 0.8363ms 12.9223μs 77.3855 KOps/s 77.7931 KOps/s $\color{#d91a1a}-0.52\%$
test_getitem[slice_int] 0.1369ms 24.6341μs 40.5941 KOps/s 41.0457 KOps/s $\color{#d91a1a}-1.10\%$
test_getitem[range] 0.2092ms 50.2676μs 19.8935 KOps/s 19.8858 KOps/s $\color{#35bf28}+0.04\%$
test_getitem[tuple] 0.1298ms 20.4085μs 48.9992 KOps/s 49.5207 KOps/s $\color{#d91a1a}-1.05\%$
test_getitem[list] 0.1840ms 46.7339μs 21.3977 KOps/s 22.1591 KOps/s $\color{#d91a1a}-3.44\%$
test_setitem_dim[int] 57.1070μs 25.7408μs 38.8489 KOps/s 38.7876 KOps/s $\color{#35bf28}+0.16\%$
test_setitem_dim[slice_int] 95.7890μs 52.6775μs 18.9834 KOps/s 19.1554 KOps/s $\color{#d91a1a}-0.90\%$
test_setitem_dim[range] 0.1245ms 74.4414μs 13.4334 KOps/s 13.4159 KOps/s $\color{#35bf28}+0.13\%$
test_setitem_dim[tuple] 0.1009ms 42.9957μs 23.2582 KOps/s 24.2896 KOps/s $\color{#d91a1a}-4.25\%$
test_setitem 83.4960μs 21.2813μs 46.9896 KOps/s 51.4188 KOps/s $\textbf{\color{#d91a1a}-8.61\%}$
test_set 71.0820μs 20.5775μs 48.5968 KOps/s 53.2569 KOps/s $\textbf{\color{#d91a1a}-8.75\%}$
test_set_shared 3.2873ms 0.1752ms 5.7092 KOps/s 5.7990 KOps/s $\color{#d91a1a}-1.55\%$
test_update 0.8085ms 23.9800μs 41.7015 KOps/s 49.1056 KOps/s $\textbf{\color{#d91a1a}-15.08\%}$
test_update_nested 0.1571ms 34.3019μs 29.1529 KOps/s 31.0525 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_update__nested 0.2341ms 34.7552μs 28.7727 KOps/s 28.0271 KOps/s $\color{#35bf28}+2.66\%$
test_set_nested 84.4170μs 23.1946μs 43.1135 KOps/s 46.6390 KOps/s $\textbf{\color{#d91a1a}-7.56\%}$
test_set_nested_new 95.6080μs 28.2170μs 35.4396 KOps/s 37.6005 KOps/s $\textbf{\color{#d91a1a}-5.75\%}$
test_select 0.2219ms 44.5076μs 22.4681 KOps/s 23.6392 KOps/s $\color{#d91a1a}-4.95\%$
test_select_nested 0.1283ms 62.2756μs 16.0577 KOps/s 16.0588 KOps/s $-0.01\%$
test_exclude_nested 0.1515ms 81.8531μs 12.2170 KOps/s 12.2350 KOps/s $\color{#d91a1a}-0.15\%$
test_empty[True] 0.7119ms 0.4224ms 2.3673 KOps/s 2.4200 KOps/s $\color{#d91a1a}-2.18\%$
test_empty[False] 7.2260μs 1.3663μs 731.9137 KOps/s 722.7092 KOps/s $\color{#35bf28}+1.27\%$
test_unbind_speed 0.4031ms 0.2745ms 3.6423 KOps/s 3.6876 KOps/s $\color{#d91a1a}-1.23\%$
test_unbind_speed_stack0 0.3705ms 0.2671ms 3.7441 KOps/s 3.8145 KOps/s $\color{#d91a1a}-1.84\%$
test_unbind_speed_stack1 0.1044s 0.7845ms 1.2747 KOps/s 1.3972 KOps/s $\textbf{\color{#d91a1a}-8.76\%}$
test_split 93.9314ms 1.7538ms 570.1790 Ops/s 562.4531 Ops/s $\color{#35bf28}+1.37\%$
test_chunk 1.8715ms 1.6098ms 621.2000 Ops/s 569.3409 Ops/s $\textbf{\color{#35bf28}+9.11\%}$
test_consolidate_njt[False-None] 0.1028s 8.8192ms 113.3895 Ops/s 120.0041 Ops/s $\textbf{\color{#d91a1a}-5.51\%}$
test_creation[device0] 0.2213ms 93.0396μs 10.7481 KOps/s 10.9474 KOps/s $\color{#d91a1a}-1.82\%$
test_creation_from_tensor 3.7271ms 95.8824μs 10.4294 KOps/s 10.4599 KOps/s $\color{#d91a1a}-0.29\%$
test_add_one[memmap_tensor0] 0.1343ms 5.0395μs 198.4333 KOps/s 194.2638 KOps/s $\color{#35bf28}+2.15\%$
test_contiguous[memmap_tensor0] 18.3140μs 0.5305μs 1.8849 MOps/s 1.9502 MOps/s $\color{#d91a1a}-3.35\%$
test_stack[memmap_tensor0] 31.3180μs 3.4570μs 289.2674 KOps/s 296.1339 KOps/s $\color{#d91a1a}-2.32\%$
test_memmaptd_index 1.0322ms 0.2457ms 4.0701 KOps/s 4.1978 KOps/s $\color{#d91a1a}-3.04\%$
test_memmaptd_index_astensor 0.5948ms 0.3349ms 2.9860 KOps/s 3.0809 KOps/s $\color{#d91a1a}-3.08\%$
test_memmaptd_index_op 0.9681ms 0.6185ms 1.6169 KOps/s 1.7515 KOps/s $\textbf{\color{#d91a1a}-7.69\%}$
test_serialize_model 0.1251s 0.1155s 8.6578 Ops/s 8.7976 Ops/s $\color{#d91a1a}-1.59\%$
test_serialize_model_pickle 0.4612s 0.3910s 2.5578 Ops/s 2.5199 Ops/s $\color{#35bf28}+1.50\%$
test_serialize_weights 0.1243s 0.1129s 8.8569 Ops/s 8.8145 Ops/s $\color{#35bf28}+0.48\%$
test_serialize_weights_returnearly 0.2719s 0.1821s 5.4902 Ops/s 6.2615 Ops/s $\textbf{\color{#d91a1a}-12.32\%}$
test_serialize_weights_pickle 0.6157s 0.4405s 2.2702 Ops/s 2.2623 Ops/s $\color{#35bf28}+0.35\%$
test_serialize_weights_filesystem 0.1397s 0.1374s 7.2781 Ops/s 7.0871 Ops/s $\color{#35bf28}+2.69\%$
test_serialize_model_filesystem 0.1593s 0.1460s 6.8479 Ops/s 6.1689 Ops/s $\textbf{\color{#35bf28}+11.01\%}$
test_reshape_pytree 68.1870μs 26.6009μs 37.5927 KOps/s 38.3474 KOps/s $\color{#d91a1a}-1.97\%$
test_reshape_td 68.2270μs 32.6734μs 30.6060 KOps/s 30.5358 KOps/s $\color{#35bf28}+0.23\%$
test_view_pytree 69.6260μs 26.7072μs 37.4431 KOps/s 38.1725 KOps/s $\color{#d91a1a}-1.91\%$
test_view_td 94.7970μs 38.3073μs 26.1047 KOps/s 25.6243 KOps/s $\color{#35bf28}+1.87\%$
test_unbind_pytree 63.1880μs 29.7893μs 33.5691 KOps/s 33.5191 KOps/s $\color{#35bf28}+0.15\%$
test_unbind_td 0.3183ms 40.9815μs 24.4013 KOps/s 24.9724 KOps/s $\color{#d91a1a}-2.29\%$
test_split_pytree 0.1111ms 29.5880μs 33.7975 KOps/s 34.0418 KOps/s $\color{#d91a1a}-0.72\%$
test_split_td 0.2044ms 45.2345μs 22.1070 KOps/s 21.9504 KOps/s $\color{#35bf28}+0.71\%$
test_add_pytree 98.4370μs 36.2494μs 27.5866 KOps/s 27.5514 KOps/s $\color{#35bf28}+0.13\%$
test_add_td 0.1361ms 59.9391μs 16.6836 KOps/s 18.1961 KOps/s $\textbf{\color{#d91a1a}-8.31\%}$
test_compile_add_one_nested[tensordict-compile] 0.1292ms 62.1730μs 16.0841 KOps/s 15.8602 KOps/s $\color{#35bf28}+1.41\%$
test_compile_add_one_nested[tensordict-eager] 1.5080ms 0.1704ms 5.8688 KOps/s 5.7482 KOps/s $\color{#35bf28}+2.10\%$
test_compile_add_one_nested[pytree-compile] 0.1365ms 45.3738μs 22.0391 KOps/s 21.5228 KOps/s $\color{#35bf28}+2.40\%$
test_compile_add_one_nested[pytree-eager] 0.2712ms 0.1215ms 8.2306 KOps/s 8.4330 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_copy_nested[tensordict-compile] 92.0010μs 26.3546μs 37.9440 KOps/s 38.9419 KOps/s $\color{#d91a1a}-2.56\%$
test_compile_copy_nested[tensordict-eager] 0.1231ms 59.0386μs 16.9381 KOps/s 17.1328 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_copy_nested[pytree-compile] 0.1492ms 78.3206μs 12.7680 KOps/s 13.0864 KOps/s $\color{#d91a1a}-2.43\%$
test_compile_copy_nested[pytree-eager] 0.1518ms 67.0857μs 14.9063 KOps/s 14.7939 KOps/s $\color{#35bf28}+0.76\%$
test_compile_add_one_flat[tensordict-compile] 0.2232ms 0.1061ms 9.4276 KOps/s 9.5787 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_add_one_flat[tensordict-eager] 0.3896ms 0.2230ms 4.4837 KOps/s 4.5613 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_add_one_flat[tensorclass-compile] 0.1007ms 45.1650μs 22.1410 KOps/s 22.5023 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_add_one_flat[tensorclass-eager] 0.4830ms 65.5846μs 15.2475 KOps/s 15.2131 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_one_flat[pytree-compile] 0.2057ms 0.1046ms 9.5557 KOps/s 9.8612 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_add_one_flat[pytree-eager] 0.3116ms 0.2054ms 4.8684 KOps/s 4.9212 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_add_self_flat[tensordict-eager] 0.4058ms 0.2299ms 4.3494 KOps/s 4.2300 KOps/s $\color{#35bf28}+2.82\%$
test_compile_add_self_flat[tensordict-compile] 0.1937ms 0.1076ms 9.2951 KOps/s 9.5830 KOps/s $\color{#d91a1a}-3.00\%$
test_compile_add_self_flat[tensorclass-eager] 0.2014ms 60.0565μs 16.6510 KOps/s 16.9052 KOps/s $\color{#d91a1a}-1.50\%$
test_compile_add_self_flat[tensorclass-compile] 0.1092ms 46.1325μs 21.6767 KOps/s 22.1237 KOps/s $\color{#d91a1a}-2.02\%$
test_compile_add_self_flat[pytree-eager] 0.6015ms 0.1631ms 6.1294 KOps/s 6.2033 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_add_self_flat[pytree-compile] 0.2050ms 0.1043ms 9.5846 KOps/s 9.8154 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_copy_flat[tensordict-compile] 63.6590μs 21.2960μs 46.9571 KOps/s 47.4508 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_copy_flat[tensordict-eager] 0.1297ms 65.6229μs 15.2386 KOps/s 15.2541 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_copy_flat[pytree-compile] 0.1610ms 80.1088μs 12.4830 KOps/s 12.4598 KOps/s $\color{#35bf28}+0.19\%$
test_compile_copy_flat[pytree-eager] 0.1295ms 68.7149μs 14.5529 KOps/s 14.6231 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_assign_and_add[tensordict-compile] 0.3104ms 0.2082ms 4.8037 KOps/s 4.9149 KOps/s $\color{#d91a1a}-2.26\%$
test_compile_assign_and_add[tensordict-eager] 2.0382ms 1.3026ms 767.6901 Ops/s 759.7948 Ops/s $\color{#35bf28}+1.04\%$
test_compile_assign_and_add[pytree-compile] 0.2883ms 0.2058ms 4.8592 KOps/s 4.9546 KOps/s $\color{#d91a1a}-1.92\%$
test_compile_assign_and_add[pytree-eager] 1.4637ms 0.7932ms 1.2607 KOps/s 1.2864 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_assign_and_add_stack[compile] 0.5593ms 0.4607ms 2.1704 KOps/s 2.2393 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_assign_and_add_stack[eager] 3.0135ms 2.7874ms 358.7535 Ops/s 376.9769 Ops/s $\color{#d91a1a}-4.83\%$
test_compile_indexing[tensor-tensordict-compile] 84.3270μs 36.2639μs 27.5756 KOps/s 28.1681 KOps/s $\color{#d91a1a}-2.10\%$
test_compile_indexing[tensor-tensordict-eager] 0.5487ms 34.8342μs 28.7074 KOps/s 29.6817 KOps/s $\color{#d91a1a}-3.28\%$
test_compile_indexing[tensor-tensorclass-compile] 82.1230μs 29.1605μs 34.2930 KOps/s 34.2297 KOps/s $\color{#35bf28}+0.18\%$
test_compile_indexing[tensor-tensorclass-eager] 72.9050μs 24.2192μs 41.2896 KOps/s 42.6383 KOps/s $\color{#d91a1a}-3.16\%$
test_compile_indexing[tensor-pytree-compile] 87.1420μs 30.0355μs 33.2939 KOps/s 33.4503 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_indexing[tensor-pytree-eager] 65.1010μs 24.2029μs 41.3174 KOps/s 42.9683 KOps/s $\color{#d91a1a}-3.84\%$
test_compile_indexing[slice-tensordict-compile] 0.1319ms 52.6470μs 18.9944 KOps/s 19.5880 KOps/s $\color{#d91a1a}-3.03\%$
test_compile_indexing[slice-tensordict-eager] 0.5694ms 20.3781μs 49.0722 KOps/s 49.7532 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_indexing[slice-tensorclass-compile] 0.1008ms 45.0613μs 22.1920 KOps/s 22.9115 KOps/s $\color{#d91a1a}-3.14\%$
test_compile_indexing[slice-tensorclass-eager] 96.4800μs 18.7322μs 53.3840 KOps/s 54.5255 KOps/s $\color{#d91a1a}-2.09\%$
test_compile_indexing[slice-pytree-compile] 0.1018ms 45.5026μs 21.9768 KOps/s 22.5165 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[slice-pytree-eager] 54.7120μs 18.9824μs 52.6803 KOps/s 53.8890 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_indexing[int-tensordict-compile] 0.1107ms 53.9956μs 18.5200 KOps/s 19.1803 KOps/s $\color{#d91a1a}-3.44\%$
test_compile_indexing[int-tensordict-eager] 1.0458ms 20.4271μs 48.9545 KOps/s 50.9235 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_indexing[int-tensorclass-compile] 0.1099ms 45.9124μs 21.7806 KOps/s 22.4933 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_indexing[int-tensorclass-eager] 78.3400μs 18.7705μs 53.2751 KOps/s 53.1059 KOps/s $\color{#35bf28}+0.32\%$
test_compile_indexing[int-pytree-compile] 88.1250μs 45.5957μs 21.9319 KOps/s 22.4309 KOps/s $\color{#d91a1a}-2.22\%$
test_compile_indexing[int-pytree-eager] 85.1990μs 19.2910μs 51.8376 KOps/s 54.1255 KOps/s $\color{#d91a1a}-4.23\%$
test_mod_add[eager] 0.1192ms 35.2344μs 28.3813 KOps/s 29.2698 KOps/s $\color{#d91a1a}-3.04\%$
test_mod_add[compile] 0.1369ms 48.8222μs 20.4825 KOps/s 20.4791 KOps/s $\color{#35bf28}+0.02\%$
test_mod_add[compile-overhead] 0.1397ms 47.8877μs 20.8822 KOps/s 20.5729 KOps/s $\color{#35bf28}+1.50\%$
test_mod_wrap[eager] 0.4745ms 0.2254ms 4.4375 KOps/s 4.3704 KOps/s $\color{#35bf28}+1.53\%$
test_mod_wrap[compile] 0.2934ms 0.2107ms 4.7468 KOps/s 4.7816 KOps/s $\color{#d91a1a}-0.73\%$
test_mod_wrap[compile-overhead] 0.3668ms 0.2073ms 4.8232 KOps/s 4.7675 KOps/s $\color{#35bf28}+1.17\%$
test_mod_wrap_and_backward[eager] 13.1533ms 11.1401ms 89.7656 Ops/s 84.3023 Ops/s $\textbf{\color{#35bf28}+6.48\%}$
test_mod_wrap_and_backward[compile] 12.5266ms 10.8146ms 92.4679 Ops/s 73.2740 Ops/s $\textbf{\color{#35bf28}+26.19\%}$
test_mod_wrap_and_backward[compile-overhead] 12.2446ms 10.7905ms 92.6737 Ops/s 72.8180 Ops/s $\textbf{\color{#35bf28}+27.27\%}$
test_seq_add[eager] 0.2419ms 0.1177ms 8.4935 KOps/s 8.3791 KOps/s $\color{#35bf28}+1.36\%$
test_seq_add[compile] 0.1217ms 63.3450μs 15.7866 KOps/s 16.0727 KOps/s $\color{#d91a1a}-1.78\%$
test_seq_add[compile-overhead] 0.1502ms 61.7941μs 16.1828 KOps/s 16.6604 KOps/s $\color{#d91a1a}-2.87\%$
test_seq_wrap[eager] 0.6464ms 0.4531ms 2.2070 KOps/s 2.2240 KOps/s $\color{#d91a1a}-0.76\%$
test_seq_wrap[compile] 0.3575ms 0.2290ms 4.3669 KOps/s 4.2516 KOps/s $\color{#35bf28}+2.71\%$
test_seq_wrap[compile-overhead] 0.4354ms 0.2287ms 4.3717 KOps/s 4.2561 KOps/s $\color{#35bf28}+2.72\%$
test_func_call_runtime[False-eager] 0.9088ms 0.5404ms 1.8505 KOps/s 1.7545 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_func_call_runtime[False-compile] 0.5923ms 0.4295ms 2.3281 KOps/s 2.2991 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_runtime[False-compile-overhead] 0.5823ms 0.4265ms 2.3447 KOps/s 2.2857 KOps/s $\color{#35bf28}+2.58\%$
test_func_call_runtime[True-eager] 1.2989ms 0.7578ms 1.3196 KOps/s 1.2619 KOps/s $\color{#35bf28}+4.58\%$
test_func_call_runtime[True-compile] 0.6943ms 0.4678ms 2.1378 KOps/s 2.1026 KOps/s $\color{#35bf28}+1.67\%$
test_func_call_runtime[True-compile-overhead] 0.7418ms 0.4706ms 2.1248 KOps/s 2.0960 KOps/s $\color{#35bf28}+1.37\%$
test_func_call_cm_runtime[False-eager] 0.9406ms 0.5447ms 1.8358 KOps/s 1.7713 KOps/s $\color{#35bf28}+3.64\%$
test_func_call_cm_runtime[False-compile] 0.8283ms 0.4309ms 2.3205 KOps/s 2.3049 KOps/s $\color{#35bf28}+0.68\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8290ms 0.4337ms 2.3059 KOps/s 2.3013 KOps/s $\color{#35bf28}+0.20\%$
test_func_call_cm_runtime[True-eager] 1.5034ms 0.9083ms 1.1010 KOps/s 1.0757 KOps/s $\color{#35bf28}+2.35\%$
test_func_call_cm_runtime[True-compile] 0.9447ms 0.4985ms 2.0061 KOps/s 1.9818 KOps/s $\color{#35bf28}+1.22\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5949ms 0.4897ms 2.0419 KOps/s 2.0034 KOps/s $\color{#35bf28}+1.92\%$
test_vmap_func_call_cm_runtime[eager] 3.1152ms 1.9351ms 516.7607 Ops/s 517.1543 Ops/s $\color{#d91a1a}-0.08\%$
test_vmap_func_call_cm_runtime[compile] 0.9665ms 0.5233ms 1.9111 KOps/s 1.9116 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.1429ms 0.5331ms 1.8759 KOps/s 1.9074 KOps/s $\color{#d91a1a}-1.66\%$
test_distributed 0.2883ms 0.1265ms 7.9082 KOps/s 7.7809 KOps/s $\color{#35bf28}+1.64\%$
test_tdmodule 56.4860μs 27.3286μs 36.5917 KOps/s 39.7138 KOps/s $\textbf{\color{#d91a1a}-7.86\%}$
test_tdmodule_dispatch 80.4210μs 50.2795μs 19.8888 KOps/s 22.1260 KOps/s $\textbf{\color{#d91a1a}-10.11\%}$
test_tdseq 50.0030μs 30.6262μs 32.6518 KOps/s 35.4519 KOps/s $\textbf{\color{#d91a1a}-7.90\%}$
test_tdseq_dispatch 0.1068ms 58.2976μs 17.1534 KOps/s 19.5199 KOps/s $\textbf{\color{#d91a1a}-12.12\%}$
test_instantiation_functorch 2.3837ms 1.5493ms 645.4660 Ops/s 632.3847 Ops/s $\color{#35bf28}+2.07\%$
test_exec_functorch 0.4128ms 0.1842ms 5.4294 KOps/s 5.4035 KOps/s $\color{#35bf28}+0.48\%$
test_exec_functional_call 0.2878ms 0.1758ms 5.6874 KOps/s 5.5133 KOps/s $\color{#35bf28}+3.16\%$
test_exec_td_decorator 0.5327ms 0.2350ms 4.2560 KOps/s 4.2081 KOps/s $\color{#35bf28}+1.14\%$
test_vmap_mlp_speed_decorator[True-True] 1.1178ms 0.6672ms 1.4988 KOps/s 1.5186 KOps/s $\color{#d91a1a}-1.31\%$
test_vmap_mlp_speed_decorator[True-False] 1.3201ms 0.6745ms 1.4826 KOps/s 1.5173 KOps/s $\color{#d91a1a}-2.28\%$
test_vmap_mlp_speed_decorator[False-True] 0.7205ms 0.5321ms 1.8792 KOps/s 1.8582 KOps/s $\color{#35bf28}+1.13\%$
test_vmap_mlp_speed_decorator[False-False] 0.9188ms 0.5351ms 1.8690 KOps/s 1.8578 KOps/s $\color{#35bf28}+0.60\%$
test_to_module_speed[True] 1.6230ms 1.3353ms 748.8684 Ops/s 735.0132 Ops/s $\color{#35bf28}+1.89\%$
test_to_module_speed[False] 1.6928ms 1.3001ms 769.1958 Ops/s 754.6603 Ops/s $\color{#35bf28}+1.93\%$
test_tc_init 0.1082ms 51.5966μs 19.3811 KOps/s 22.0215 KOps/s $\textbf{\color{#d91a1a}-11.99\%}$
test_tc_init_nested 0.2366ms 0.1023ms 9.7770 KOps/s 11.0303 KOps/s $\textbf{\color{#d91a1a}-11.36\%}$
test_tc_first_layer_tensor 23.6440μs 1.5197μs 658.0127 KOps/s 668.0268 KOps/s $\color{#d91a1a}-1.50\%$
test_tc_first_layer_nontensor 26.1890μs 4.6386μs 215.5822 KOps/s 210.5022 KOps/s $\color{#35bf28}+2.41\%$
test_tc_second_layer_tensor 18.8150μs 2.7923μs 358.1335 KOps/s 353.9820 KOps/s $\color{#35bf28}+1.17\%$
test_tc_second_layer_nontensor 29.8750μs 5.9690μs 167.5327 KOps/s 162.5379 KOps/s $\color{#35bf28}+3.07\%$
test_unbind 0.2125s 13.4114ms 74.5635 Ops/s 70.7291 Ops/s $\textbf{\color{#35bf28}+5.42\%}$
test_full_like 15.7327ms 11.4754ms 87.1427 Ops/s 146.0968 Ops/s $\textbf{\color{#d91a1a}-40.35\%}$
test_zeros_like 13.1565ms 7.4908ms 133.4977 Ops/s 375.3555 Ops/s $\textbf{\color{#d91a1a}-64.43\%}$
test_ones_like 15.1283ms 7.7716ms 128.6730 Ops/s 323.3975 Ops/s $\textbf{\color{#d91a1a}-60.21\%}$
test_clone 14.3852ms 9.1749ms 108.9929 Ops/s 208.8154 Ops/s $\textbf{\color{#d91a1a}-47.80\%}$
test_squeeze 80.0200μs 11.8159μs 84.6319 KOps/s 80.4650 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_unsqueeze 0.2916ms 92.6916μs 10.7885 KOps/s 10.4923 KOps/s $\color{#35bf28}+2.82\%$
test_split 0.4100ms 0.1967ms 5.0849 KOps/s 4.9661 KOps/s $\color{#35bf28}+2.39\%$
test_permute 0.3614ms 0.2064ms 4.8441 KOps/s 4.7651 KOps/s $\color{#35bf28}+1.66\%$
test_stack 28.2167ms 24.5854ms 40.6746 Ops/s 41.6861 Ops/s $\color{#d91a1a}-2.43\%$
test_cat 28.9677ms 24.4968ms 40.8216 Ops/s 42.1321 Ops/s $\color{#d91a1a}-3.11\%$
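
For anyone checking the table by hand: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD. The sketch below assumes that formula, and also assumes that the bold entries (the ones counted as Improved or Worsened) are those whose change is at least 5% in magnitude. Both are inferred from the numbers above rather than stated by the bot, and the function names and the 5% threshold are hypothetical.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD.

    Both values must be in the same unit (e.g. both KOps/s).
    Assumed formula, inferred from the table rows.
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not the bot's documented rule."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (47.0129, 51.3039),
    "test_items": (242.7505, 237.0291),
    "test_zeros_like": (133.4977, 375.3555),
}

for name, (ops, head) in rows.items():
    pct = relative_change(ops, head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Under these assumptions the three sample rows reproduce the reported -8.36%, +2.41% and -64.43%.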

github-actions bot commented Dec 18, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}44$. Worsened: $\large\color{#d91a1a}10$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 31.8300μs 11.5365μs 86.6814 KOps/s 75.9189 KOps/s $\textbf{\color{#35bf28}+14.18\%}$
test_plain_set_stack_nested 35.2110μs 11.6256μs 86.0172 KOps/s 74.3207 KOps/s $\textbf{\color{#35bf28}+15.74\%}$
test_plain_set_nested_inplace 41.7500μs 12.4259μs 80.4774 KOps/s 70.3751 KOps/s $\textbf{\color{#35bf28}+14.35\%}$
test_plain_set_stack_nested_inplace 46.0010μs 12.4728μs 80.1745 KOps/s 70.2124 KOps/s $\textbf{\color{#35bf28}+14.19\%}$
test_items 27.4700μs 2.8629μs 349.3013 KOps/s 344.7788 KOps/s $\color{#35bf28}+1.31\%$
test_items_nested 0.4095ms 0.3617ms 2.7645 KOps/s 2.7675 KOps/s $\color{#d91a1a}-0.11\%$
test_items_nested_locked 0.4270ms 0.3640ms 2.7475 KOps/s 2.7447 KOps/s $\color{#35bf28}+0.10\%$
test_items_nested_leaf 93.6710μs 58.0115μs 17.2380 KOps/s 17.1098 KOps/s $\color{#35bf28}+0.75\%$
test_items_stack_nested 0.4324ms 0.3599ms 2.7783 KOps/s 2.7751 KOps/s $\color{#35bf28}+0.11\%$
test_items_stack_nested_leaf 85.2110μs 57.9536μs 17.2552 KOps/s 16.6792 KOps/s $\color{#35bf28}+3.45\%$
test_items_stack_nested_locked 0.4274ms 0.3577ms 2.7955 KOps/s 2.7357 KOps/s $\color{#35bf28}+2.19\%$
test_keys 24.3510μs 3.4305μs 291.4990 KOps/s 288.6302 KOps/s $\color{#35bf28}+0.99\%$
test_keys_nested 0.1187ms 81.9190μs 12.2072 KOps/s 12.2632 KOps/s $\color{#d91a1a}-0.46\%$
test_keys_nested_locked 0.7165ms 86.7991μs 11.5209 KOps/s 11.5212 KOps/s $-0.00\%$
test_keys_nested_leaf 0.1082ms 72.4267μs 13.8071 KOps/s 13.8138 KOps/s $\color{#d91a1a}-0.05\%$
test_keys_stack_nested 0.1204ms 81.6618μs 12.2456 KOps/s 12.2688 KOps/s $\color{#d91a1a}-0.19\%$
test_keys_stack_nested_leaf 0.1080ms 72.8548μs 13.7259 KOps/s 13.6252 KOps/s $\color{#35bf28}+0.74\%$
test_keys_stack_nested_locked 0.1281ms 87.9497μs 11.3701 KOps/s 11.5555 KOps/s $\color{#d91a1a}-1.60\%$
test_values 4.0521μs 0.8485μs 1.1786 MOps/s 1.1781 MOps/s $\color{#35bf28}+0.05\%$
test_values_nested 69.4810μs 34.8165μs 28.7220 KOps/s 29.2037 KOps/s $\color{#d91a1a}-1.65\%$
test_values_nested_locked 65.1600μs 37.0682μs 26.9773 KOps/s 27.6593 KOps/s $\color{#d91a1a}-2.47\%$
test_values_nested_leaf 0.1816ms 39.8477μs 25.0955 KOps/s 25.5484 KOps/s $\color{#d91a1a}-1.77\%$
test_values_stack_nested 63.1210μs 34.9267μs 28.6314 KOps/s 29.1514 KOps/s $\color{#d91a1a}-1.78\%$
test_values_stack_nested_leaf 72.7910μs 39.7728μs 25.1428 KOps/s 25.3813 KOps/s $\color{#d91a1a}-0.94\%$
test_values_stack_nested_locked 65.3300μs 36.7667μs 27.1985 KOps/s 27.1914 KOps/s $\color{#35bf28}+0.03\%$
test_membership 1.9165μs 0.5143μs 1.9442 MOps/s 1.9442 MOps/s $+0.00\%$
test_membership_nested 17.1750μs 1.9552μs 511.4509 KOps/s 509.0174 KOps/s $\color{#35bf28}+0.48\%$
test_membership_nested_leaf 16.2855μs 1.9833μs 504.2000 KOps/s 505.1205 KOps/s $\color{#d91a1a}-0.18\%$
test_membership_stacked_nested 27.4400μs 1.9960μs 500.9954 KOps/s 481.3560 KOps/s $\color{#35bf28}+4.08\%$
test_membership_stacked_nested_leaf 30.2100μs 2.0128μs 496.8158 KOps/s 479.8088 KOps/s $\color{#35bf28}+3.54\%$
test_membership_nested_last 25.7600μs 2.9804μs 335.5287 KOps/s 330.7205 KOps/s $\color{#35bf28}+1.45\%$
test_membership_nested_leaf_last 30.4400μs 3.0115μs 332.0569 KOps/s 331.4262 KOps/s $\color{#35bf28}+0.19\%$
test_membership_stacked_nested_last 32.6600μs 3.0446μs 328.4475 KOps/s 121.3527 KOps/s $\textbf{\color{#35bf28}+170.66\%}$
test_membership_stacked_nested_leaf_last 34.3100μs 3.0992μs 322.6592 KOps/s 120.4812 KOps/s $\textbf{\color{#35bf28}+167.81\%}$
test_nested_getleaf 42.1400μs 6.1630μs 162.2594 KOps/s 161.1854 KOps/s $\color{#35bf28}+0.67\%$
test_nested_get 29.9500μs 5.8099μs 172.1201 KOps/s 170.1450 KOps/s $\color{#35bf28}+1.16\%$
test_stacked_getleaf 37.3810μs 6.1919μs 161.5026 KOps/s 162.7523 KOps/s $\color{#d91a1a}-0.77\%$
test_stacked_get 40.2210μs 5.8276μs 171.5960 KOps/s 171.7953 KOps/s $\color{#d91a1a}-0.12\%$
test_nested_getitemleaf 36.1610μs 6.2227μs 160.7025 KOps/s 158.8154 KOps/s $\color{#35bf28}+1.19\%$
test_nested_getitem 34.4900μs 5.9531μs 167.9785 KOps/s 164.8474 KOps/s $\color{#35bf28}+1.90\%$
test_stacked_getitemleaf 31.8500μs 6.3491μs 157.5019 KOps/s 158.6990 KOps/s $\color{#d91a1a}-0.75\%$
test_stacked_getitem 38.0710μs 5.9430μs 168.2653 KOps/s 168.4355 KOps/s $\color{#d91a1a}-0.10\%$
test_lock_nested 9.4422ms 0.3918ms 2.5522 KOps/s 2.6205 KOps/s $\color{#d91a1a}-2.61\%$
test_lock_stack_nested 0.3968ms 0.3463ms 2.8879 KOps/s 2.9743 KOps/s $\color{#d91a1a}-2.91\%$
test_unlock_nested 0.6158ms 0.3143ms 3.1814 KOps/s 3.2088 KOps/s $\color{#d91a1a}-0.86\%$
test_unlock_stack_nested 0.3348ms 0.2828ms 3.5358 KOps/s 3.6343 KOps/s $\color{#d91a1a}-2.71\%$
test_flatten_speed 0.1135ms 76.0581μs 13.1478 KOps/s 13.1336 KOps/s $\color{#35bf28}+0.11\%$
test_unflatten_speed 0.3754ms 0.3223ms 3.1031 KOps/s 3.1277 KOps/s $\color{#d91a1a}-0.79\%$
test_common_ops 1.6223ms 0.5737ms 1.7431 KOps/s 1.6138 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_creation 0.1773ms 1.7168μs 582.4943 KOps/s 584.8912 KOps/s $\color{#d91a1a}-0.41\%$
test_creation_empty 36.2310μs 6.7709μs 147.6912 KOps/s 101.3763 KOps/s $\textbf{\color{#35bf28}+45.69\%}$
test_creation_nested_1 34.4910μs 8.4004μs 119.0414 KOps/s 85.2224 KOps/s $\textbf{\color{#35bf28}+39.68\%}$
test_creation_nested_2 33.7600μs 11.1247μs 89.8900 KOps/s 70.4545 KOps/s $\textbf{\color{#35bf28}+27.59\%}$
test_clone 48.2910μs 9.7304μs 102.7704 KOps/s 102.1360 KOps/s $\color{#35bf28}+0.62\%$
test_getitem[int] 1.6022ms 10.5777μs 94.5387 KOps/s 95.5557 KOps/s $\color{#d91a1a}-1.06\%$
test_getitem[slice_int] 92.4989ms 29.0792μs 34.3889 KOps/s 49.5912 KOps/s $\textbf{\color{#d91a1a}-30.66\%}$
test_getitem[range] 0.1261ms 35.9358μs 27.8274 KOps/s 28.1977 KOps/s $\color{#d91a1a}-1.31\%$
test_getitem[tuple] 0.1034ms 17.6316μs 56.7163 KOps/s 55.2112 KOps/s $\color{#35bf28}+2.73\%$
test_getitem[list] 0.2325ms 31.7809μs 31.4654 KOps/s 31.9408 KOps/s $\color{#d91a1a}-1.49\%$
test_setitem_dim[int] 46.1200μs 17.6093μs 56.7881 KOps/s 56.4113 KOps/s $\color{#35bf28}+0.67\%$
test_setitem_dim[slice_int] 59.1510μs 36.9392μs 27.0715 KOps/s 27.6661 KOps/s $\color{#d91a1a}-2.15\%$
test_setitem_dim[range] 82.0810μs 50.5946μs 19.7650 KOps/s 20.1840 KOps/s $\color{#d91a1a}-2.08\%$
test_setitem_dim[tuple] 52.6910μs 29.6849μs 33.6871 KOps/s 32.3427 KOps/s $\color{#35bf28}+4.16\%$
test_setitem 92.9510μs 13.1494μs 76.0492 KOps/s 66.3968 KOps/s $\textbf{\color{#35bf28}+14.54\%}$
test_set 93.9010μs 12.7397μs 78.4949 KOps/s 68.3891 KOps/s $\textbf{\color{#35bf28}+14.78\%}$
test_set_shared 1.5139ms 0.1494ms 6.6950 KOps/s 6.8248 KOps/s $\color{#d91a1a}-1.90\%$
test_update 0.4843ms 15.2095μs 65.7484 KOps/s 54.3369 KOps/s $\textbf{\color{#35bf28}+21.00\%}$
test_update_nested 92.1510μs 20.1265μs 49.6857 KOps/s 40.5151 KOps/s $\textbf{\color{#35bf28}+22.64\%}$
test_update__nested 1.0157ms 24.0740μs 41.5385 KOps/s 42.0905 KOps/s $\color{#d91a1a}-1.31\%$
test_set_nested 83.1210μs 14.2391μs 70.2291 KOps/s 63.6181 KOps/s $\textbf{\color{#35bf28}+10.39\%}$
test_set_nested_new 91.6810μs 16.2771μs 61.4360 KOps/s 54.3200 KOps/s $\textbf{\color{#35bf28}+13.10\%}$
test_select 0.1092ms 29.4089μs 34.0033 KOps/s 33.0807 KOps/s $\color{#35bf28}+2.79\%$
test_select_nested 81.0710μs 43.2063μs 23.1448 KOps/s 22.7806 KOps/s $\color{#35bf28}+1.60\%$
test_exclude_nested 86.3410μs 61.3504μs 16.2998 KOps/s 15.9524 KOps/s $\color{#35bf28}+2.18\%$
test_empty[True] 0.3214ms 0.2858ms 3.4995 KOps/s 3.4407 KOps/s $\color{#35bf28}+1.71\%$
test_empty[False] 3.2920μs 0.8285μs 1.2070 MOps/s 1.2151 MOps/s $\color{#d91a1a}-0.67\%$
test_to 84.1110μs 55.3234μs 18.0755 KOps/s 18.2756 KOps/s $\color{#d91a1a}-1.09\%$
test_to_nonblocking 91.5510μs 47.5690μs 21.0221 KOps/s 21.4671 KOps/s $\color{#d91a1a}-2.07\%$
test_unbind_speed 0.7763ms 0.2349ms 4.2572 KOps/s 4.2369 KOps/s $\color{#35bf28}+0.48\%$
test_unbind_speed_stack0 0.3148ms 0.2364ms 4.2307 KOps/s 4.3035 KOps/s $\color{#d91a1a}-1.69\%$
test_unbind_speed_stack1 92.6779ms 0.6645ms 1.5048 KOps/s 1.5159 KOps/s $\color{#d91a1a}-0.73\%$
test_split 93.6078ms 1.5816ms 632.2899 Ops/s 589.2031 Ops/s $\textbf{\color{#35bf28}+7.31\%}$
test_chunk 96.3926ms 1.7336ms 576.8188 Ops/s 699.9203 Ops/s $\textbf{\color{#d91a1a}-17.59\%}$
test_consolidate[False-None] 2.8109ms 2.6995ms 370.4403 Ops/s 338.3843 Ops/s $\textbf{\color{#35bf28}+9.47\%}$
test_consolidate[default-None] 1.7771ms 1.6580ms 603.1487 Ops/s 603.4071 Ops/s $\color{#d91a1a}-0.04\%$
test_consolidate[reduce-overhead-None] 1.7980ms 1.6932ms 590.6104 Ops/s 587.6905 Ops/s $\color{#35bf28}+0.50\%$
test_consolidate_njt[False-None] 6.6954ms 6.4060ms 156.1047 Ops/s 113.0592 Ops/s $\textbf{\color{#35bf28}+38.07\%}$
test_to[False-False-None] 1.8061ms 1.7096ms 584.9229 Ops/s 577.9310 Ops/s $\color{#35bf28}+1.21\%$
test_to[True-False-None] 1.5967ms 1.3213ms 756.8447 Ops/s 761.9630 Ops/s $\color{#d91a1a}-0.67\%$
test_to[within-False-None] 4.3529ms 4.1931ms 238.4851 Ops/s 241.2899 Ops/s $\color{#d91a1a}-1.16\%$
test_to[True-default-None] 5.7313ms 5.2969ms 188.7903 Ops/s 185.9608 Ops/s $\color{#35bf28}+1.52\%$
test_to_njt[False-False-None] 7.1372ms 6.8652ms 145.6619 Ops/s 138.6315 Ops/s $\textbf{\color{#35bf28}+5.07\%}$
test_to_njt[True-False-None] 5.6790ms 5.4797ms 182.4925 Ops/s 170.2447 Ops/s $\textbf{\color{#35bf28}+7.19\%}$
test_to_njt[within-False-None] 12.2037ms 11.9097ms 83.9654 Ops/s 78.1977 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_creation[device0] 0.4837ms 78.0010μs 12.8203 KOps/s 12.3361 KOps/s $\color{#35bf28}+3.93\%$
test_creation_from_tensor 0.5471ms 81.9047μs 12.2093 KOps/s 11.7471 KOps/s $\color{#35bf28}+3.93\%$
test_add_one[memmap_tensor0] 0.4109ms 6.1298μs 163.1375 KOps/s 169.3449 KOps/s $\color{#d91a1a}-3.67\%$
test_contiguous[memmap_tensor0] 1.9975μs 0.4024μs 2.4852 MOps/s 2.4746 MOps/s $\color{#35bf28}+0.43\%$
test_stack[memmap_tensor0] 23.7300μs 4.2948μs 232.8403 KOps/s 235.9435 KOps/s $\color{#d91a1a}-1.32\%$
test_memmaptd_index 1.7216ms 0.2458ms 4.0685 KOps/s 3.8500 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_memmaptd_index_astensor 0.8196ms 0.3078ms 3.2487 KOps/s 3.2516 KOps/s $\color{#d91a1a}-0.09\%$
test_memmaptd_index_op 0.9686ms 0.5343ms 1.8715 KOps/s 1.6995 KOps/s $\textbf{\color{#35bf28}+10.12\%}$
test_serialize_model 0.1311s 0.1297s 7.7107 Ops/s 7.6475 Ops/s $\color{#35bf28}+0.83\%$
test_serialize_model_pickle 1.3789s 1.2215s 0.8187 Ops/s 0.8408 Ops/s $\color{#d91a1a}-2.63\%$
test_serialize_weights 0.4359s 0.1739s 5.7517 Ops/s 7.6982 Ops/s $\textbf{\color{#d91a1a}-25.28\%}$
test_serialize_weights_returnearly 0.3406s 53.8358ms 18.5750 Ops/s 14.9864 Ops/s $\textbf{\color{#35bf28}+23.95\%}$
test_serialize_weights_pickle 1.3663s 1.2177s 0.8212 Ops/s 0.8229 Ops/s $\color{#d91a1a}-0.21\%$
test_reshape_pytree 54.7400μs 21.6667μs 46.1538 KOps/s 45.3976 KOps/s $\color{#35bf28}+1.67\%$
test_reshape_td 56.7200μs 25.5240μs 39.1788 KOps/s 36.5623 KOps/s $\textbf{\color{#35bf28}+7.16\%}$
test_view_pytree 58.1900μs 21.4968μs 46.5185 KOps/s 45.9099 KOps/s $\color{#35bf28}+1.33\%$
test_view_td 56.9910μs 29.2620μs 34.1741 KOps/s 32.2195 KOps/s $\textbf{\color{#35bf28}+6.07\%}$
test_unbind_pytree 56.7410μs 28.0456μs 35.6563 KOps/s 36.4826 KOps/s $\color{#d91a1a}-2.27\%$
test_unbind_td 0.7468ms 36.7736μs 27.1934 KOps/s 28.1247 KOps/s $\color{#d91a1a}-3.31\%$
test_split_pytree 56.8310μs 30.4582μs 32.8319 KOps/s 33.8506 KOps/s $\color{#d91a1a}-3.01\%$
test_split_td 0.9148ms 38.6956μs 25.8427 KOps/s 26.3790 KOps/s $\color{#d91a1a}-2.03\%$
test_add_pytree 71.0710μs 33.2015μs 30.1191 KOps/s 30.6251 KOps/s $\color{#d91a1a}-1.65\%$
test_add_td 0.1006ms 47.4132μs 21.0912 KOps/s 18.6816 KOps/s $\textbf{\color{#35bf28}+12.90\%}$
test_compile_add_one_nested[tensordict-compile] 0.1682ms 0.1170ms 8.5474 KOps/s 8.0917 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_compile_add_one_nested[tensordict-eager] 0.2182ms 0.1300ms 7.6906 KOps/s 7.6596 KOps/s $\color{#35bf28}+0.41\%$
test_compile_add_one_nested[pytree-compile] 0.2048ms 93.4081μs 10.7057 KOps/s 10.5897 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_one_nested[pytree-eager] 1.3786ms 0.1469ms 6.8090 KOps/s 6.6431 KOps/s $\color{#35bf28}+2.50\%$
test_compile_copy_nested[tensordict-compile] 64.8610μs 23.0413μs 43.4003 KOps/s 44.9489 KOps/s $\color{#d91a1a}-3.45\%$
test_compile_copy_nested[tensordict-eager] 61.4610μs 29.2733μs 34.1609 KOps/s 34.0204 KOps/s $\color{#35bf28}+0.41\%$
test_compile_copy_nested[pytree-compile] 0.3612ms 64.4536μs 15.5150 KOps/s 15.3358 KOps/s $\color{#35bf28}+1.17\%$
test_compile_copy_nested[pytree-eager] 85.3410μs 48.5270μs 20.6071 KOps/s 20.1437 KOps/s $\color{#35bf28}+2.30\%$
test_compile_add_one_flat[tensordict-compile] 0.1982ms 0.1400ms 7.1407 KOps/s 7.1824 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_flat[tensordict-eager] 0.3052ms 0.2149ms 4.6533 KOps/s 4.7124 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_add_one_flat[tensorclass-compile] 0.1552ms 96.0676μs 10.4093 KOps/s 10.3099 KOps/s $\color{#35bf28}+0.96\%$
test_compile_add_one_flat[tensorclass-eager] 0.1078ms 52.3240μs 19.1117 KOps/s 18.3835 KOps/s $\color{#35bf28}+3.96\%$
test_compile_add_one_flat[pytree-compile] 0.2564ms 0.1347ms 7.4262 KOps/s 7.1213 KOps/s $\color{#35bf28}+4.28\%$
test_compile_add_one_flat[pytree-eager] 0.5365ms 0.4694ms 2.1302 KOps/s 2.0750 KOps/s $\color{#35bf28}+2.66\%$
test_compile_add_self_flat[tensordict-eager] 0.3774ms 0.2595ms 3.8541 KOps/s 3.9019 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_add_self_flat[tensordict-compile] 0.2362ms 0.1441ms 6.9406 KOps/s 7.2237 KOps/s $\color{#d91a1a}-3.92\%$
test_compile_add_self_flat[tensorclass-eager] 0.1430ms 64.7464μs 15.4449 KOps/s 15.5796 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_add_self_flat[tensorclass-compile] 0.1330ms 97.0894μs 10.2998 KOps/s 9.9554 KOps/s $\color{#35bf28}+3.46\%$
test_compile_add_self_flat[pytree-eager] 0.4649ms 0.4024ms 2.4853 KOps/s 2.4354 KOps/s $\color{#35bf28}+2.05\%$
test_compile_add_self_flat[pytree-compile] 0.1800ms 0.1327ms 7.5365 KOps/s 7.4752 KOps/s $\color{#35bf28}+0.82\%$
test_compile_copy_flat[tensordict-compile] 49.9210μs 19.4632μs 51.3791 KOps/s 51.6743 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_flat[tensordict-eager] 80.3010μs 32.1188μs 31.1344 KOps/s 32.4944 KOps/s $\color{#d91a1a}-4.19\%$
test_compile_copy_flat[pytree-compile] 0.2262ms 70.7335μs 14.1376 KOps/s 13.9958 KOps/s $\color{#35bf28}+1.01\%$
test_compile_copy_flat[pytree-eager] 85.8110μs 51.9387μs 19.2535 KOps/s 19.4508 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_assign_and_add[tensordict-compile] 1.6157ms 0.3876ms 2.5802 KOps/s 2.2549 KOps/s $\textbf{\color{#35bf28}+14.43\%}$
test_compile_assign_and_add[tensordict-eager] 2.9101ms 2.6001ms 384.6048 Ops/s 378.9625 Ops/s $\color{#35bf28}+1.49\%$
test_compile_assign_and_add[pytree-compile] 1.5788ms 0.3782ms 2.6442 KOps/s 2.2329 KOps/s $\textbf{\color{#35bf28}+18.42\%}$
test_compile_assign_and_add[pytree-eager] 2.6646ms 2.5559ms 391.2584 Ops/s 382.8438 Ops/s $\color{#35bf28}+2.20\%$
test_compile_indexing[tensor-tensordict-compile] 0.6791ms 0.1100ms 9.0912 KOps/s 8.9042 KOps/s $\color{#35bf28}+2.10\%$
test_compile_indexing[tensor-tensordict-eager] 0.5653ms 75.0006μs 13.3332 KOps/s 12.7763 KOps/s $\color{#35bf28}+4.36\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5097ms 0.1036ms 9.6489 KOps/s 9.6685 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1108ms 65.6694μs 15.2278 KOps/s 14.9550 KOps/s $\color{#35bf28}+1.82\%$
test_compile_indexing[tensor-pytree-compile] 1.0791ms 0.1045ms 9.5669 KOps/s 9.0355 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_compile_indexing[tensor-pytree-eager] 0.4539ms 65.7063μs 15.2192 KOps/s 13.9731 KOps/s $\textbf{\color{#35bf28}+8.92\%}$
test_compile_indexing[slice-tensordict-compile] 0.1616ms 98.4602μs 10.1564 KOps/s 10.0564 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[slice-tensordict-eager] 0.4068ms 16.9414μs 59.0270 KOps/s 58.5370 KOps/s $\color{#35bf28}+0.84\%$
test_compile_indexing[slice-tensorclass-compile] 0.4963ms 93.9852μs 10.6400 KOps/s 10.4675 KOps/s $\color{#35bf28}+1.65\%$
test_compile_indexing[slice-tensorclass-eager] 49.0310μs 15.9671μs 62.6288 KOps/s 62.0597 KOps/s $\color{#35bf28}+0.92\%$
test_compile_indexing[slice-pytree-compile] 0.4938ms 98.4108μs 10.1615 KOps/s 10.4718 KOps/s $\color{#d91a1a}-2.96\%$
test_compile_indexing[slice-pytree-eager] 0.3972ms 16.3005μs 61.3477 KOps/s 63.8963 KOps/s $\color{#d91a1a}-3.99\%$
test_compile_indexing[int-tensordict-compile] 0.4895ms 0.1016ms 9.8413 KOps/s 9.9525 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_indexing[int-tensordict-eager] 0.5797ms 16.7535μs 59.6889 KOps/s 58.9920 KOps/s $\color{#35bf28}+1.18\%$
test_compile_indexing[int-tensorclass-compile] 0.4961ms 97.5043μs 10.2560 KOps/s 10.4564 KOps/s $\color{#d91a1a}-1.92\%$
test_compile_indexing[int-tensorclass-eager] 0.1604ms 16.7265μs 59.7853 KOps/s 63.3300 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_compile_indexing[int-pytree-compile] 0.4942ms 97.7994μs 10.2250 KOps/s 10.4737 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_indexing[int-pytree-eager] 0.4032ms 17.4841μs 57.1948 KOps/s 63.9671 KOps/s $\textbf{\color{#d91a1a}-10.59\%}$
test_mod_add[eager] 0.4330ms 37.4718μs 26.6867 KOps/s 26.4948 KOps/s $\color{#35bf28}+0.72\%$
test_mod_add[compile] 0.1420ms 82.6933μs 12.0929 KOps/s 12.3534 KOps/s $\color{#d91a1a}-2.11\%$
test_mod_add[compile-overhead] 0.3234ms 0.1634ms 6.1183 KOps/s 5.7425 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_mod_wrap[eager] 0.6521ms 0.2368ms 4.2232 KOps/s 3.8336 KOps/s $\textbf{\color{#35bf28}+10.16\%}$
test_mod_wrap[compile] 0.3654ms 0.2767ms 3.6144 KOps/s 3.5349 KOps/s $\color{#35bf28}+2.25\%$
test_mod_wrap[compile-overhead] 7.3685ms 3.7757ms 264.8516 Ops/s 273.1945 Ops/s $\color{#d91a1a}-3.05\%$
test_mod_wrap_and_backward[eager] 1.4185ms 1.3058ms 765.7954 Ops/s 700.7490 Ops/s $\textbf{\color{#35bf28}+9.28\%}$
test_mod_wrap_and_backward[compile] 1.3500ms 1.2473ms 801.7395 Ops/s 793.8073 Ops/s $\color{#35bf28}+1.00\%$
test_mod_wrap_and_backward[compile-overhead] 1.3538ms 0.9209ms 1.0859 KOps/s 1.0796 KOps/s $\color{#35bf28}+0.58\%$
test_seq_add[eager] 0.1774ms 0.1130ms 8.8478 KOps/s 8.2173 KOps/s $\textbf{\color{#35bf28}+7.67\%}$
test_seq_add[compile] 0.1592ms 86.4349μs 11.5694 KOps/s 11.2402 KOps/s $\color{#35bf28}+2.93\%$
test_seq_add[compile-overhead] 0.1668ms 0.1288ms 7.7634 KOps/s 7.7997 KOps/s $\color{#d91a1a}-0.46\%$
test_seq_wrap[eager] 0.4756ms 0.4196ms 2.3830 KOps/s 2.3703 KOps/s $\color{#35bf28}+0.53\%$
test_seq_wrap[compile] 0.3819ms 0.3018ms 3.3131 KOps/s 3.3467 KOps/s $\color{#d91a1a}-1.00\%$
test_seq_wrap[compile-overhead] 0.2715ms 0.2210ms 4.5242 KOps/s 4.4827 KOps/s $\color{#35bf28}+0.93\%$
test_func_call_runtime[False-eager] 0.9344ms 0.7591ms 1.3174 KOps/s 1.4089 KOps/s $\textbf{\color{#d91a1a}-6.49\%}$
test_func_call_runtime[False-compile] 0.8215ms 0.7276ms 1.3743 KOps/s 1.3530 KOps/s $\color{#35bf28}+1.58\%$
test_func_call_runtime[False-compile-overhead] 0.4200ms 0.3591ms 2.7848 KOps/s 2.7764 KOps/s $\color{#35bf28}+0.31\%$
test_func_call_runtime[True-eager] 0.9393ms 0.8614ms 1.1609 KOps/s 1.1210 KOps/s $\color{#35bf28}+3.57\%$
test_func_call_runtime[True-compile] 0.8178ms 0.7590ms 1.3175 KOps/s 1.3213 KOps/s $\color{#d91a1a}-0.29\%$
test_func_call_runtime[True-compile-overhead] 0.4736ms 0.3811ms 2.6243 KOps/s 2.6352 KOps/s $\color{#d91a1a}-0.41\%$
test_func_call_cm_runtime[False-eager] 0.8436ms 0.7429ms 1.3460 KOps/s 1.3988 KOps/s $\color{#d91a1a}-3.77\%$
test_func_call_cm_runtime[False-compile] 1.0780ms 0.7363ms 1.3582 KOps/s 1.3306 KOps/s $\color{#35bf28}+2.07\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4258ms 0.3627ms 2.7574 KOps/s 2.7736 KOps/s $\color{#d91a1a}-0.58\%$
test_func_call_cm_runtime[True-eager] 1.0831ms 0.9622ms 1.0393 KOps/s 1.0177 KOps/s $\color{#35bf28}+2.12\%$
test_func_call_cm_runtime[True-compile] 0.9136ms 0.7786ms 1.2844 KOps/s 1.2662 KOps/s $\color{#35bf28}+1.44\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4597ms 0.4068ms 2.4583 KOps/s 2.4411 KOps/s $\color{#35bf28}+0.71\%$
test_vmap_func_call_cm_runtime[eager] 2.4707ms 2.0003ms 499.9154 Ops/s 492.0025 Ops/s $\color{#35bf28}+1.61\%$
test_vmap_func_call_cm_runtime[compile] 0.9018ms 0.7952ms 1.2575 KOps/s 1.2487 KOps/s $\color{#35bf28}+0.71\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4515ms 0.4075ms 2.4539 KOps/s 2.4356 KOps/s $\color{#35bf28}+0.75\%$
test_distributed 6.5148ms 0.1607ms 6.2230 KOps/s 8.5706 KOps/s $\textbf{\color{#d91a1a}-27.39\%}$
test_tdmodule 72.0410μs 18.3128μs 54.6066 KOps/s 48.1304 KOps/s $\textbf{\color{#35bf28}+13.46\%}$
test_tdmodule_dispatch 54.2810μs 33.6386μs 29.7278 KOps/s 26.7061 KOps/s $\textbf{\color{#35bf28}+11.31\%}$
test_tdseq 39.9900μs 19.9352μs 50.1625 KOps/s 45.4784 KOps/s $\textbf{\color{#35bf28}+10.30\%}$
test_tdseq_dispatch 58.5810μs 36.5098μs 27.3899 KOps/s 24.4418 KOps/s $\textbf{\color{#35bf28}+12.06\%}$
test_instantiation_functorch 1.8502ms 1.5228ms 656.6687 Ops/s 654.5387 Ops/s $\color{#35bf28}+0.33\%$
test_exec_functorch 0.5768ms 0.1436ms 6.9631 KOps/s 6.9628 KOps/s $+0.00\%$
test_exec_functional_call 0.1777ms 0.1334ms 7.4975 KOps/s 7.7472 KOps/s $\color{#d91a1a}-3.22\%$
test_exec_td_decorator 0.5926ms 0.1811ms 5.5231 KOps/s 5.6462 KOps/s $\color{#d91a1a}-2.18\%$
test_vmap_mlp_speed_decorator[True-True] 1.0924ms 0.6642ms 1.5055 KOps/s 1.4951 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_mlp_speed_decorator[True-False] 1.0549ms 0.6581ms 1.5195 KOps/s 1.4957 KOps/s $\color{#35bf28}+1.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.9990ms 0.5974ms 1.6739 KOps/s 1.7441 KOps/s $\color{#d91a1a}-4.02\%$
test_vmap_mlp_speed_decorator[False-False] 0.9940ms 0.5975ms 1.6738 KOps/s 1.7333 KOps/s $\color{#d91a1a}-3.43\%$
test_vmap_transformer_speed_decorator[True-True] 19.1701ms 18.5447ms 53.9237 Ops/s 53.9059 Ops/s $\color{#35bf28}+0.03\%$
test_vmap_transformer_speed_decorator[True-False] 19.2156ms 18.5828ms 53.8133 Ops/s 53.7399 Ops/s $\color{#35bf28}+0.14\%$
test_vmap_transformer_speed_decorator[False-True] 19.0022ms 18.4290ms 54.2622 Ops/s 54.2508 Ops/s $\color{#35bf28}+0.02\%$
test_vmap_transformer_speed_decorator[False-False] 19.4612ms 18.4217ms 54.2839 Ops/s 54.1243 Ops/s $\color{#35bf28}+0.30\%$
test_to_module_speed[True] 1.4318ms 0.9759ms 1.0247 KOps/s 1.0315 KOps/s $\color{#d91a1a}-0.66\%$
test_to_module_speed[False] 1.3527ms 0.9545ms 1.0476 KOps/s 1.0378 KOps/s $\color{#35bf28}+0.95\%$
test_tc_init 66.4300μs 37.2552μs 26.8419 KOps/s 25.2303 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_tc_init_nested 0.2195ms 75.9624μs 13.1644 KOps/s 12.0801 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_tc_first_layer_tensor 54.5050μs 0.6952μs 1.4384 MOps/s 1.4140 MOps/s $\color{#35bf28}+1.73\%$
test_tc_first_layer_nontensor 24.0210μs 2.3328μs 428.6686 KOps/s 431.2735 KOps/s $\color{#d91a1a}-0.60\%$
test_tc_second_layer_tensor 96.1088μs 1.4068μs 710.8156 KOps/s 716.6233 KOps/s $\color{#d91a1a}-0.81\%$
test_tc_second_layer_nontensor 26.0210μs 3.0323μs 329.7790 KOps/s 329.4427 KOps/s $\color{#35bf28}+0.10\%$
test_unbind 0.2202s 10.1671ms 98.3568 Ops/s 144.6915 Ops/s $\textbf{\color{#d91a1a}-32.02\%}$
test_full_like 9.8872ms 9.2192ms 108.4694 Ops/s 106.3454 Ops/s $\color{#35bf28}+2.00\%$
test_zeros_like 5.2110ms 4.3129ms 231.8607 Ops/s 115.2723 Ops/s $\textbf{\color{#35bf28}+101.14\%}$
test_ones_like 9.3240ms 7.1934ms 139.0157 Ops/s 232.0149 Ops/s $\textbf{\color{#d91a1a}-40.08\%}$
test_clone 11.6714ms 9.0661ms 110.3010 Ops/s 158.6598 Ops/s $\textbf{\color{#d91a1a}-30.48\%}$
test_squeeze 60.4110μs 9.3888μs 106.5095 KOps/s 107.1121 KOps/s $\color{#d91a1a}-0.56\%$
test_unsqueeze 0.1187ms 69.9956μs 14.2866 KOps/s 14.0659 KOps/s $\color{#35bf28}+1.57\%$
test_split 0.5462ms 0.1538ms 6.5011 KOps/s 6.3491 KOps/s $\color{#35bf28}+2.39\%$
test_permute 0.3162ms 0.1780ms 5.6179 KOps/s 5.5258 KOps/s $\color{#35bf28}+1.67\%$
test_stack 51.1102ms 50.8321ms 19.6726 Ops/s 18.8417 Ops/s $\color{#35bf28}+4.41\%$
test_cat 50.9168ms 50.6069ms 19.7601 Ops/s 19.0072 Ops/s $\color{#35bf28}+3.96\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 18, 2024
ghstack-source-id: 9f27d6b67f7b0946f70d12efcb677e6139bd1ec1
Pull Request resolved: #1144