[BugFix] Better return_log_prob=True for tensordict outputs #1155

Open
wants to merge 1 commit into
base: gh/vmoens/41/base

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 20, 2024
ghstack-source-id: 977af3880f39cb341c1c715f1b8c9d59b7c580a0
Pull Request resolved: #1155
@facebook-github-bot added the CLA Signed label Dec 20, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: 8. Worsened: 32.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 47.9100μs 21.3336μs 46.8744 KOps/s 49.8174 KOps/s $\textbf{\color{#d91a1a}-5.91\%}$
test_plain_set_stack_nested 48.1600μs 21.6184μs 46.2569 KOps/s 48.9276 KOps/s $\textbf{\color{#d91a1a}-5.46\%}$
test_plain_set_nested_inplace 87.6740μs 23.4900μs 42.5713 KOps/s 45.1825 KOps/s $\textbf{\color{#d91a1a}-5.78\%}$
test_plain_set_stack_nested_inplace 74.2980μs 23.4132μs 42.7110 KOps/s 45.3554 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_items 24.2350μs 4.1159μs 242.9601 KOps/s 239.2826 KOps/s $\color{#35bf28}+1.54\%$
test_items_nested 0.6209ms 0.4082ms 2.4499 KOps/s 2.4562 KOps/s $\color{#d91a1a}-0.26\%$
test_items_nested_locked 0.6390ms 0.4090ms 2.4448 KOps/s 2.4456 KOps/s $\color{#d91a1a}-0.03\%$
test_items_nested_leaf 0.2281ms 77.7930μs 12.8546 KOps/s 12.9459 KOps/s $\color{#d91a1a}-0.70\%$
test_items_stack_nested 0.7763ms 0.4124ms 2.4248 KOps/s 2.4367 KOps/s $\color{#d91a1a}-0.49\%$
test_items_stack_nested_leaf 0.1415ms 80.8882μs 12.3627 KOps/s 12.5679 KOps/s $\color{#d91a1a}-1.63\%$
test_items_stack_nested_locked 0.6232ms 0.4118ms 2.4282 KOps/s 2.4042 KOps/s $\color{#35bf28}+1.00\%$
test_keys 25.8080μs 3.5065μs 285.1841 KOps/s 281.6261 KOps/s $\color{#35bf28}+1.26\%$
test_keys_nested 0.2733ms 0.1645ms 6.0806 KOps/s 6.0470 KOps/s $\color{#35bf28}+0.56\%$
test_keys_nested_locked 1.9496ms 0.1713ms 5.8371 KOps/s 5.7983 KOps/s $\color{#35bf28}+0.67\%$
test_keys_nested_leaf 0.2500ms 0.1439ms 6.9488 KOps/s 6.8634 KOps/s $\color{#35bf28}+1.24\%$
test_keys_stack_nested 0.3325ms 0.1621ms 6.1689 KOps/s 6.0532 KOps/s $\color{#35bf28}+1.91\%$
test_keys_stack_nested_leaf 0.2424ms 0.1388ms 7.2057 KOps/s 6.9218 KOps/s $\color{#35bf28}+4.10\%$
test_keys_stack_nested_locked 0.3223ms 0.1678ms 5.9588 KOps/s 5.8501 KOps/s $\color{#35bf28}+1.86\%$
test_values 34.1736μs 1.0030μs 996.9813 KOps/s 958.6461 KOps/s $\color{#35bf28}+4.00\%$
test_values_nested 0.1919ms 64.4291μs 15.5209 KOps/s 15.9146 KOps/s $\color{#d91a1a}-2.47\%$
test_values_nested_locked 0.2227ms 64.2160μs 15.5724 KOps/s 15.6650 KOps/s $\color{#d91a1a}-0.59\%$
test_values_nested_leaf 0.1267ms 72.9552μs 13.7070 KOps/s 13.1228 KOps/s $\color{#35bf28}+4.45\%$
test_values_stack_nested 0.2030ms 64.1536μs 15.5876 KOps/s 15.7727 KOps/s $\color{#d91a1a}-1.17\%$
test_values_stack_nested_leaf 0.2850ms 72.2023μs 13.8500 KOps/s 13.8607 KOps/s $\color{#d91a1a}-0.08\%$
test_values_stack_nested_locked 0.2349ms 63.4035μs 15.7720 KOps/s 15.4258 KOps/s $\color{#35bf28}+2.24\%$
test_membership 7.1790μs 0.7534μs 1.3274 MOps/s 1.3207 MOps/s $\color{#35bf28}+0.51\%$
test_membership_nested 20.9990μs 2.9734μs 336.3114 KOps/s 338.5047 KOps/s $\color{#d91a1a}-0.65\%$
test_membership_nested_leaf 78.7570μs 3.0895μs 323.6740 KOps/s 336.0800 KOps/s $\color{#d91a1a}-3.69\%$
test_membership_stacked_nested 23.4030μs 2.9838μs 335.1478 KOps/s 334.7820 KOps/s $\color{#35bf28}+0.11\%$
test_membership_stacked_nested_leaf 33.4020μs 2.9868μs 334.8012 KOps/s 332.8831 KOps/s $\color{#35bf28}+0.58\%$
test_membership_nested_last 48.8310μs 4.5324μs 220.6316 KOps/s 223.4093 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_nested_leaf_last 62.0960μs 4.5408μs 220.2273 KOps/s 223.7835 KOps/s $\color{#d91a1a}-1.59\%$
test_membership_stacked_nested_last 47.9890μs 13.8638μs 72.1301 KOps/s 227.8883 KOps/s $\textbf{\color{#d91a1a}-68.35\%}$
test_membership_stacked_nested_leaf_last 63.3880μs 14.0519μs 71.1645 KOps/s 221.8625 KOps/s $\textbf{\color{#d91a1a}-67.92\%}$
test_nested_getleaf 61.0540μs 11.0283μs 90.6760 KOps/s 94.3081 KOps/s $\color{#d91a1a}-3.85\%$
test_nested_get 36.9790μs 10.4804μs 95.4167 KOps/s 98.3607 KOps/s $\color{#d91a1a}-2.99\%$
test_stacked_getleaf 34.3740μs 10.8164μs 92.4520 KOps/s 93.4992 KOps/s $\color{#d91a1a}-1.12\%$
test_stacked_get 60.0420μs 10.3701μs 96.4308 KOps/s 97.2415 KOps/s $\color{#d91a1a}-0.83\%$
test_nested_getitemleaf 54.5310μs 11.3578μs 88.0452 KOps/s 87.8055 KOps/s $\color{#35bf28}+0.27\%$
test_nested_getitem 68.0870μs 11.0154μs 90.7816 KOps/s 94.7853 KOps/s $\color{#d91a1a}-4.22\%$
test_stacked_getitemleaf 67.9570μs 10.9995μs 90.9132 KOps/s 89.5598 KOps/s $\color{#35bf28}+1.51\%$
test_stacked_getitem 33.7830μs 10.4733μs 95.4811 KOps/s 96.6967 KOps/s $\color{#d91a1a}-1.26\%$
test_lock_nested 4.7230ms 0.4889ms 2.0453 KOps/s 2.1790 KOps/s $\textbf{\color{#d91a1a}-6.14\%}$
test_lock_stack_nested 0.8050ms 0.4351ms 2.2982 KOps/s 2.3371 KOps/s $\color{#d91a1a}-1.66\%$
test_unlock_nested 0.8165ms 0.3864ms 2.5882 KOps/s 2.6034 KOps/s $\color{#d91a1a}-0.58\%$
test_unlock_stack_nested 0.4324ms 0.3423ms 2.9216 KOps/s 2.8507 KOps/s $\color{#35bf28}+2.49\%$
test_flatten_speed 0.1689ms 0.1005ms 9.9455 KOps/s 10.0334 KOps/s $\color{#d91a1a}-0.88\%$
test_unflatten_speed 0.7572ms 0.5475ms 1.8265 KOps/s 1.8511 KOps/s $\color{#d91a1a}-1.33\%$
test_common_ops 4.2833ms 0.8302ms 1.2045 KOps/s 1.2853 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_creation 82.0730μs 2.5094μs 398.5000 KOps/s 396.7859 KOps/s $\color{#35bf28}+0.43\%$
test_creation_empty 75.1910μs 12.9991μs 76.9285 KOps/s 96.3710 KOps/s $\textbf{\color{#d91a1a}-20.17\%}$
test_creation_nested_1 48.4910μs 16.0319μs 62.3756 KOps/s 74.1029 KOps/s $\textbf{\color{#d91a1a}-15.83\%}$
test_creation_nested_2 51.9570μs 20.8550μs 47.9502 KOps/s 54.9910 KOps/s $\textbf{\color{#d91a1a}-12.80\%}$
test_clone 86.8620μs 13.8040μs 72.4425 KOps/s 71.8883 KOps/s $\color{#35bf28}+0.77\%$
test_getitem[int] 0.7822ms 13.2255μs 75.6115 KOps/s 77.2795 KOps/s $\color{#d91a1a}-2.16\%$
test_getitem[slice_int] 0.1406ms 25.3576μs 39.4360 KOps/s 39.1484 KOps/s $\color{#35bf28}+0.73\%$
test_getitem[range] 0.2562ms 50.1584μs 19.9369 KOps/s 21.1134 KOps/s $\textbf{\color{#d91a1a}-5.57\%}$
test_getitem[tuple] 0.1650ms 21.0410μs 47.5262 KOps/s 48.4023 KOps/s $\color{#d91a1a}-1.81\%$
test_getitem[list] 0.3415ms 45.4734μs 21.9909 KOps/s 23.6574 KOps/s $\textbf{\color{#d91a1a}-7.04\%}$
test_setitem_dim[int] 64.4600μs 27.0442μs 36.9766 KOps/s 40.7396 KOps/s $\textbf{\color{#d91a1a}-9.24\%}$
test_setitem_dim[slice_int] 0.1022ms 54.8198μs 18.2416 KOps/s 19.8438 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_setitem_dim[range] 0.1224ms 76.6453μs 13.0471 KOps/s 14.1630 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_setitem_dim[tuple] 92.0320μs 41.4745μs 24.1112 KOps/s 25.0439 KOps/s $\color{#d91a1a}-3.72\%$
test_setitem 96.9110μs 21.9904μs 45.4745 KOps/s 48.3552 KOps/s $\textbf{\color{#d91a1a}-5.96\%}$
test_set 75.6310μs 21.4016μs 46.7256 KOps/s 49.2171 KOps/s $\textbf{\color{#d91a1a}-5.06\%}$
test_set_shared 1.1792ms 0.1701ms 5.8774 KOps/s 5.9126 KOps/s $\color{#d91a1a}-0.60\%$
test_update 0.1715ms 25.0114μs 39.9817 KOps/s 44.4039 KOps/s $\textbf{\color{#d91a1a}-9.96\%}$
test_update_nested 0.1510ms 35.3231μs 28.3101 KOps/s 30.1834 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_update__nested 0.2859ms 35.7333μs 27.9851 KOps/s 28.4643 KOps/s $\color{#d91a1a}-1.68\%$
test_set_nested 95.5080μs 23.8901μs 41.8583 KOps/s 43.7146 KOps/s $\color{#d91a1a}-4.25\%$
test_set_nested_new 0.1319ms 28.8756μs 34.6313 KOps/s 36.5333 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_select 0.1969ms 46.1876μs 21.6508 KOps/s 22.9625 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_select_nested 0.1578ms 64.5282μs 15.4971 KOps/s 15.7265 KOps/s $\color{#d91a1a}-1.46\%$
test_exclude_nested 0.1785ms 84.9301μs 11.7744 KOps/s 12.1448 KOps/s $\color{#d91a1a}-3.05\%$
test_empty[True] 0.6184ms 0.4193ms 2.3847 KOps/s 2.4424 KOps/s $\color{#d91a1a}-2.36\%$
test_empty[False] 13.8108μs 1.3852μs 721.9035 KOps/s 735.0361 KOps/s $\color{#d91a1a}-1.79\%$
test_unbind_speed 0.4127ms 0.2807ms 3.5627 KOps/s 3.6925 KOps/s $\color{#d91a1a}-3.51\%$
test_unbind_speed_stack0 0.4033ms 0.2713ms 3.6854 KOps/s 3.7326 KOps/s $\color{#d91a1a}-1.26\%$
test_unbind_speed_stack1 0.1125s 0.8352ms 1.1973 KOps/s 1.5069 KOps/s $\textbf{\color{#d91a1a}-20.55\%}$
test_split 0.1110s 1.8266ms 547.4759 Ops/s 558.8898 Ops/s $\color{#d91a1a}-2.04\%$
test_chunk 1.8005ms 1.6494ms 606.2857 Ops/s 560.3434 Ops/s $\textbf{\color{#35bf28}+8.20\%}$
test_consolidate_njt[False-None] 0.1173s 9.0842ms 110.0811 Ops/s 122.4392 Ops/s $\textbf{\color{#d91a1a}-10.09\%}$
test_creation[device0] 0.2767ms 89.7701μs 11.1396 KOps/s 10.9753 KOps/s $\color{#35bf28}+1.50\%$
test_creation_from_tensor 3.4384ms 94.4030μs 10.5929 KOps/s 10.5627 KOps/s $\color{#35bf28}+0.29\%$
test_add_one[memmap_tensor0] 0.2030ms 4.8598μs 205.7685 KOps/s 207.0374 KOps/s $\color{#d91a1a}-0.61\%$
test_contiguous[memmap_tensor0] 26.0590μs 0.5053μs 1.9790 MOps/s 1.9411 MOps/s $\color{#35bf28}+1.96\%$
test_stack[memmap_tensor0] 40.9960μs 3.4517μs 289.7125 KOps/s 282.8126 KOps/s $\color{#35bf28}+2.44\%$
test_memmaptd_index 0.4909ms 0.2431ms 4.1128 KOps/s 4.0676 KOps/s $\color{#35bf28}+1.11\%$
test_memmaptd_index_astensor 0.6717ms 0.3307ms 3.0239 KOps/s 2.9756 KOps/s $\color{#35bf28}+1.62\%$
test_memmaptd_index_op 1.0369ms 0.6089ms 1.6423 KOps/s 1.7016 KOps/s $\color{#d91a1a}-3.48\%$
test_serialize_model 0.1245s 0.1138s 8.7866 Ops/s 7.2741 Ops/s $\textbf{\color{#35bf28}+20.79\%}$
test_serialize_model_pickle 0.4605s 0.3902s 2.5626 Ops/s 2.4753 Ops/s $\color{#35bf28}+3.53\%$
test_serialize_weights 0.1271s 0.1170s 8.5467 Ops/s 8.7139 Ops/s $\color{#d91a1a}-1.92\%$
test_serialize_weights_returnearly 0.1603s 0.1541s 6.4905 Ops/s 6.3551 Ops/s $\color{#35bf28}+2.13\%$
test_serialize_weights_pickle 0.4819s 0.4093s 2.4432 Ops/s 2.5119 Ops/s $\color{#d91a1a}-2.73\%$
test_serialize_weights_filesystem 0.1541s 0.1410s 7.0920 Ops/s 6.2244 Ops/s $\textbf{\color{#35bf28}+13.94\%}$
test_serialize_model_filesystem 0.1570s 0.1501s 6.6643 Ops/s 6.6442 Ops/s $\color{#35bf28}+0.30\%$
test_reshape_pytree 80.9110μs 27.2338μs 36.7191 KOps/s 36.6552 KOps/s $\color{#35bf28}+0.17\%$
test_reshape_td 0.1286ms 34.6806μs 28.8346 KOps/s 30.1133 KOps/s $\color{#d91a1a}-4.25\%$
test_view_pytree 0.1182ms 27.2903μs 36.6430 KOps/s 36.6567 KOps/s $\color{#d91a1a}-0.04\%$
test_view_td 85.6590μs 39.2746μs 25.4618 KOps/s 25.8181 KOps/s $\color{#d91a1a}-1.38\%$
test_unbind_pytree 0.1183ms 29.9681μs 33.3688 KOps/s 33.2536 KOps/s $\color{#35bf28}+0.35\%$
test_unbind_td 0.3528ms 41.9076μs 23.8620 KOps/s 25.2415 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_split_pytree 64.6210μs 30.0861μs 33.2380 KOps/s 32.4036 KOps/s $\color{#35bf28}+2.57\%$
test_split_td 0.5649ms 46.7654μs 21.3833 KOps/s 21.9048 KOps/s $\color{#d91a1a}-2.38\%$
test_add_pytree 90.1880μs 35.7805μs 27.9482 KOps/s 27.0208 KOps/s $\color{#35bf28}+3.43\%$
test_add_td 0.1527ms 59.2009μs 16.8916 KOps/s 16.9197 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_add_one_nested[tensordict-compile] 0.2088ms 63.7769μs 15.6797 KOps/s 16.2757 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_add_one_nested[tensordict-eager] 0.4028ms 0.1699ms 5.8848 KOps/s 5.9393 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_add_one_nested[pytree-compile] 0.1411ms 47.3808μs 21.1056 KOps/s 22.4721 KOps/s $\textbf{\color{#d91a1a}-6.08\%}$
test_compile_add_one_nested[pytree-eager] 0.2646ms 0.1175ms 8.5097 KOps/s 8.3062 KOps/s $\color{#35bf28}+2.45\%$
test_compile_copy_nested[tensordict-compile] 80.2790μs 27.0957μs 36.9063 KOps/s 38.5984 KOps/s $\color{#d91a1a}-4.38\%$
test_compile_copy_nested[tensordict-eager] 0.1432ms 60.4009μs 16.5560 KOps/s 16.9899 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_copy_nested[pytree-compile] 0.1762ms 80.4950μs 12.4231 KOps/s 12.4327 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_copy_nested[pytree-eager] 0.1718ms 68.3551μs 14.6295 KOps/s 14.4715 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_one_flat[tensordict-compile] 0.1878ms 0.1046ms 9.5558 KOps/s 9.5503 KOps/s $\color{#35bf28}+0.06\%$
test_compile_add_one_flat[tensordict-eager] 0.4427ms 0.2135ms 4.6844 KOps/s 4.6548 KOps/s $\color{#35bf28}+0.64\%$
test_compile_add_one_flat[tensorclass-compile] 0.1083ms 44.9149μs 22.2643 KOps/s 22.7140 KOps/s $\color{#d91a1a}-1.98\%$
test_compile_add_one_flat[tensorclass-eager] 0.5041ms 65.0185μs 15.3802 KOps/s 15.7030 KOps/s $\color{#d91a1a}-2.06\%$
test_compile_add_one_flat[pytree-compile] 0.2447ms 0.1041ms 9.6083 KOps/s 9.8441 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_add_one_flat[pytree-eager] 0.3247ms 0.2019ms 4.9538 KOps/s 4.9472 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_self_flat[tensordict-eager] 0.4169ms 0.2325ms 4.3009 KOps/s 4.3000 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_self_flat[tensordict-compile] 0.3041ms 0.1052ms 9.5063 KOps/s 9.5996 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_add_self_flat[tensorclass-eager] 0.1902ms 59.9669μs 16.6759 KOps/s 17.0066 KOps/s $\color{#d91a1a}-1.94\%$
test_compile_add_self_flat[tensorclass-compile] 0.1128ms 45.7322μs 21.8664 KOps/s 22.8252 KOps/s $\color{#d91a1a}-4.20\%$
test_compile_add_self_flat[pytree-eager] 0.6301ms 0.1620ms 6.1717 KOps/s 6.2587 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_add_self_flat[pytree-compile] 0.2298ms 0.1031ms 9.6988 KOps/s 9.6137 KOps/s $\color{#35bf28}+0.89\%$
test_compile_copy_flat[tensordict-compile] 66.9150μs 22.2955μs 44.8520 KOps/s 47.5053 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_compile_copy_flat[tensordict-eager] 0.2043ms 68.8575μs 14.5227 KOps/s 14.9915 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_copy_flat[pytree-compile] 0.1957ms 81.6787μs 12.2431 KOps/s 12.0410 KOps/s $\color{#35bf28}+1.68\%$
test_compile_copy_flat[pytree-eager] 0.1370ms 69.4645μs 14.3958 KOps/s 14.2969 KOps/s $\color{#35bf28}+0.69\%$
test_compile_assign_and_add[tensordict-compile] 0.2883ms 0.2087ms 4.7925 KOps/s 4.8244 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_assign_and_add[tensordict-eager] 2.0821ms 1.3550ms 738.0037 Ops/s 750.6545 Ops/s $\color{#d91a1a}-1.69\%$
test_compile_assign_and_add[pytree-compile] 0.2891ms 0.2058ms 4.8591 KOps/s 4.9480 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_assign_and_add[pytree-eager] 1.6502ms 0.7766ms 1.2876 KOps/s 1.2715 KOps/s $\color{#35bf28}+1.27\%$
test_compile_assign_and_add_stack[compile] 0.5627ms 0.4579ms 2.1837 KOps/s 2.1837 KOps/s $+0.00\%$
test_compile_assign_and_add_stack[eager] 4.2526ms 2.7894ms 358.5015 Ops/s 378.2050 Ops/s $\textbf{\color{#d91a1a}-5.21\%}$
test_compile_indexing[tensor-tensordict-compile] 98.8440μs 36.3039μs 27.5453 KOps/s 28.7561 KOps/s $\color{#d91a1a}-4.21\%$
test_compile_indexing[tensor-tensordict-eager] 0.6002ms 34.9605μs 28.6037 KOps/s 30.8598 KOps/s $\textbf{\color{#d91a1a}-7.31\%}$
test_compile_indexing[tensor-tensorclass-compile] 78.1850μs 30.1913μs 33.1222 KOps/s 34.0076 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1030ms 24.1373μs 41.4296 KOps/s 41.6509 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[tensor-pytree-compile] 90.9490μs 30.2183μs 33.0926 KOps/s 34.0837 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_indexing[tensor-pytree-eager] 96.2210μs 23.9025μs 41.8366 KOps/s 42.3447 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_indexing[slice-tensordict-compile] 0.1352ms 51.2493μs 19.5124 KOps/s 19.8167 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_indexing[slice-tensordict-eager] 0.6104ms 21.0151μs 47.5848 KOps/s 48.7159 KOps/s $\color{#d91a1a}-2.32\%$
test_compile_indexing[slice-tensorclass-compile] 93.9350μs 44.4189μs 22.5129 KOps/s 22.6950 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[slice-tensorclass-eager] 59.3410μs 19.1258μs 52.2855 KOps/s 50.7186 KOps/s $\color{#35bf28}+3.09\%$
test_compile_indexing[slice-pytree-compile] 0.1459ms 45.5856μs 21.9367 KOps/s 22.1669 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_indexing[slice-pytree-eager] 0.1235ms 18.7764μs 53.2582 KOps/s 51.1069 KOps/s $\color{#35bf28}+4.21\%$
test_compile_indexing[int-tensordict-compile] 0.1045ms 51.8588μs 19.2831 KOps/s 19.1488 KOps/s $\color{#35bf28}+0.70\%$
test_compile_indexing[int-tensordict-eager] 0.9769ms 21.1071μs 47.3774 KOps/s 48.8763 KOps/s $\color{#d91a1a}-3.07\%$
test_compile_indexing[int-tensorclass-compile] 0.1126ms 44.8966μs 22.2734 KOps/s 22.4658 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_indexing[int-tensorclass-eager] 85.4990μs 19.2930μs 51.8323 KOps/s 50.9350 KOps/s $\color{#35bf28}+1.76\%$
test_compile_indexing[int-pytree-compile] 0.1315ms 45.1238μs 22.1612 KOps/s 22.2023 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[int-pytree-eager] 0.1073ms 20.0607μs 49.8488 KOps/s 52.1319 KOps/s $\color{#d91a1a}-4.38\%$
test_mod_add[eager] 80.2490μs 34.0061μs 29.4065 KOps/s 28.3726 KOps/s $\color{#35bf28}+3.64\%$
test_mod_add[compile] 0.1449ms 48.3903μs 20.6653 KOps/s 20.8333 KOps/s $\color{#d91a1a}-0.81\%$
test_mod_add[compile-overhead] 0.1086ms 46.0088μs 21.7350 KOps/s 20.9183 KOps/s $\color{#35bf28}+3.90\%$
test_mod_wrap[eager] 0.4541ms 0.2186ms 4.5750 KOps/s 4.4978 KOps/s $\color{#35bf28}+1.72\%$
test_mod_wrap[compile] 0.4008ms 0.2051ms 4.8756 KOps/s 4.7680 KOps/s $\color{#35bf28}+2.26\%$
test_mod_wrap[compile-overhead] 0.3149ms 0.2027ms 4.9331 KOps/s 4.7735 KOps/s $\color{#35bf28}+3.34\%$
test_mod_wrap_and_backward[eager] 12.3410ms 11.1016ms 90.0773 Ops/s 81.6655 Ops/s $\textbf{\color{#35bf28}+10.30\%}$
test_mod_wrap_and_backward[compile] 17.8757ms 12.2327ms 81.7478 Ops/s 74.2310 Ops/s $\textbf{\color{#35bf28}+10.13\%}$
test_mod_wrap_and_backward[compile-overhead] 15.1381ms 12.7876ms 78.2006 Ops/s 74.3027 Ops/s $\textbf{\color{#35bf28}+5.25\%}$
test_seq_add[eager] 0.2917ms 0.1164ms 8.5883 KOps/s 8.5504 KOps/s $\color{#35bf28}+0.44\%$
test_seq_add[compile] 0.1176ms 60.3171μs 16.5791 KOps/s 16.0939 KOps/s $\color{#35bf28}+3.01\%$
test_seq_add[compile-overhead] 0.1245ms 58.9076μs 16.9757 KOps/s 16.5166 KOps/s $\color{#35bf28}+2.78\%$
test_seq_wrap[eager] 0.7390ms 0.4405ms 2.2701 KOps/s 2.2300 KOps/s $\color{#35bf28}+1.79\%$
test_seq_wrap[compile] 0.3427ms 0.2269ms 4.4078 KOps/s 4.3115 KOps/s $\color{#35bf28}+2.23\%$
test_seq_wrap[compile-overhead] 0.4213ms 0.2231ms 4.4826 KOps/s 4.3108 KOps/s $\color{#35bf28}+3.98\%$
test_func_call_runtime[False-eager] 0.6342ms 0.5344ms 1.8711 KOps/s 1.7783 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_func_call_runtime[False-compile] 0.5785ms 0.4266ms 2.3442 KOps/s 2.3244 KOps/s $\color{#35bf28}+0.85\%$
test_func_call_runtime[False-compile-overhead] 0.7918ms 0.4306ms 2.3225 KOps/s 2.3119 KOps/s $\color{#35bf28}+0.46\%$
test_func_call_runtime[True-eager] 0.9442ms 0.7599ms 1.3159 KOps/s 1.2968 KOps/s $\color{#35bf28}+1.47\%$
test_func_call_runtime[True-compile] 0.8340ms 0.4684ms 2.1351 KOps/s 2.1127 KOps/s $\color{#35bf28}+1.06\%$
test_func_call_runtime[True-compile-overhead] 0.5572ms 0.4656ms 2.1479 KOps/s 2.1232 KOps/s $\color{#35bf28}+1.16\%$
test_func_call_cm_runtime[False-eager] 0.6396ms 0.5453ms 1.8339 KOps/s 1.8133 KOps/s $\color{#35bf28}+1.13\%$
test_func_call_cm_runtime[False-compile] 0.5977ms 0.4275ms 2.3393 KOps/s 2.3315 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5473ms 0.4263ms 2.3460 KOps/s 2.3358 KOps/s $\color{#35bf28}+0.44\%$
test_func_call_cm_runtime[True-eager] 1.3126ms 0.9073ms 1.1022 KOps/s 1.0838 KOps/s $\color{#35bf28}+1.70\%$
test_func_call_cm_runtime[True-compile] 0.6414ms 0.4913ms 2.0355 KOps/s 2.0086 KOps/s $\color{#35bf28}+1.34\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8491ms 0.4923ms 2.0312 KOps/s 2.0217 KOps/s $\color{#35bf28}+0.47\%$
test_vmap_func_call_cm_runtime[eager] 2.3166ms 1.8789ms 532.2289 Ops/s 524.6597 Ops/s $\color{#35bf28}+1.44\%$
test_vmap_func_call_cm_runtime[compile] 0.8042ms 0.5127ms 1.9504 KOps/s 1.9231 KOps/s $\color{#35bf28}+1.42\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8691ms 0.5142ms 1.9448 KOps/s 1.9072 KOps/s $\color{#35bf28}+1.97\%$
test_distributed 0.5259ms 0.1233ms 8.1127 KOps/s 7.7868 KOps/s $\color{#35bf28}+4.19\%$
test_tdmodule 85.2290μs 27.5624μs 36.2813 KOps/s 38.8269 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_tdmodule_dispatch 73.0160μs 48.9408μs 20.4329 KOps/s 21.2616 KOps/s $\color{#d91a1a}-3.90\%$
test_tdseq 59.6520μs 29.1530μs 34.3018 KOps/s 35.5446 KOps/s $\color{#d91a1a}-3.50\%$
test_tdseq_dispatch 79.1180μs 54.8063μs 18.2461 KOps/s 18.8018 KOps/s $\color{#d91a1a}-2.96\%$
test_instantiation_functorch 2.4621ms 1.5249ms 655.7642 Ops/s 647.9172 Ops/s $\color{#35bf28}+1.21\%$
test_exec_functorch 0.2748ms 0.1775ms 5.6328 KOps/s 5.5382 KOps/s $\color{#35bf28}+1.71\%$
test_exec_functional_call 0.4291ms 0.1709ms 5.8513 KOps/s 5.8150 KOps/s $\color{#35bf28}+0.62\%$
test_exec_td_decorator 0.4909ms 0.2356ms 4.2446 KOps/s 4.1877 KOps/s $\color{#35bf28}+1.36\%$
test_vmap_mlp_speed_decorator[True-True] 0.9274ms 0.6505ms 1.5372 KOps/s 1.5464 KOps/s $\color{#d91a1a}-0.60\%$
test_vmap_mlp_speed_decorator[True-False] 0.8939ms 0.6451ms 1.5500 KOps/s 1.5500 KOps/s $-0.00\%$
test_vmap_mlp_speed_decorator[False-True] 0.8327ms 0.5216ms 1.9173 KOps/s 1.8922 KOps/s $\color{#35bf28}+1.33\%$
test_vmap_mlp_speed_decorator[False-False] 1.0224ms 0.5213ms 1.9181 KOps/s 1.8894 KOps/s $\color{#35bf28}+1.52\%$
test_to_module_speed[True] 2.6302ms 1.3573ms 736.7480 Ops/s 724.8368 Ops/s $\color{#35bf28}+1.64\%$
test_to_module_speed[False] 2.0140ms 1.3144ms 760.8218 Ops/s 732.4607 Ops/s $\color{#35bf28}+3.87\%$
test_tc_init 0.1100ms 50.1850μs 19.9263 KOps/s 21.4423 KOps/s $\textbf{\color{#d91a1a}-7.07\%}$
test_tc_init_nested 0.2599ms 0.1009ms 9.9060 KOps/s 10.8306 KOps/s $\textbf{\color{#d91a1a}-8.54\%}$
test_tc_first_layer_tensor 19.8470μs 1.5040μs 664.9075 KOps/s 663.0554 KOps/s $\color{#35bf28}+0.28\%$
test_tc_first_layer_nontensor 28.0220μs 4.8059μs 208.0755 KOps/s 209.5486 KOps/s $\color{#d91a1a}-0.70\%$
test_tc_second_layer_tensor 18.6540μs 2.8050μs 356.5031 KOps/s 357.9354 KOps/s $\color{#d91a1a}-0.40\%$
test_tc_second_layer_nontensor 27.3010μs 6.1064μs 163.7626 KOps/s 165.5080 KOps/s $\color{#d91a1a}-1.05\%$
test_unbind 0.2141s 13.2128ms 75.6841 Ops/s 75.6499 Ops/s $\color{#35bf28}+0.05\%$
test_full_like 18.2650ms 12.6459ms 79.0769 Ops/s 76.8459 Ops/s $\color{#35bf28}+2.90\%$
test_zeros_like 13.4227ms 7.4945ms 133.4305 Ops/s 130.7429 Ops/s $\color{#35bf28}+2.06\%$
test_ones_like 15.6661ms 8.0093ms 124.8544 Ops/s 124.6271 Ops/s $\color{#35bf28}+0.18\%$
test_clone 13.7768ms 9.3843ms 106.5612 Ops/s 98.9863 Ops/s $\textbf{\color{#35bf28}+7.65\%}$
test_squeeze 64.6800μs 12.6275μs 79.1925 KOps/s 79.5629 KOps/s $\color{#d91a1a}-0.47\%$
test_unsqueeze 0.1846ms 93.7582μs 10.6657 KOps/s 10.8979 KOps/s $\color{#d91a1a}-2.13\%$
test_split 0.4693ms 0.1993ms 5.0167 KOps/s 5.0773 KOps/s $\color{#d91a1a}-1.19\%$
test_permute 0.3187ms 0.2070ms 4.8314 KOps/s 4.8809 KOps/s $\color{#d91a1a}-1.02\%$
test_stack 29.6065ms 25.6960ms 38.9166 Ops/s 37.6930 Ops/s $\color{#35bf28}+3.25\%$
test_cat 32.1246ms 25.5382ms 39.1571 Ops/s 37.7916 Ops/s $\color{#35bf28}+3.61\%$
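
For anyone reading the table above, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The bolded rows all have a magnitude of at least 5%, which suggests a 5% cutoff for counting a benchmark as improved or worsened. The sketch below reproduces a few rows under those assumptions. The function names and the threshold value are illustrative and are not taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD baseline."""
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred from the bolded rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table, in the units shown there.
rows = {
    "test_plain_set_nested": (46.8744, 49.8174),
    "test_items": (242.9601, 239.2826),
    "test_membership_stacked_nested_last": (72.1301, 227.8883),
    "test_serialize_model": (8.7866, 7.2741),
}

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the computed values agree with the reported Change column to two decimals. The `test_membership_stacked_nested_last` row stands out as the regression most worth investigating.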

@vmoens added the bug label Dec 20, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: 37. Worsened: 19.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 29.1300μs 11.3320μs 88.2454 KOps/s 77.3227 KOps/s $\textbf{\color{#35bf28}+14.13\%}$
test_plain_set_stack_nested 34.6600μs 11.5641μs 86.4748 KOps/s 76.9792 KOps/s $\textbf{\color{#35bf28}+12.34\%}$
test_plain_set_nested_inplace 50.1610μs 12.3492μs 80.9769 KOps/s 71.5199 KOps/s $\textbf{\color{#35bf28}+13.22\%}$
test_plain_set_stack_nested_inplace 36.7600μs 12.4011μs 80.6383 KOps/s 71.4482 KOps/s $\textbf{\color{#35bf28}+12.86\%}$
test_items 31.0510μs 3.0269μs 330.3755 KOps/s 343.8442 KOps/s $\color{#d91a1a}-3.92\%$
test_items_nested 0.4285ms 0.3608ms 2.7713 KOps/s 2.7681 KOps/s $\color{#35bf28}+0.11\%$
test_items_nested_locked 0.4328ms 0.3646ms 2.7430 KOps/s 2.7422 KOps/s $\color{#35bf28}+0.03\%$
test_items_nested_leaf 94.4810μs 58.5305μs 17.0851 KOps/s 16.9333 KOps/s $\color{#35bf28}+0.90\%$
test_items_stack_nested 0.4386ms 0.3637ms 2.7493 KOps/s 2.7490 KOps/s $+0.01\%$
test_items_stack_nested_leaf 0.1058ms 59.9101μs 16.6917 KOps/s 16.7665 KOps/s $\color{#d91a1a}-0.45\%$
test_items_stack_nested_locked 0.4800ms 0.3671ms 2.7238 KOps/s 2.7596 KOps/s $\color{#d91a1a}-1.30\%$
test_keys 50.4010μs 3.4795μs 287.4014 KOps/s 287.6762 KOps/s $\color{#d91a1a}-0.10\%$
test_keys_nested 0.1253ms 81.9007μs 12.2099 KOps/s 12.2341 KOps/s $\color{#d91a1a}-0.20\%$
test_keys_nested_locked 0.6676ms 88.3618μs 11.3171 KOps/s 11.5077 KOps/s $\color{#d91a1a}-1.66\%$
test_keys_nested_leaf 0.1125ms 72.7930μs 13.7376 KOps/s 13.9038 KOps/s $\color{#d91a1a}-1.20\%$
test_keys_stack_nested 0.1238ms 82.3374μs 12.1451 KOps/s 12.3148 KOps/s $\color{#d91a1a}-1.38\%$
test_keys_stack_nested_leaf 0.1091ms 73.2133μs 13.6587 KOps/s 13.8653 KOps/s $\color{#d91a1a}-1.49\%$
test_keys_stack_nested_locked 0.1268ms 87.2929μs 11.4557 KOps/s 11.3985 KOps/s $\color{#35bf28}+0.50\%$
test_values 4.8183μs 0.8532μs 1.1720 MOps/s 1.1688 MOps/s $\color{#35bf28}+0.27\%$
test_values_nested 64.4410μs 34.9259μs 28.6321 KOps/s 29.0765 KOps/s $\color{#d91a1a}-1.53\%$
test_values_nested_locked 63.7910μs 36.5849μs 27.3337 KOps/s 27.4132 KOps/s $\color{#d91a1a}-0.29\%$
test_values_nested_leaf 63.3610μs 39.5881μs 25.2601 KOps/s 25.6103 KOps/s $\color{#d91a1a}-1.37\%$
test_values_stack_nested 68.7710μs 35.1180μs 28.4754 KOps/s 29.1819 KOps/s $\color{#d91a1a}-2.42\%$
test_values_stack_nested_leaf 70.6310μs 39.9432μs 25.0356 KOps/s 25.5580 KOps/s $\color{#d91a1a}-2.04\%$
test_values_stack_nested_locked 70.0410μs 36.8086μs 27.1676 KOps/s 27.5533 KOps/s $\color{#d91a1a}-1.40\%$
test_membership 1.5601μs 0.5009μs 1.9964 MOps/s 1.9806 MOps/s $\color{#35bf28}+0.80\%$
test_membership_nested 16.5405μs 1.9847μs 503.8568 KOps/s 493.0970 KOps/s $\color{#35bf28}+2.18\%$
test_membership_nested_leaf 20.1505μs 1.9891μs 502.7431 KOps/s 495.0701 KOps/s $\color{#35bf28}+1.55\%$
test_membership_stacked_nested 31.4400μs 2.0414μs 489.8537 KOps/s 477.1118 KOps/s $\color{#35bf28}+2.67\%$
test_membership_stacked_nested_leaf 27.7200μs 2.0516μs 487.4239 KOps/s 480.3698 KOps/s $\color{#35bf28}+1.47\%$
test_membership_nested_last 26.7110μs 3.0623μs 326.5563 KOps/s 327.7002 KOps/s $\color{#d91a1a}-0.35\%$
test_membership_nested_leaf_last 28.8210μs 3.0576μs 327.0512 KOps/s 330.4695 KOps/s $\color{#d91a1a}-1.03\%$
test_membership_stacked_nested_last 71.8510μs 5.1761μs 193.1949 KOps/s 322.6710 KOps/s $\textbf{\color{#d91a1a}-40.13\%}$
test_membership_stacked_nested_leaf_last 32.9900μs 5.4736μs 182.6956 KOps/s 325.3755 KOps/s $\textbf{\color{#d91a1a}-43.85\%}$
test_nested_getleaf 34.9710μs 6.1520μs 162.5485 KOps/s 161.3706 KOps/s $\color{#35bf28}+0.73\%$
test_nested_get 31.6410μs 5.7893μs 172.7338 KOps/s 169.2203 KOps/s $\color{#35bf28}+2.08\%$
test_stacked_getleaf 40.0500μs 6.1143μs 163.5505 KOps/s 162.7565 KOps/s $\color{#35bf28}+0.49\%$
test_stacked_get 33.5100μs 5.8131μs 172.0253 KOps/s 168.8158 KOps/s $\color{#35bf28}+1.90\%$
test_nested_getitemleaf 27.3010μs 6.1701μs 162.0722 KOps/s 159.3822 KOps/s $\color{#35bf28}+1.69\%$
test_nested_getitem 32.0100μs 5.9204μs 168.9068 KOps/s 165.3578 KOps/s $\color{#35bf28}+2.15\%$
test_stacked_getitemleaf 35.1900μs 6.2186μs 160.8084 KOps/s 158.1202 KOps/s $\color{#35bf28}+1.70\%$
test_stacked_getitem 29.6400μs 5.9317μs 168.5867 KOps/s 168.6925 KOps/s $\color{#d91a1a}-0.06\%$
test_lock_nested 0.7368ms 0.3878ms 2.5784 KOps/s 2.6584 KOps/s $\color{#d91a1a}-3.01\%$
test_lock_stack_nested 0.4024ms 0.3491ms 2.8641 KOps/s 2.8727 KOps/s $\color{#d91a1a}-0.30\%$
test_unlock_nested 0.7632ms 0.3271ms 3.0570 KOps/s 3.1480 KOps/s $\color{#d91a1a}-2.89\%$
test_unlock_stack_nested 0.3483ms 0.2898ms 3.4503 KOps/s 3.5042 KOps/s $\color{#d91a1a}-1.54\%$
test_flatten_speed 0.1098ms 75.3273μs 13.2754 KOps/s 13.1769 KOps/s $\color{#35bf28}+0.75\%$
test_unflatten_speed 0.4070ms 0.3307ms 3.0242 KOps/s 3.0704 KOps/s $\color{#d91a1a}-1.51\%$
test_common_ops 92.6937ms 0.6665ms 1.5004 KOps/s 1.5839 KOps/s $\textbf{\color{#d91a1a}-5.27\%}$
test_creation 41.8310μs 1.7750μs 563.3747 KOps/s 567.4342 KOps/s $\color{#d91a1a}-0.72\%$
test_creation_empty 36.9000μs 6.4900μs 154.0837 KOps/s 104.2303 KOps/s $\textbf{\color{#35bf28}+47.83\%}$
test_creation_nested_1 35.2210μs 8.0859μs 123.6716 KOps/s 89.1220 KOps/s $\textbf{\color{#35bf28}+38.77\%}$
test_creation_nested_2 38.1610μs 10.7962μs 92.6256 KOps/s 70.9426 KOps/s $\textbf{\color{#35bf28}+30.56\%}$
test_clone 77.1210μs 11.8023μs 84.7290 KOps/s 90.9241 KOps/s $\textbf{\color{#d91a1a}-6.81\%}$
test_getitem[int] 1.4981ms 11.3715μs 87.9394 KOps/s 91.1355 KOps/s $\color{#d91a1a}-3.51\%$
test_getitem[slice_int] 0.1101ms 22.2585μs 44.9267 KOps/s 46.4341 KOps/s $\color{#d91a1a}-3.25\%$
test_getitem[range] 0.1332ms 40.3998μs 24.7526 KOps/s 23.7559 KOps/s $\color{#35bf28}+4.20\%$
test_getitem[tuple] 0.1070ms 19.2773μs 51.8744 KOps/s 53.3353 KOps/s $\color{#d91a1a}-2.74\%$
test_getitem[list] 0.1996ms 35.6147μs 28.0783 KOps/s 27.2186 KOps/s $\color{#35bf28}+3.16\%$
test_setitem_dim[int] 47.1410μs 20.7258μs 48.2491 KOps/s 52.3911 KOps/s $\textbf{\color{#d91a1a}-7.91\%}$
test_setitem_dim[slice_int] 69.0110μs 40.6066μs 24.6265 KOps/s 24.9103 KOps/s $\color{#d91a1a}-1.14\%$
test_setitem_dim[range] 88.6010μs 55.1287μs 18.1394 KOps/s 18.3721 KOps/s $\color{#d91a1a}-1.27\%$
test_setitem_dim[tuple] 62.5900μs 35.2791μs 28.3454 KOps/s 29.9938 KOps/s $\textbf{\color{#d91a1a}-5.50\%}$
test_setitem 73.2110μs 15.6076μs 64.0713 KOps/s 61.3770 KOps/s $\color{#35bf28}+4.39\%$
test_set 84.2610μs 15.0657μs 66.3759 KOps/s 61.0547 KOps/s $\textbf{\color{#35bf28}+8.72\%}$
test_set_shared 1.4826ms 0.1517ms 6.5927 KOps/s 6.6352 KOps/s $\color{#d91a1a}-0.64\%$
test_update 0.4516ms 17.5121μs 57.1034 KOps/s 51.9762 KOps/s $\textbf{\color{#35bf28}+9.86\%}$
test_update_nested 56.2810μs 24.0456μs 41.5876 KOps/s 38.9066 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_update__nested 0.1426ms 30.0307μs 33.2993 KOps/s 37.6295 KOps/s $\textbf{\color{#d91a1a}-11.51\%}$
test_set_nested 77.1910μs 18.0360μs 55.4446 KOps/s 58.5592 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_set_nested_new 83.9910μs 20.8818μs 47.8886 KOps/s 51.9485 KOps/s $\textbf{\color{#d91a1a}-7.82\%}$
test_select 94.4610μs 32.7374μs 30.5461 KOps/s 31.1255 KOps/s $\color{#d91a1a}-1.86\%$
test_select_nested 0.1052ms 44.8288μs 22.3071 KOps/s 22.6170 KOps/s $\color{#d91a1a}-1.37\%$
test_exclude_nested 0.1032ms 64.1379μs 15.5914 KOps/s 15.7336 KOps/s $\color{#d91a1a}-0.90\%$
test_empty[True] 0.3655ms 0.2876ms 3.4766 KOps/s 3.4742 KOps/s $\color{#35bf28}+0.07\%$
test_empty[False] 3.2601μs 0.8245μs 1.2128 MOps/s 1.1948 MOps/s $\color{#35bf28}+1.51\%$
test_to 88.3020μs 57.8166μs 17.2961 KOps/s 17.3867 KOps/s $\color{#d91a1a}-0.52\%$
test_to_nonblocking 92.1610μs 49.0463μs 20.3889 KOps/s 20.5756 KOps/s $\color{#d91a1a}-0.91\%$
test_unbind_speed 1.2878ms 0.2490ms 4.0162 KOps/s 4.1716 KOps/s $\color{#d91a1a}-3.73\%$
test_unbind_speed_stack0 0.3074ms 0.2436ms 4.1048 KOps/s 4.1861 KOps/s $\color{#d91a1a}-1.94\%$
test_unbind_speed_stack1 92.9007ms 0.6808ms 1.4689 KOps/s 1.4454 KOps/s $\color{#35bf28}+1.63\%$
test_split 93.6687ms 1.6750ms 597.0306 Ops/s 622.8061 Ops/s $\color{#d91a1a}-4.14\%$
test_chunk 96.1086ms 1.6819ms 594.5527 Ops/s 570.1777 Ops/s $\color{#35bf28}+4.27\%$
test_consolidate[False-None] 96.2686ms 3.0134ms 331.8543 Ops/s 372.4304 Ops/s $\textbf{\color{#d91a1a}-10.89\%}$
test_consolidate[default-None] 1.9049ms 1.7984ms 556.0595 Ops/s 581.1129 Ops/s $\color{#d91a1a}-4.31\%$
test_consolidate[reduce-overhead-None] 1.9539ms 1.8319ms 545.8926 Ops/s 570.2507 Ops/s $\color{#d91a1a}-4.27\%$
test_consolidate_njt[False-None] 7.1501ms 6.7015ms 149.2206 Ops/s 152.0818 Ops/s $\color{#d91a1a}-1.88\%$
test_to[False-False-None] 1.8696ms 1.7739ms 563.7375 Ops/s 580.3762 Ops/s $\color{#d91a1a}-2.87\%$
test_to[True-False-None] 1.6088ms 1.3863ms 721.3352 Ops/s 740.5785 Ops/s $\color{#d91a1a}-2.60\%$
test_to[within-False-None] 0.2971s 5.5105ms 181.4721 Ops/s 242.4825 Ops/s $\textbf{\color{#d91a1a}-25.16\%}$
test_to[True-default-None] 5.7975ms 5.4437ms 183.6972 Ops/s 183.1065 Ops/s $\color{#35bf28}+0.32\%$
test_to_njt[False-False-None] 7.6893ms 7.3016ms 136.9561 Ops/s 143.3603 Ops/s $\color{#d91a1a}-4.47\%$
test_to_njt[True-False-None] 6.2226ms 5.7580ms 173.6706 Ops/s 178.4755 Ops/s $\color{#d91a1a}-2.69\%$
test_to_njt[within-False-None] 13.4956ms 12.7382ms 78.5037 Ops/s 81.7605 Ops/s $\color{#d91a1a}-3.98\%$
test_creation[device0] 0.5387ms 80.9022μs 12.3606 KOps/s 12.1091 KOps/s $\color{#35bf28}+2.08\%$
test_creation_from_tensor 0.5669ms 83.7497μs 11.9403 KOps/s 11.5002 KOps/s $\color{#35bf28}+3.83\%$
test_add_one[memmap_tensor0] 0.1578ms 7.5456μs 132.5268 KOps/s 142.0836 KOps/s $\textbf{\color{#d91a1a}-6.73\%}$
test_contiguous[memmap_tensor0] 1.7180μs 0.4300μs 2.3254 MOps/s 2.2579 MOps/s $\color{#35bf28}+2.99\%$
test_stack[memmap_tensor0] 38.4900μs 4.8034μs 208.1880 KOps/s 221.0406 KOps/s $\textbf{\color{#d91a1a}-5.81\%}$
test_memmaptd_index 1.8078ms 0.2643ms 3.7831 KOps/s 3.7331 KOps/s $\color{#35bf28}+1.34\%$
test_memmaptd_index_astensor 0.6115ms 0.3260ms 3.0678 KOps/s 3.0351 KOps/s $\color{#35bf28}+1.08\%$
test_memmaptd_index_op 1.0138ms 0.5956ms 1.6790 KOps/s 1.5666 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_serialize_model 0.1318s 0.1311s 7.6281 Ops/s 7.6447 Ops/s $\color{#d91a1a}-0.22\%$
test_serialize_model_pickle 1.3485s 1.1910s 0.8396 Ops/s 0.8224 Ops/s $\color{#35bf28}+2.09\%$
test_serialize_weights 0.1311s 0.1301s 7.6879 Ops/s 7.7259 Ops/s $\color{#d91a1a}-0.49\%$
test_serialize_weights_returnearly 0.3331s 61.7074ms 16.2055 Ops/s 15.2690 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_serialize_weights_pickle 1.3794s 1.2171s 0.8216 Ops/s 0.8198 Ops/s $\color{#35bf28}+0.22\%$
test_reshape_pytree 59.3300μs 22.7734μs 43.9109 KOps/s 43.7187 KOps/s $\color{#35bf28}+0.44\%$
test_reshape_td 53.4500μs 27.5737μs 36.2664 KOps/s 36.1430 KOps/s $\color{#35bf28}+0.34\%$
test_view_pytree 54.1710μs 22.3634μs 44.7160 KOps/s 44.8986 KOps/s $\color{#d91a1a}-0.41\%$
test_view_td 61.1010μs 31.5618μs 31.6838 KOps/s 31.2798 KOps/s $\color{#35bf28}+1.29\%$
test_unbind_pytree 57.6810μs 28.9466μs 34.5463 KOps/s 35.4545 KOps/s $\color{#d91a1a}-2.56\%$
test_unbind_td 0.5981ms 37.5862μs 26.6055 KOps/s 26.7704 KOps/s $\color{#d91a1a}-0.62\%$
test_split_pytree 50.2510μs 30.7302μs 32.5412 KOps/s 32.9877 KOps/s $\color{#d91a1a}-1.35\%$
test_split_td 0.7615ms 40.3434μs 24.7872 KOps/s 25.4150 KOps/s $\color{#d91a1a}-2.47\%$
test_add_pytree 74.9910μs 36.4404μs 27.4420 KOps/s 28.3350 KOps/s $\color{#d91a1a}-3.15\%$
test_add_td 84.9510μs 49.8957μs 20.0418 KOps/s 19.2686 KOps/s $\color{#35bf28}+4.01\%$
test_compile_add_one_nested[tensordict-compile] 0.1722ms 0.1203ms 8.3145 KOps/s 8.1523 KOps/s $\color{#35bf28}+1.99\%$
test_compile_add_one_nested[tensordict-eager] 0.2255ms 0.1294ms 7.7305 KOps/s 7.5907 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_one_nested[pytree-compile] 0.2040ms 97.0014μs 10.3091 KOps/s 10.3255 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_one_nested[pytree-eager] 1.0322ms 0.1514ms 6.6048 KOps/s 6.6768 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_copy_nested[tensordict-compile] 58.7010μs 24.0383μs 41.6003 KOps/s 42.4494 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_copy_nested[tensordict-eager] 55.0710μs 29.5605μs 33.8289 KOps/s 33.4164 KOps/s $\color{#35bf28}+1.23\%$
test_compile_copy_nested[pytree-compile] 0.4096ms 65.3877μs 15.2934 KOps/s 15.2281 KOps/s $\color{#35bf28}+0.43\%$
test_compile_copy_nested[pytree-eager] 79.9610μs 49.1062μs 20.3640 KOps/s 20.1996 KOps/s $\color{#35bf28}+0.81\%$
test_compile_add_one_flat[tensordict-compile] 0.1858ms 0.1419ms 7.0455 KOps/s 6.9542 KOps/s $\color{#35bf28}+1.31\%$
test_compile_add_one_flat[tensordict-eager] 0.3024ms 0.2158ms 4.6333 KOps/s 4.5861 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_one_flat[tensorclass-compile] 0.1376ms 98.3919μs 10.1634 KOps/s 9.5104 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1104ms 51.9217μs 19.2598 KOps/s 18.0740 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_compile_add_one_flat[pytree-compile] 0.2247ms 0.1352ms 7.3958 KOps/s 6.8834 KOps/s $\textbf{\color{#35bf28}+7.44\%}$
test_compile_add_one_flat[pytree-eager] 0.6232ms 0.4920ms 2.0324 KOps/s 2.0277 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_self_flat[tensordict-eager] 0.3922ms 0.2573ms 3.8861 KOps/s 3.7563 KOps/s $\color{#35bf28}+3.46\%$
test_compile_add_self_flat[tensordict-compile] 0.1839ms 0.1416ms 7.0615 KOps/s 6.6336 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1427ms 64.4007μs 15.5278 KOps/s 14.4023 KOps/s $\textbf{\color{#35bf28}+7.81\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2454ms 98.7641μs 10.1251 KOps/s 9.6106 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_compile_add_self_flat[pytree-eager] 0.4759ms 0.4153ms 2.4080 KOps/s 2.4286 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_add_self_flat[pytree-compile] 0.1737ms 0.1344ms 7.4385 KOps/s 7.2858 KOps/s $\color{#35bf28}+2.10\%$
test_compile_copy_flat[tensordict-compile] 49.7500μs 19.0732μs 52.4296 KOps/s 56.4573 KOps/s $\textbf{\color{#d91a1a}-7.13\%}$
test_compile_copy_flat[tensordict-eager] 64.8210μs 31.2363μs 32.0140 KOps/s 31.9721 KOps/s $\color{#35bf28}+0.13\%$
test_compile_copy_flat[pytree-compile] 0.1721ms 72.1503μs 13.8600 KOps/s 14.0405 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_copy_flat[pytree-eager] 80.8410μs 52.0282μs 19.2203 KOps/s 19.1618 KOps/s $\color{#35bf28}+0.31\%$
test_compile_assign_and_add[tensordict-compile] 1.7098ms 0.4132ms 2.4200 KOps/s 2.2195 KOps/s $\textbf{\color{#35bf28}+9.03\%}$
test_compile_assign_and_add[tensordict-eager] 2.8297ms 2.7156ms 368.2404 Ops/s 389.0587 Ops/s $\textbf{\color{#d91a1a}-5.35\%}$
test_compile_assign_and_add[pytree-compile] 1.6267ms 0.3879ms 2.5779 KOps/s 2.2545 KOps/s $\textbf{\color{#35bf28}+14.35\%}$
test_compile_assign_and_add[pytree-eager] 2.9259ms 2.7292ms 366.4126 Ops/s 379.1092 Ops/s $\color{#d91a1a}-3.35\%$
test_compile_indexing[tensor-tensordict-compile] 0.1690ms 0.1145ms 8.7349 KOps/s 8.3787 KOps/s $\color{#35bf28}+4.25\%$
test_compile_indexing[tensor-tensordict-eager] 0.5711ms 80.8013μs 12.3760 KOps/s 11.9996 KOps/s $\color{#35bf28}+3.14\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5132ms 0.1068ms 9.3667 KOps/s 8.8303 KOps/s $\textbf{\color{#35bf28}+6.07\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.5017ms 68.9682μs 14.4994 KOps/s 13.7488 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_compile_indexing[tensor-pytree-compile] 0.5016ms 0.1074ms 9.3082 KOps/s 8.6706 KOps/s $\textbf{\color{#35bf28}+7.35\%}$
test_compile_indexing[tensor-pytree-eager] 0.4852ms 69.1518μs 14.4609 KOps/s 13.5801 KOps/s $\textbf{\color{#35bf28}+6.49\%}$
test_compile_indexing[slice-tensordict-compile] 0.5380ms 0.1015ms 9.8516 KOps/s 9.7704 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[slice-tensordict-eager] 0.3988ms 18.2817μs 54.6995 KOps/s 54.2148 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[slice-tensorclass-compile] 0.1448ms 97.6410μs 10.2416 KOps/s 10.2621 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensorclass-eager] 0.4121ms 16.4865μs 60.6558 KOps/s 62.7355 KOps/s $\color{#d91a1a}-3.32\%$
test_compile_indexing[slice-pytree-compile] 0.1459ms 98.5976μs 10.1422 KOps/s 10.2106 KOps/s $\color{#d91a1a}-0.67\%$
test_compile_indexing[slice-pytree-eager] 0.4029ms 16.4532μs 60.7785 KOps/s 62.5451 KOps/s $\color{#d91a1a}-2.82\%$
test_compile_indexing[int-tensordict-compile] 0.5281ms 0.1025ms 9.7544 KOps/s 9.7736 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[int-tensordict-eager] 0.5715ms 17.9285μs 55.7772 KOps/s 53.4385 KOps/s $\color{#35bf28}+4.38\%$
test_compile_indexing[int-tensorclass-compile] 0.4789ms 98.1179μs 10.1918 KOps/s 9.9429 KOps/s $\color{#35bf28}+2.50\%$
test_compile_indexing[int-tensorclass-eager] 49.5900μs 16.5403μs 60.4583 KOps/s 62.7027 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_indexing[int-pytree-compile] 0.4979ms 98.9629μs 10.1048 KOps/s 10.2423 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_indexing[int-pytree-eager] 0.3923ms 18.0923μs 55.2720 KOps/s 63.5479 KOps/s $\textbf{\color{#d91a1a}-13.02\%}$
test_mod_add[eager] 0.4346ms 36.4460μs 27.4378 KOps/s 24.5984 KOps/s $\textbf{\color{#35bf28}+11.54\%}$
test_mod_add[compile] 0.1361ms 79.6201μs 12.5596 KOps/s 12.2752 KOps/s $\color{#35bf28}+2.32\%$
test_mod_add[compile-overhead] 0.3269ms 0.1695ms 5.8997 KOps/s 5.3802 KOps/s $\textbf{\color{#35bf28}+9.66\%}$
test_mod_wrap[eager] 0.6521ms 0.2511ms 3.9817 KOps/s 3.7211 KOps/s $\textbf{\color{#35bf28}+7.00\%}$
test_mod_wrap[compile] 0.3577ms 0.2951ms 3.3882 KOps/s 3.4392 KOps/s $\color{#d91a1a}-1.48\%$
test_mod_wrap[compile-overhead] 7.3170ms 3.7619ms 265.8232 Ops/s 269.3622 Ops/s $\color{#d91a1a}-1.31\%$
test_mod_wrap_and_backward[eager] 1.5133ms 1.3955ms 716.5991 Ops/s 679.9638 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_mod_wrap_and_backward[compile] 1.4014ms 1.2851ms 778.1351 Ops/s 726.2019 Ops/s $\textbf{\color{#35bf28}+7.15\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3980ms 0.9426ms 1.0609 KOps/s 943.8862 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_seq_add[eager] 0.1846ms 0.1177ms 8.4971 KOps/s 8.1137 KOps/s $\color{#35bf28}+4.73\%$
test_seq_add[compile] 0.2263ms 89.3349μs 11.1938 KOps/s 11.2853 KOps/s $\color{#d91a1a}-0.81\%$
test_seq_add[compile-overhead] 0.2001ms 0.1315ms 7.6027 KOps/s 7.7911 KOps/s $\color{#d91a1a}-2.42\%$
test_seq_wrap[eager] 0.5026ms 0.4215ms 2.3727 KOps/s 2.3267 KOps/s $\color{#35bf28}+1.98\%$
test_seq_wrap[compile] 0.4481ms 0.3116ms 3.2095 KOps/s 3.2815 KOps/s $\color{#d91a1a}-2.20\%$
test_seq_wrap[compile-overhead] 0.2939ms 0.2266ms 4.4138 KOps/s 4.4234 KOps/s $\color{#d91a1a}-0.22\%$
test_func_call_runtime[False-eager] 0.8239ms 0.7466ms 1.3394 KOps/s 1.3344 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_runtime[False-compile] 0.8649ms 0.7594ms 1.3168 KOps/s 1.3271 KOps/s $\color{#d91a1a}-0.78\%$
test_func_call_runtime[False-compile-overhead] 0.4677ms 0.3710ms 2.6956 KOps/s 2.7116 KOps/s $\color{#d91a1a}-0.59\%$
test_func_call_runtime[True-eager] 0.9839ms 0.9071ms 1.1024 KOps/s 1.0817 KOps/s $\color{#35bf28}+1.92\%$
test_func_call_runtime[True-compile] 1.0311ms 0.7810ms 1.2803 KOps/s 1.2989 KOps/s $\color{#d91a1a}-1.43\%$
test_func_call_runtime[True-compile-overhead] 0.4483ms 0.3925ms 2.5480 KOps/s 2.5588 KOps/s $\color{#d91a1a}-0.42\%$
test_func_call_cm_runtime[False-eager] 0.8131ms 0.7349ms 1.3608 KOps/s 1.3286 KOps/s $\color{#35bf28}+2.43\%$
test_func_call_cm_runtime[False-compile] 0.8345ms 0.7657ms 1.3061 KOps/s 1.3269 KOps/s $\color{#d91a1a}-1.57\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4276ms 0.3742ms 2.6726 KOps/s 2.6930 KOps/s $\color{#d91a1a}-0.76\%$
test_func_call_cm_runtime[True-eager] 1.1230ms 1.0120ms 988.1021 Ops/s 986.7000 Ops/s $\color{#35bf28}+0.14\%$
test_func_call_cm_runtime[True-compile] 0.8803ms 0.8067ms 1.2396 KOps/s 1.2540 KOps/s $\color{#d91a1a}-1.15\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4727ms 0.4183ms 2.3905 KOps/s 2.4012 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_func_call_cm_runtime[eager] 2.5530ms 2.1005ms 476.0852 Ops/s 473.7864 Ops/s $\color{#35bf28}+0.49\%$
test_vmap_func_call_cm_runtime[compile] 0.8953ms 0.8249ms 1.2123 KOps/s 1.2165 KOps/s $\color{#d91a1a}-0.35\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5000ms 0.4208ms 2.3766 KOps/s 2.3860 KOps/s $\color{#d91a1a}-0.40\%$
test_distributed 5.1120ms 0.1987ms 5.0324 KOps/s 8.4539 KOps/s $\textbf{\color{#d91a1a}-40.47\%}$
test_tdmodule 0.1164ms 18.9894μs 52.6609 KOps/s 48.5413 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_tdmodule_dispatch 72.6010μs 33.4604μs 29.8861 KOps/s 26.8452 KOps/s $\textbf{\color{#35bf28}+11.33\%}$
test_tdseq 52.7900μs 19.2790μs 51.8699 KOps/s 45.7560 KOps/s $\textbf{\color{#35bf28}+13.36\%}$
test_tdseq_dispatch 68.2210μs 36.4621μs 27.4257 KOps/s 24.5732 KOps/s $\textbf{\color{#35bf28}+11.61\%}$
test_instantiation_functorch 1.7442ms 1.6174ms 618.2754 Ops/s 629.4560 Ops/s $\color{#d91a1a}-1.78\%$
test_exec_functorch 0.2038ms 0.1510ms 6.6220 KOps/s 6.7018 KOps/s $\color{#d91a1a}-1.19\%$
test_exec_functional_call 0.2065ms 0.1465ms 6.8273 KOps/s 7.0389 KOps/s $\color{#d91a1a}-3.01\%$
test_exec_td_decorator 0.3867ms 0.1933ms 5.1737 KOps/s 5.2208 KOps/s $\color{#d91a1a}-0.90\%$
test_vmap_mlp_speed_decorator[True-True] 0.7920ms 0.6940ms 1.4409 KOps/s 1.4455 KOps/s $\color{#d91a1a}-0.32\%$
test_vmap_mlp_speed_decorator[True-False] 1.2109ms 0.7054ms 1.4176 KOps/s 1.4408 KOps/s $\color{#d91a1a}-1.61\%$
test_vmap_mlp_speed_decorator[False-True] 0.7154ms 0.5995ms 1.6681 KOps/s 1.6508 KOps/s $\color{#35bf28}+1.05\%$
test_vmap_mlp_speed_decorator[False-False] 0.7419ms 0.6161ms 1.6230 KOps/s 1.6524 KOps/s $\color{#d91a1a}-1.78\%$
test_vmap_transformer_speed_decorator[True-True] 20.2150ms 19.4185ms 51.4972 Ops/s 51.5886 Ops/s $\color{#d91a1a}-0.18\%$
test_vmap_transformer_speed_decorator[True-False] 19.9519ms 19.4012ms 51.5433 Ops/s 51.7415 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_transformer_speed_decorator[False-True] 19.6037ms 19.3311ms 51.7300 Ops/s 52.1010 Ops/s $\color{#d91a1a}-0.71\%$
test_vmap_transformer_speed_decorator[False-False] 20.0652ms 19.3444ms 51.6945 Ops/s 52.1088 Ops/s $\color{#d91a1a}-0.79\%$
test_to_module_speed[True] 2.2485ms 0.9839ms 1.0163 KOps/s 1.0259 KOps/s $\color{#d91a1a}-0.93\%$
test_to_module_speed[False] 1.0485ms 0.9719ms 1.0289 KOps/s 1.0435 KOps/s $\color{#d91a1a}-1.39\%$
test_tc_init 63.6610μs 35.0377μs 28.5407 KOps/s 24.9851 KOps/s $\textbf{\color{#35bf28}+14.23\%}$
test_tc_init_nested 99.9420μs 69.9603μs 14.2938 KOps/s 12.3082 KOps/s $\textbf{\color{#35bf28}+16.13\%}$
test_tc_first_layer_tensor 24.3610μs 0.8134μs 1.2294 MOps/s 1.3825 MOps/s $\textbf{\color{#d91a1a}-11.07\%}$
test_tc_first_layer_nontensor 0.1235ms 2.3616μs 423.4462 KOps/s 421.6534 KOps/s $\color{#35bf28}+0.43\%$
test_tc_second_layer_tensor 11.1900μs 1.4452μs 691.9628 KOps/s 675.2856 KOps/s $\color{#35bf28}+2.47\%$
test_tc_second_layer_nontensor 29.4000μs 3.1074μs 321.8120 KOps/s 316.4510 KOps/s $\color{#35bf28}+1.69\%$
test_unbind 0.2171s 12.5049ms 79.9687 Ops/s 138.7363 Ops/s $\textbf{\color{#d91a1a}-42.36\%}$
test_full_like 10.3403ms 9.1250ms 109.5894 Ops/s 106.6941 Ops/s $\color{#35bf28}+2.71\%$
test_zeros_like 4.9093ms 4.2532ms 235.1162 Ops/s 235.6532 Ops/s $\color{#d91a1a}-0.23\%$
test_ones_like 5.5125ms 4.3185ms 231.5645 Ops/s 236.3003 Ops/s $\color{#d91a1a}-2.00\%$
test_clone 6.7610ms 6.3699ms 156.9878 Ops/s 157.5118 Ops/s $\color{#d91a1a}-0.33\%$
test_squeeze 60.8000μs 9.6295μs 103.8481 KOps/s 102.9121 KOps/s $\color{#35bf28}+0.91\%$
test_unsqueeze 0.4651ms 74.7656μs 13.3751 KOps/s 13.4164 KOps/s $\color{#d91a1a}-0.31\%$
test_split 0.3894ms 0.1613ms 6.1997 KOps/s 5.9186 KOps/s $\color{#35bf28}+4.75\%$
test_permute 0.5679ms 0.1757ms 5.6902 KOps/s 5.3391 KOps/s $\textbf{\color{#35bf28}+6.58\%}$
test_stack 51.1117ms 50.6642ms 19.7378 Ops/s 19.7201 Ops/s $\color{#35bf28}+0.09\%$
test_cat 50.7858ms 50.5087ms 19.7986 Ops/s 19.7454 Ops/s $\color{#35bf28}+0.27\%$
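
For reading the table above: the Change column appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns, and rows seem to be bolded once the magnitude passes roughly 5%. The sketch below reproduces that arithmetic. The helper names and the 5% threshold are assumptions inferred from the rows, not taken from the benchmark tooling itself.

```python
# Hypothetical helpers; names and threshold are illustrative assumptions.

def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Both values must be in the same unit (e.g. convert KOps/s to Ops/s first).
    """
    return (ops - ops_head) / ops_head * 100.0


def is_highlighted(change_pct: float, threshold: float = 5.0) -> bool:
    """Assumed rule: a row is emphasized when |change| exceeds the threshold."""
    return abs(change_pct) > threshold


# test_distributed: 5.0324 KOps/s on the PR vs 8.4539 KOps/s on HEAD
distributed = relative_change(5.0324, 8.4539)
print(f"{distributed:.2f}%", is_highlighted(distributed))

# test_mod_wrap_and_backward[compile-overhead]: mixed units, so normalize to Ops/s
overhead = relative_change(1.0609e3, 943.8862)
print(f"{overhead:.2f}%", is_highlighted(overhead))
```

Large swings on a single run, such as test_distributed or test_unbind, may reflect noise (note their high Max timings) rather than a real regression, so the computed change is best treated as a prompt to rerun rather than as a verdict.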

Labels: bug, CLA Signed