[Performance] Make _to_consolidated compatible with compile #1041
Open: vmoens wants to merge 42 commits into gh/vmoens/30/base from gh/vmoens/30/head
vmoens added a commit that referenced this pull request on Oct 14, 2024:
ghstack-source-id: 17f1ce893b6f14b990b59703447301494b5f7585
Pull Request resolved: #1041
facebook-github-bot added the CLA Signed label on Oct 14, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 52.9890μs | 25.4551μs | 39.2848 KOps/s | 43.0610 KOps/s | |
test_plain_set_stack_nested | 64.1190μs | 25.7050μs | 38.9029 KOps/s | 42.2214 KOps/s | |
test_plain_set_nested_inplace | 63.6190μs | 27.9470μs | 35.7821 KOps/s | 38.1507 KOps/s | |
test_plain_set_stack_nested_inplace | 79.0580μs | 27.8434μs | 35.9152 KOps/s | 38.8198 KOps/s | |
test_items | 32.0100μs | 4.2019μs | 237.9855 KOps/s | 230.9299 KOps/s | |
test_items_nested | 0.6203ms | 0.3820ms | 2.6175 KOps/s | 2.5896 KOps/s | |
test_items_nested_locked | 0.7863ms | 0.3818ms | 2.6192 KOps/s | 2.5625 KOps/s | |
test_items_nested_leaf | 0.1728ms | 82.6565μs | 12.0983 KOps/s | 11.9132 KOps/s | |
test_items_stack_nested | 0.7554ms | 0.3857ms | 2.5926 KOps/s | 2.5378 KOps/s | |
test_items_stack_nested_leaf | 0.1384ms | 86.8283μs | 11.5170 KOps/s | 11.5349 KOps/s | |
test_items_stack_nested_locked | 0.6990ms | 0.3853ms | 2.5956 KOps/s | 2.5657 KOps/s | |
test_keys | 29.7950μs | 3.6251μs | 275.8541 KOps/s | 282.6565 KOps/s | |
test_keys_nested | 0.2512ms | 0.1358ms | 7.3630 KOps/s | 7.3588 KOps/s | |
test_keys_nested_locked | 1.9553ms | 0.1401ms | 7.1396 KOps/s | 7.0699 KOps/s | |
test_keys_nested_leaf | 0.2190ms | 0.1181ms | 8.4708 KOps/s | 8.4154 KOps/s | |
test_keys_stack_nested | 0.2928ms | 0.1352ms | 7.3988 KOps/s | 7.2105 KOps/s | |
test_keys_stack_nested_leaf | 0.1928ms | 0.1170ms | 8.5450 KOps/s | 8.6941 KOps/s | |
test_keys_stack_nested_locked | 0.2701ms | 0.1406ms | 7.1099 KOps/s | 7.2680 KOps/s | |
test_values | 6.2576μs | 1.0841μs | 922.3863 KOps/s | 949.6671 KOps/s | |
test_values_nested | 0.2293ms | 95.4603μs | 10.4756 KOps/s | 10.6316 KOps/s | |
test_values_nested_locked | 0.1780ms | 94.5504μs | 10.5764 KOps/s | 10.6406 KOps/s | |
test_values_nested_leaf | 0.1490ms | 81.1804μs | 12.3182 KOps/s | 12.2465 KOps/s | |
test_values_stack_nested | 0.1953ms | 96.1008μs | 10.4057 KOps/s | 9.9028 KOps/s | |
test_values_stack_nested_leaf | 0.1680ms | 80.2057μs | 12.4679 KOps/s | 12.8044 KOps/s | |
test_values_stack_nested_locked | 0.1409ms | 96.7402μs | 10.3370 KOps/s | 10.2664 KOps/s | |
test_membership | 28.2430μs | 0.8855μs | 1.1293 MOps/s | 1.3661 MOps/s | |
test_membership_nested | 23.2040μs | 2.7531μs | 363.2264 KOps/s | 360.3262 KOps/s | |
test_membership_nested_leaf | 15.3390μs | 2.7468μs | 364.0609 KOps/s | 353.8370 KOps/s | |
test_membership_stacked_nested | 26.9410μs | 2.7066μs | 369.4647 KOps/s | 357.7467 KOps/s | |
test_membership_stacked_nested_leaf | 19.5670μs | 2.7776μs | 360.0264 KOps/s | 355.9627 KOps/s | |
test_membership_nested_last | 25.3570μs | 4.2448μs | 235.5838 KOps/s | 231.6594 KOps/s | |
test_membership_nested_leaf_last | 31.8790μs | 4.2132μs | 237.3482 KOps/s | 231.1251 KOps/s | |
test_membership_stacked_nested_last | 27.1910μs | 4.2493μs | 235.3322 KOps/s | 73.7480 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.7620μs | 4.2692μs | 234.2382 KOps/s | 73.3765 KOps/s | |
test_nested_getleaf | 0.1279ms | 10.5992μs | 94.3467 KOps/s | 93.2931 KOps/s | |
test_nested_get | 0.2423ms | 10.4769μs | 95.4483 KOps/s | 98.0869 KOps/s | |
test_stacked_getleaf | 33.9740μs | 10.6505μs | 93.8922 KOps/s | 93.1653 KOps/s | |
test_stacked_get | 31.1780μs | 10.0926μs | 99.0826 KOps/s | 98.3670 KOps/s | |
test_nested_getitemleaf | 31.3490μs | 11.2517μs | 88.8757 KOps/s | 89.3850 KOps/s | |
test_nested_getitem | 39.1530μs | 10.4481μs | 95.7111 KOps/s | 95.5306 KOps/s | |
test_stacked_getitemleaf | 30.8770μs | 11.1024μs | 90.0709 KOps/s | 89.6051 KOps/s | |
test_stacked_getitem | 30.0160μs | 10.3047μs | 97.0435 KOps/s | 97.1301 KOps/s | |
test_lock_nested | 4.6379ms | 0.5196ms | 1.9247 KOps/s | 1.9596 KOps/s | |
test_lock_stack_nested | 0.9178ms | 0.4758ms | 2.1017 KOps/s | 2.1972 KOps/s | |
test_unlock_nested | 0.8552ms | 0.4266ms | 2.3439 KOps/s | 2.3484 KOps/s | |
test_unlock_stack_nested | 0.7375ms | 0.3897ms | 2.5660 KOps/s | 2.6952 KOps/s | |
test_flatten_speed | 0.2095ms | 0.1030ms | 9.7077 KOps/s | 9.6550 KOps/s | |
test_unflatten_speed | 0.6206ms | 0.5163ms | 1.9369 KOps/s | 1.9681 KOps/s | |
test_common_ops | 6.6789ms | 1.2002ms | 833.2013 Ops/s | 892.6238 Ops/s | |
test_creation | 97.9840μs | 2.0749μs | 481.9473 KOps/s | 479.3950 KOps/s | |
test_creation_empty | 57.3870μs | 19.9241μs | 50.1905 KOps/s | 60.3916 KOps/s | |
test_creation_nested_1 | 58.5490μs | 23.2728μs | 42.9686 KOps/s | 51.1865 KOps/s | |
test_creation_nested_2 | 86.1210μs | 27.3477μs | 36.5662 KOps/s | 41.8553 KOps/s | |
test_clone | 0.1188ms | 17.5496μs | 56.9814 KOps/s | 59.1320 KOps/s | |
test_getitem[int] | 0.9578ms | 16.5803μs | 60.3127 KOps/s | 60.3631 KOps/s | |
test_getitem[slice_int] | 0.1373ms | 29.8050μs | 33.5514 KOps/s | 31.6062 KOps/s | |
test_getitem[range] | 0.2985ms | 58.4862μs | 17.0981 KOps/s | 16.9788 KOps/s | |
test_getitem[tuple] | 0.1340ms | 24.8855μs | 40.1841 KOps/s | 39.4308 KOps/s | |
test_getitem[list] | 0.2528ms | 53.5690μs | 18.6675 KOps/s | 18.4028 KOps/s | |
test_setitem_dim[int] | 66.1040μs | 33.8269μs | 29.5623 KOps/s | 29.7367 KOps/s | |
test_setitem_dim[slice_int] | 0.1082ms | 62.9000μs | 15.8983 KOps/s | 15.9641 KOps/s | |
test_setitem_dim[range] | 0.1500ms | 85.6303μs | 11.6781 KOps/s | 11.5805 KOps/s | |
test_setitem_dim[tuple] | 80.2200μs | 50.6299μs | 19.7512 KOps/s | 19.7320 KOps/s | |
test_setitem | 0.1762ms | 31.7103μs | 31.5355 KOps/s | 34.1973 KOps/s | |
test_set | 0.1923ms | 30.5383μs | 32.7458 KOps/s | 35.5979 KOps/s | |
test_set_shared | 3.6232ms | 0.2257ms | 4.4310 KOps/s | 4.5087 KOps/s | |
test_update | 0.1548ms | 40.1967μs | 24.8777 KOps/s | 27.5744 KOps/s | |
test_update_nested | 0.2451ms | 51.9502μs | 19.2492 KOps/s | 21.0201 KOps/s | |
test_update__nested | 0.4207ms | 45.3962μs | 22.0283 KOps/s | 22.2564 KOps/s | |
test_set_nested | 0.1048ms | 33.9473μs | 29.4574 KOps/s | 31.6409 KOps/s | |
test_set_nested_new | 0.1125ms | 38.5631μs | 25.9315 KOps/s | 26.7323 KOps/s | |
test_select | 0.1152ms | 56.8755μs | 17.5823 KOps/s | 18.0722 KOps/s | |
test_select_nested | 0.1416ms | 60.1958μs | 16.6125 KOps/s | 16.1821 KOps/s | |
test_exclude_nested | 0.1438ms | 75.4701μs | 13.2503 KOps/s | 12.9190 KOps/s | |
test_empty[True] | 0.5555ms | 0.3516ms | 2.8439 KOps/s | 2.7838 KOps/s | |
test_empty[False] | 10.0915μs | 1.2514μs | 799.0865 KOps/s | 814.2432 KOps/s | |
test_unbind_speed | 0.3497ms | 0.3057ms | 3.2712 KOps/s | 3.2557 KOps/s | |
test_unbind_speed_stack0 | 0.4938ms | 0.2961ms | 3.3774 KOps/s | 3.4515 KOps/s | |
test_unbind_speed_stack1 | 0.1032s | 0.7994ms | 1.2509 KOps/s | 1.3705 KOps/s | |
test_split | 2.3687ms | 1.9916ms | 502.1177 Ops/s | 454.3050 Ops/s | |
test_chunk | 0.1078s | 2.2145ms | 451.5608 Ops/s | 504.8899 Ops/s | |
test_creation[device0] | 0.2302ms | 0.1169ms | 8.5549 KOps/s | 8.5164 KOps/s | |
test_creation_from_tensor | 3.5243ms | 0.1177ms | 8.4937 KOps/s | 8.4395 KOps/s | |
test_add_one[memmap_tensor0] | 0.1133ms | 7.1784μs | 139.3072 KOps/s | 139.1350 KOps/s | |
test_contiguous[memmap_tensor0] | 20.8190μs | 1.8797μs | 531.9881 KOps/s | 525.3443 KOps/s | |
test_stack[memmap_tensor0] | 76.4230μs | 5.5557μs | 179.9966 KOps/s | 168.0946 KOps/s | |
test_memmaptd_index | 1.0752ms | 0.4161ms | 2.4031 KOps/s | 2.4335 KOps/s | |
test_memmaptd_index_astensor | 0.7624ms | 0.5195ms | 1.9250 KOps/s | 1.9382 KOps/s | |
test_memmaptd_index_op | 1.6690ms | 1.0853ms | 921.4402 Ops/s | 980.2310 Ops/s | |
test_serialize_model | 0.2284s | 0.1365s | 7.3275 Ops/s | 8.5512 Ops/s | |
test_serialize_model_pickle | 0.4952s | 0.4122s | 2.4260 Ops/s | 2.5135 Ops/s | |
test_serialize_weights | 0.1350s | 0.1201s | 8.3269 Ops/s | 7.6515 Ops/s | |
test_serialize_weights_returnearly | 0.1681s | 0.1609s | 6.2166 Ops/s | 6.2039 Ops/s | |
test_serialize_weights_pickle | 1.2053s | 0.7490s | 1.3352 Ops/s | 2.5687 Ops/s | |
test_serialize_weights_filesystem | 0.1518s | 0.1417s | 7.0585 Ops/s | 6.9141 Ops/s | |
test_serialize_model_filesystem | 0.1466s | 0.1433s | 6.9769 Ops/s | 6.7165 Ops/s | |
test_reshape_pytree | 86.0510μs | 39.4214μs | 25.3669 KOps/s | 25.2300 KOps/s | |
test_reshape_td | 96.7900μs | 45.1304μs | 22.1580 KOps/s | 21.1135 KOps/s | |
test_view_pytree | 83.6360μs | 38.9157μs | 25.6965 KOps/s | 25.2705 KOps/s | |
test_view_td | 0.1362ms | 51.8381μs | 19.2908 KOps/s | 19.3033 KOps/s | |
test_unbind_pytree | 80.9420μs | 36.2139μs | 27.6137 KOps/s | 27.1661 KOps/s | |
test_unbind_td | 0.2964ms | 45.7174μs | 21.8735 KOps/s | 17.9039 KOps/s | |
test_split_pytree | 84.1370μs | 38.7480μs | 25.8078 KOps/s | 25.5755 KOps/s | |
test_split_td | 0.4570ms | 57.2848μs | 17.4567 KOps/s | 17.3646 KOps/s | |
test_add_pytree | 0.1182ms | 45.1462μs | 22.1503 KOps/s | 22.1037 KOps/s | |
test_add_td | 0.1885ms | 89.5225μs | 11.1704 KOps/s | 12.2198 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1670ms | 74.0593μs | 13.5027 KOps/s | 13.6457 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3730ms | 0.2045ms | 4.8901 KOps/s | 4.9590 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1325ms | 56.3652μs | 17.7415 KOps/s | 18.2474 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2889ms | 0.1477ms | 6.7690 KOps/s | 6.8350 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 71.0830μs | 27.6165μs | 36.2102 KOps/s | 36.9005 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1473ms | 79.0879μs | 12.6442 KOps/s | 12.7034 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1790ms | 78.6865μs | 12.7087 KOps/s | 12.6556 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1389ms | 67.0979μs | 14.9036 KOps/s | 14.5875 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2579ms | 0.1244ms | 8.0392 KOps/s | 7.9395 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4298ms | 0.2507ms | 3.9892 KOps/s | 3.9946 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1451ms | 55.5855μs | 17.9903 KOps/s | 17.9345 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1751ms | 78.6919μs | 12.7078 KOps/s | 12.2309 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1857ms | 0.1153ms | 8.6703 KOps/s | 8.7934 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4852ms | 0.3061ms | 3.2674 KOps/s | 3.2565 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6363ms | 0.2836ms | 3.5267 KOps/s | 3.5580 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2654ms | 0.1235ms | 8.0954 KOps/s | 8.0827 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2412ms | 73.0318μs | 13.6927 KOps/s | 13.3475 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1387ms | 56.3820μs | 17.7362 KOps/s | 17.8080 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4815ms | 0.2465ms | 4.0560 KOps/s | 4.0527 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2430ms | 0.1118ms | 8.9471 KOps/s | 8.7805 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 83.1630μs | 29.4855μs | 33.9150 KOps/s | 33.2045 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1684ms | 83.2902μs | 12.0062 KOps/s | 12.4019 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1635ms | 80.8955μs | 12.3616 KOps/s | 12.1202 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1323ms | 69.5802μs | 14.3719 KOps/s | 14.4382 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4261ms | 0.2190ms | 4.5662 KOps/s | 4.5498 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.2834ms | 1.8540ms | 539.3653 Ops/s | 554.9571 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3115ms | 0.2126ms | 4.7034 KOps/s | 4.6399 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.6219ms | 1.1942ms | 837.3861 Ops/s | 851.5210 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6542ms | 0.4622ms | 2.1637 KOps/s | 2.1272 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.8347ms | 4.4624ms | 224.0948 Ops/s | 243.0333 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 95.9590μs | 43.1538μs | 23.1729 KOps/s | 22.6740 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4874ms | 50.2664μs | 19.8940 KOps/s | 20.0842 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 95.5590μs | 38.5446μs | 25.9440 KOps/s | 26.3800 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 97.0840μs | 29.5073μs | 33.8899 KOps/s | 34.0102 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 85.6000μs | 37.9790μs | 26.3304 KOps/s | 26.0137 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 97.9310μs | 29.9450μs | 33.3946 KOps/s | 33.3703 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1505ms | 78.5056μs | 12.7379 KOps/s | 12.9107 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5151ms | 28.2646μs | 35.3799 KOps/s | 34.7535 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1674ms | 72.0020μs | 13.8885 KOps/s | 14.0409 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 61.2250μs | 23.3824μs | 42.7672 KOps/s | 41.6669 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1401ms | 71.2528μs | 14.0345 KOps/s | 13.9307 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.2187ms | 23.5380μs | 42.4845 KOps/s | 42.0149 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1697ms | 77.3052μs | 12.9357 KOps/s | 12.6108 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5191ms | 28.5266μs | 35.0550 KOps/s | 34.8049 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1447ms | 71.5766μs | 13.9710 KOps/s | 13.9217 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.0220μs | 23.2405μs | 43.0283 KOps/s | 41.9513 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1362ms | 71.6095μs | 13.9646 KOps/s | 14.0101 KOps/s | |
test_compile_indexing[int-pytree-eager] | 66.5750μs | 23.3639μs | 42.8011 KOps/s | 41.7549 KOps/s | |
test_mod_add[eager] | 82.6450μs | 26.9656μs | 37.0843 KOps/s | 38.8420 KOps/s | |
test_mod_add[compile] | 0.1043ms | 44.1374μs | 22.6565 KOps/s | 21.9624 KOps/s | |
test_mod_add[compile-overhead] | 0.1001ms | 44.0977μs | 22.6769 KOps/s | 21.7096 KOps/s | |
test_mod_wrap[eager] | 0.3789ms | 0.2210ms | 4.5242 KOps/s | 4.6097 KOps/s | |
test_mod_wrap[compile] | 2.0044ms | 0.2039ms | 4.9048 KOps/s | 4.7619 KOps/s | |
test_mod_wrap[compile-overhead] | 1.8991ms | 0.2029ms | 4.9280 KOps/s | 4.8323 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.6530ms | 10.8091ms | 92.5143 Ops/s | 86.3792 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.0138ms | 10.6926ms | 93.5224 Ops/s | 81.2584 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.2951ms | 10.8196ms | 92.4246 Ops/s | 76.5922 Ops/s | |
test_seq_add[eager] | 0.2208ms | 96.4125μs | 10.3721 KOps/s | 10.9276 KOps/s | |
test_seq_add[compile] | 0.1258ms | 59.6282μs | 16.7706 KOps/s | 16.7721 KOps/s | |
test_seq_add[compile-overhead] | 0.1273ms | 58.5751μs | 17.0721 KOps/s | 17.0934 KOps/s | |
test_seq_wrap[eager] | 0.6733ms | 0.4056ms | 2.4656 KOps/s | 2.5685 KOps/s | |
test_seq_wrap[compile] | 0.4500ms | 0.2278ms | 4.3907 KOps/s | 4.3360 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3218ms | 0.2245ms | 4.4541 KOps/s | 4.3882 KOps/s | |
test_func_call_runtime[False-eager] | 1.0335ms | 0.5572ms | 1.7945 KOps/s | 1.7703 KOps/s | |
test_func_call_runtime[False-compile] | 1.6083ms | 0.4364ms | 2.2917 KOps/s | 2.2168 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5181ms | 0.4273ms | 2.3402 KOps/s | 2.3017 KOps/s | |
test_func_call_runtime[True-eager] | 0.9398ms | 0.7651ms | 1.3069 KOps/s | 1.2767 KOps/s | |
test_func_call_runtime[True-compile] | 0.9841ms | 0.4683ms | 2.1353 KOps/s | 2.1164 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5606ms | 0.4656ms | 2.1478 KOps/s | 2.1283 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6531ms | 0.5519ms | 1.8118 KOps/s | 1.7889 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5973ms | 0.4251ms | 2.3525 KOps/s | 2.3069 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5574ms | 0.4267ms | 2.3435 KOps/s | 2.3058 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2816ms | 0.9289ms | 1.0766 KOps/s | 1.0737 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8362ms | 0.4913ms | 2.0352 KOps/s | 1.9983 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6731ms | 0.4881ms | 2.0488 KOps/s | 2.0069 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7054ms | 1.9270ms | 518.9489 Ops/s | 514.1575 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6945ms | 0.5188ms | 1.9274 KOps/s | 1.8844 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7512ms | 0.5185ms | 1.9286 KOps/s | 1.8798 KOps/s | |
test_distributed | 0.2817ms | 0.1269ms | 7.8825 KOps/s | 7.7541 KOps/s | |
test_tdmodule | 0.1333ms | 19.1678μs | 52.1707 KOps/s | 57.2040 KOps/s | |
test_tdmodule_dispatch | 63.9600μs | 38.2920μs | 26.1151 KOps/s | 29.6232 KOps/s | |
test_tdseq | 37.3000μs | 21.8815μs | 45.7008 KOps/s | 50.6235 KOps/s | |
test_tdseq_dispatch | 61.2040μs | 44.2989μs | 22.5739 KOps/s | 25.8461 KOps/s | |
test_instantiation_functorch | 1.7388ms | 1.5368ms | 650.6957 Ops/s | 649.5806 Ops/s | |
test_exec_functorch | 0.3066ms | 0.1810ms | 5.5249 KOps/s | 5.4792 KOps/s | |
test_exec_functional_call | 0.4183ms | 0.1726ms | 5.7925 KOps/s | 5.4282 KOps/s | |
test_exec_td_decorator | 0.4569ms | 0.2337ms | 4.2786 KOps/s | 4.1568 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9507ms | 0.6709ms | 1.4904 KOps/s | 1.5163 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9223ms | 0.6663ms | 1.5007 KOps/s | 1.5174 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9445ms | 0.5425ms | 1.8434 KOps/s | 1.8037 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7868ms | 0.5432ms | 1.8410 KOps/s | 1.8328 KOps/s | |
test_to_module_speed[True] | 1.9633ms | 1.3790ms | 725.1759 Ops/s | 711.8873 Ops/s | |
test_to_module_speed[False] | 1.9415ms | 1.3399ms | 746.3096 Ops/s | 734.9262 Ops/s | |
test_tc_init | 0.1000ms | 48.3553μs | 20.6803 KOps/s | 22.2539 KOps/s | |
test_tc_init_nested | 0.1805ms | 95.9814μs | 10.4187 KOps/s | 11.2295 KOps/s | |
test_tc_first_layer_tensor | 0.1355ms | 1.6835μs | 593.9870 KOps/s | 651.4422 KOps/s | |
test_tc_first_layer_nontensor | 25.1070μs | 4.9033μs | 203.9450 KOps/s | 211.0926 KOps/s | |
test_tc_second_layer_tensor | 16.8820μs | 2.8661μs | 348.9010 KOps/s | 351.9165 KOps/s | |
test_tc_second_layer_nontensor | 24.1650μs | 6.3255μs | 158.0894 KOps/s | 166.8935 KOps/s | |
test_unbind | 0.2231s | 12.6952ms | 78.7701 Ops/s | 78.8189 Ops/s | |
test_full_like | 9.8500ms | 7.4594ms | 134.0592 Ops/s | 121.8992 Ops/s | |
test_zeros_like | 3.3326ms | 2.8501ms | 350.8639 Ops/s | 320.2936 Ops/s | |
test_ones_like | 3.7393ms | 3.3044ms | 302.6235 Ops/s | 263.9422 Ops/s | |
test_clone | 7.3667ms | 5.1677ms | 193.5111 Ops/s | 159.1111 Ops/s | |
test_squeeze | 85.3290μs | 11.7347μs | 85.2170 KOps/s | 78.9099 KOps/s | |
test_unsqueeze | 0.1443ms | 88.0167μs | 11.3615 KOps/s | 10.5272 KOps/s | |
test_split | 0.3640ms | 0.1927ms | 5.1903 KOps/s | 5.0549 KOps/s | |
test_permute | 0.4743ms | 0.2178ms | 4.5912 KOps/s | 4.4366 KOps/s | |
test_stack | 34.1326ms | 25.9528ms | 38.5314 Ops/s | 36.2444 Ops/s | |
test_cat | 31.0961ms | 25.0816ms | 39.8699 Ops/s | 36.5579 Ops/s |
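
The Change column came through empty, so the difference between this PR and repo HEAD has to be read off the two Ops columns by hand. The sketch below shows one way to do that. It is not the benchmark bot's own formula: `parse_ops` and `relative_change` are hypothetical helper names, and the sign convention (positive means the PR is faster than HEAD) is an assumption. The unit handling covers only the three units that appear in the tables (`Ops/s`, `KOps/s`, `MOps/s`).

```python
# Hypothetical helpers for reading the benchmark tables; not part of the PR.

# Multipliers for the throughput units that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '235.3322 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumed convention: positive means the PR branch is faster.
    """
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


if __name__ == "__main__":
    # Two rows copied from the table above: (name, Ops, Ops on Repo HEAD).
    rows = [
        ("test_membership_stacked_nested_last", "235.3322 KOps/s", "73.7480 KOps/s"),
        ("test_serialize_weights_pickle", "1.3352 Ops/s", "2.5687 Ops/s"),
    ]
    for name, ops, ops_head in rows:
        print(f"{name}: {relative_change(ops, ops_head):+.1f}%")
```

Under this convention the first row comes out at roughly +219% and the second at roughly -48%. Most other rows differ from HEAD by a few percent, which may be within run-to-run noise for a single benchmark run.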

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1637ms | 17.6898μs | 56.5297 KOps/s | 60.8854 KOps/s | |
test_plain_set_stack_nested | 41.1900μs | 17.7776μs | 56.2507 KOps/s | 60.6439 KOps/s | |
test_plain_set_nested_inplace | 74.6210μs | 18.9007μs | 52.9082 KOps/s | 56.4611 KOps/s | |
test_plain_set_stack_nested_inplace | 72.2620μs | 18.6300μs | 53.6770 KOps/s | 56.6233 KOps/s | |
test_items | 25.4100μs | 2.9002μs | 344.8089 KOps/s | 345.3927 KOps/s | |
test_items_nested | 0.3830ms | 0.3362ms | 2.9744 KOps/s | 3.0132 KOps/s | |
test_items_nested_locked | 0.3929ms | 0.3372ms | 2.9655 KOps/s | 2.9965 KOps/s | |
test_items_nested_leaf | 88.8120μs | 62.7103μs | 15.9463 KOps/s | 16.0657 KOps/s | |
test_items_stack_nested | 0.4755ms | 0.3379ms | 2.9592 KOps/s | 2.9734 KOps/s | |
test_items_stack_nested_leaf | 0.2359ms | 62.7428μs | 15.9381 KOps/s | 15.4450 KOps/s | |
test_items_stack_nested_locked | 0.4583ms | 0.3375ms | 2.9629 KOps/s | 2.9494 KOps/s | |
test_keys | 32.2600μs | 3.4424μs | 290.4913 KOps/s | 288.0896 KOps/s | |
test_keys_nested | 0.1136ms | 70.7060μs | 14.1431 KOps/s | 14.1464 KOps/s | |
test_keys_nested_locked | 0.6791ms | 77.4147μs | 12.9174 KOps/s | 12.9019 KOps/s | |
test_keys_nested_leaf | 97.5510μs | 61.4295μs | 16.2788 KOps/s | 16.3025 KOps/s | |
test_keys_stack_nested | 0.1172ms | 71.1987μs | 14.0452 KOps/s | 14.0962 KOps/s | |
test_keys_stack_nested_leaf | 0.1021ms | 61.5193μs | 16.2551 KOps/s | 15.9433 KOps/s | |
test_keys_stack_nested_locked | 0.1218ms | 76.7747μs | 13.0251 KOps/s | 12.7952 KOps/s | |
test_values | 5.5500μs | 0.8404μs | 1.1899 MOps/s | 1.1974 MOps/s | |
test_values_nested | 79.3710μs | 48.6327μs | 20.5623 KOps/s | 20.3869 KOps/s | |
test_values_nested_locked | 79.8720μs | 50.0995μs | 19.9603 KOps/s | 19.7767 KOps/s | |
test_values_nested_leaf | 80.9710μs | 42.5270μs | 23.5145 KOps/s | 23.4246 KOps/s | |
test_values_stack_nested | 0.2152ms | 49.1564μs | 20.3432 KOps/s | 19.9649 KOps/s | |
test_values_stack_nested_leaf | 70.6510μs | 43.4242μs | 23.0287 KOps/s | 22.6811 KOps/s | |
test_values_stack_nested_locked | 95.1420μs | 50.8830μs | 19.6529 KOps/s | 19.2751 KOps/s | |
test_membership | 2.6855μs | 0.4991μs | 2.0038 MOps/s | 1.9671 MOps/s | |
test_membership_nested | 98.8365μs | 1.8133μs | 551.4778 KOps/s | 554.0662 KOps/s | |
test_membership_nested_leaf | 9.6770μs | 1.7790μs | 562.1261 KOps/s | 566.3554 KOps/s | |
test_membership_stacked_nested | 35.7010μs | 1.8581μs | 538.1740 KOps/s | 535.7654 KOps/s | |
test_membership_stacked_nested_leaf | 27.0910μs | 1.8915μs | 528.6776 KOps/s | 536.9076 KOps/s | |
test_membership_nested_last | 35.1910μs | 2.9664μs | 337.1056 KOps/s | 343.8391 KOps/s | |
test_membership_nested_leaf_last | 41.6600μs | 2.9382μs | 340.3445 KOps/s | 342.1291 KOps/s | |
test_membership_stacked_nested_last | 25.2210μs | 2.9568μs | 338.2086 KOps/s | 340.1810 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.3010μs | 2.9370μs | 340.4805 KOps/s | 344.3436 KOps/s | |
test_nested_getleaf | 34.7410μs | 6.0446μs | 165.4368 KOps/s | 165.4896 KOps/s | |
test_nested_get | 43.3610μs | 5.7137μs | 175.0179 KOps/s | 175.5579 KOps/s | |
test_stacked_getleaf | 27.4800μs | 6.0943μs | 164.0866 KOps/s | 166.4321 KOps/s | |
test_stacked_get | 31.1300μs | 5.6349μs | 177.4653 KOps/s | 174.7578 KOps/s | |
test_nested_getitemleaf | 26.3510μs | 6.1091μs | 163.6899 KOps/s | 164.1319 KOps/s | |
test_nested_getitem | 30.0610μs | 5.7284μs | 174.5675 KOps/s | 175.2622 KOps/s | |
test_stacked_getitemleaf | 28.7700μs | 6.1172μs | 163.4731 KOps/s | 165.4567 KOps/s | |
test_stacked_getitem | 31.0200μs | 5.7329μs | 174.4320 KOps/s | 175.8449 KOps/s | |
test_lock_nested | 7.6156ms | 0.4301ms | 2.3250 KOps/s | 2.3481 KOps/s | |
test_lock_stack_nested | 0.5093ms | 0.3881ms | 2.5764 KOps/s | 2.6016 KOps/s | |
test_unlock_nested | 0.7892ms | 0.3627ms | 2.7574 KOps/s | 2.7802 KOps/s | |
test_unlock_stack_nested | 0.4696ms | 0.3255ms | 3.0726 KOps/s | 3.1297 KOps/s | |
test_flatten_speed | 0.2134ms | 76.7503μs | 13.0293 KOps/s | 13.0379 KOps/s | |
test_unflatten_speed | 0.4091ms | 0.3202ms | 3.1231 KOps/s | 3.1741 KOps/s | |
test_common_ops | 1.6575ms | 1.2896ms | 775.4609 Ops/s | 806.6401 Ops/s | |
test_creation | 37.9810μs | 1.4510μs | 689.1667 KOps/s | 695.7918 KOps/s | |
test_creation_empty | 59.4820μs | 17.5425μs | 57.0046 KOps/s | 67.2994 KOps/s | |
test_creation_nested_1 | 46.4910μs | 19.1627μs | 52.1848 KOps/s | 58.7267 KOps/s | |
test_creation_nested_2 | 57.7610μs | 21.7878μs | 45.8972 KOps/s | 50.7321 KOps/s | |
test_clone | 0.1790ms | 28.5346μs | 35.0452 KOps/s | 33.8446 KOps/s | |
test_getitem[int] | 1.1880ms | 15.1418μs | 66.0424 KOps/s | 65.7709 KOps/s | |
test_getitem[slice_int] | 0.1244ms | 26.2482μs | 38.0979 KOps/s | 37.6833 KOps/s | |
test_getitem[range] | 0.2317ms | 0.1094ms | 9.1394 KOps/s | 8.9357 KOps/s | |
test_getitem[tuple] | 0.1504ms | 22.9572μs | 43.5594 KOps/s | 43.5604 KOps/s | |
test_getitem[list] | 0.2568ms | 99.4382μs | 10.0565 KOps/s | 10.1376 KOps/s | |
test_setitem_dim[int] | 67.6010μs | 43.5080μs | 22.9843 KOps/s | 22.9112 KOps/s | |
test_setitem_dim[slice_int] | 88.5810μs | 65.7914μs | 15.1995 KOps/s | 15.2788 KOps/s | |
test_setitem_dim[range] | 0.2617ms | 0.1268ms | 7.8836 KOps/s | 7.9355 KOps/s | |
test_setitem_dim[tuple] | 0.2134ms | 59.5028μs | 16.8059 KOps/s | 17.0563 KOps/s | |
test_setitem | 0.1990ms | 42.9986μs | 23.2566 KOps/s | 23.8764 KOps/s | |
test_set | 0.1937ms | 41.8168μs | 23.9138 KOps/s | 24.7804 KOps/s | |
test_set_shared | 0.3144ms | 53.0130μs | 18.8633 KOps/s | 18.5428 KOps/s | |
test_update | 0.2003ms | 52.5728μs | 19.0213 KOps/s | 20.0844 KOps/s | |
test_update_nested | 0.2224ms | 59.3081μs | 16.8611 KOps/s | 17.3795 KOps/s | |
test_update__nested | 0.1472ms | 62.0819μs | 16.1077 KOps/s | 15.9844 KOps/s | |
test_set_nested | 0.2638ms | 47.3830μs | 21.1046 KOps/s | 22.9837 KOps/s | |
test_set_nested_new | 0.2358ms | 50.7539μs | 19.7029 KOps/s | 21.3925 KOps/s | |
test_select | 0.2360ms | 64.6775μs | 15.4613 KOps/s | 16.7006 KOps/s | |
test_select_nested | 0.4869ms | 41.7893μs | 23.9296 KOps/s | 24.4527 KOps/s | |
test_exclude_nested | 0.1084ms | 59.0990μs | 16.9208 KOps/s | 16.9253 KOps/s | |
test_empty[True] | 0.3206ms | 0.2557ms | 3.9109 KOps/s | 3.9106 KOps/s | |
test_empty[False] | 17.7793μs | 0.7281μs | 1.3735 MOps/s | 1.3453 MOps/s | |
test_to | 0.1201ms | 26.3594μs | 37.9372 KOps/s | 37.8977 KOps/s | |
test_to_nonblocking | 69.0610μs | 25.1185μs | 39.8113 KOps/s | 38.8857 KOps/s | |
test_unbind_speed | 0.4586ms | 0.2757ms | 3.6274 KOps/s | 3.6899 KOps/s | |
test_unbind_speed_stack0 | 0.3749ms | 0.2712ms | 3.6871 KOps/s | 3.6753 KOps/s | |
test_unbind_speed_stack1 | 92.5780ms | 0.7126ms | 1.4034 KOps/s | 1.4270 KOps/s | |
test_split | 92.8875ms | 2.1325ms | 468.9259 Ops/s | 472.3447 Ops/s | |
test_chunk | 96.3681ms | 2.2046ms | 453.6065 Ops/s | 469.5512 Ops/s | |
test_to[False] | 3.7588ms | 3.4006ms | 294.0681 Ops/s | 297.8240 Ops/s | |
test_to[True] | 4.7324ms | 4.4035ms | 227.0909 Ops/s | 228.3179 Ops/s | |
test_to_njt[False] | 0.3279s | 0.2509s | 3.9850 Ops/s | 4.0666 Ops/s | |
test_to_njt[True] | 0.3582s | 0.2778s | 3.5994 Ops/s | 3.6194 Ops/s | |
test_creation[device0] | 0.4342ms | 0.1258ms | 7.9477 KOps/s | 7.9421 KOps/s | |
test_creation_from_tensor | 0.4193ms | 0.1293ms | 7.7346 KOps/s | 7.7984 KOps/s | |
test_add_one[memmap_tensor0] | 0.1333ms | 8.8446μs | 113.0634 KOps/s | 112.8391 KOps/s | |
test_contiguous[memmap_tensor0] | 80.9310μs | 2.1173μs | 472.3065 KOps/s | 465.9531 KOps/s | |
test_stack[memmap_tensor0] | 39.2110μs | 6.4629μs | 154.7298 KOps/s | 156.9667 KOps/s | |
test_memmaptd_index | 1.1843ms | 0.4239ms | 2.3589 KOps/s | 2.3747 KOps/s | |
test_memmaptd_index_astensor | 0.7618ms | 0.4949ms | 2.0206 KOps/s | 2.0282 KOps/s | |
test_memmaptd_index_op | 1.5434ms | 1.0883ms | 918.8731 Ops/s | 974.8152 Ops/s | |
test_serialize_model | 0.1319s | 0.1305s | 7.6600 Ops/s | 7.6728 Ops/s | |
test_serialize_model_pickle | 1.3490s | 1.1876s | 0.8420 Ops/s | 0.8402 Ops/s | |
test_serialize_weights | 0.1311s | 0.1303s | 7.6773 Ops/s | 7.7015 Ops/s | |
test_serialize_weights_returnearly | 0.2222s | 56.9991ms | 17.5441 Ops/s | 15.7963 Ops/s | |
test_serialize_weights_pickle | 1.3527s | 1.2129s | 0.8245 Ops/s | 0.8324 Ops/s | |
test_reshape_pytree | 0.1630ms | 34.7005μs | 28.8181 KOps/s | 28.6077 KOps/s | |
test_reshape_td | 89.9520μs | 41.2361μs | 24.2506 KOps/s | 24.3376 KOps/s | |
test_view_pytree | 0.1332ms | 34.2682μs | 29.1815 KOps/s | 28.7949 KOps/s | |
test_view_td | 87.7310μs | 43.9431μs | 22.7567 KOps/s | 21.3141 KOps/s | |
test_unbind_pytree | 0.1763ms | 33.8412μs | 29.5498 KOps/s | 29.5145 KOps/s | |
test_unbind_td | 0.4257ms | 42.4783μs | 23.5414 KOps/s | 23.6906 KOps/s | |
test_split_pytree | 0.5942ms | 45.8136μs | 21.8276 KOps/s | 21.9421 KOps/s | |
test_split_td | 93.6640ms | 63.4429μs | 15.7622 KOps/s | 18.5020 KOps/s | |
test_add_pytree | 0.2027ms | 56.5744μs | 17.6758 KOps/s | 17.8620 KOps/s | |
test_add_td | 0.2549ms | 96.0126μs | 10.4153 KOps/s | 10.9206 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3089ms | 0.1584ms | 6.3149 KOps/s | 6.2159 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3055ms | 0.1599ms | 6.2552 KOps/s | 6.2453 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.3021ms | 0.1531ms | 6.5312 KOps/s | 6.4461 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3286ms | 0.1831ms | 5.4629 KOps/s | 5.4490 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1590ms | 21.7880μs | 45.8968 KOps/s | 46.0674 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1405ms | 48.3793μs | 20.6700 KOps/s | 20.9113 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2444ms | 62.6911μs | 15.9512 KOps/s | 15.6860 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1931ms | 49.1400μs | 20.3500 KOps/s | 20.1906 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4560ms | 0.3107ms | 3.2187 KOps/s | 3.2177 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3686ms | 0.2318ms | 4.3145 KOps/s | 4.3578 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2735ms | 0.1256ms | 7.9614 KOps/s | 7.9741 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2244ms | 64.7880μs | 15.4350 KOps/s | 15.4918 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4668ms | 0.3187ms | 3.1381 KOps/s | 3.1272 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7660ms | 0.6156ms | 1.6244 KOps/s | 1.6172 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4677ms | 0.2827ms | 3.5368 KOps/s | 3.5783 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4349ms | 0.3112ms | 3.2133 KOps/s | 3.1087 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2189ms | 75.7743μs | 13.1971 KOps/s | 12.1621 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2777ms | 0.1269ms | 7.8779 KOps/s | 7.3264 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6834ms | 0.5253ms | 1.9037 KOps/s | 1.8303 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4645ms | 0.3196ms | 3.1290 KOps/s | 3.0342 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1334ms | 19.4298μs | 51.4674 KOps/s | 47.2139 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 77.3610μs | 38.0273μs | 26.2969 KOps/s | 26.1093 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1261ms | 70.1637μs | 14.2524 KOps/s | 14.1189 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1638ms | 51.3579μs | 19.4712 KOps/s | 19.0907 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3077ms | 0.8008ms | 1.2488 KOps/s | 1.1362 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5190ms | 3.2187ms | 310.6821 Ops/s | 303.0638 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3629ms | 0.8207ms | 1.2185 KOps/s | 1.1207 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.3842ms | 3.2041ms | 312.1005 Ops/s | 312.7702 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2679ms | 0.1175ms | 8.5113 KOps/s | 8.1763 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2128ms | 59.1716μs | 16.9000 KOps/s | 15.7709 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2381ms | 0.1124ms | 8.8956 KOps/s | 8.5857 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2299ms | 45.3815μs | 22.0354 KOps/s | 21.3154 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2956ms | 0.1180ms | 8.4775 KOps/s | 8.3857 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2290ms | 45.4372μs | 22.0084 KOps/s | 21.5222 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2909ms | 0.1435ms | 6.9698 KOps/s | 6.9841 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1484ms | 23.9523μs | 41.7497 KOps/s | 41.1004 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2632ms | 0.1375ms | 7.2739 KOps/s | 7.2893 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1126ms | 20.0399μs | 49.9006 KOps/s | 49.2196 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2896ms | 0.1391ms | 7.1901 KOps/s | 7.2442 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1339ms | 19.8531μs | 50.3699 KOps/s | 50.8165 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2968ms | 0.1444ms | 6.9247 KOps/s | 6.9152 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4731ms | 23.7258μs | 42.1482 KOps/s | 41.2025 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2858ms | 0.1392ms | 7.1856 KOps/s | 7.2508 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1361ms | 19.7498μs | 50.6334 KOps/s | 49.7908 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2889ms | 0.1386ms | 7.2170 KOps/s | 7.2412 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1263ms | 19.9155μs | 50.2121 KOps/s | 50.3694 KOps/s | |
test_mod_add[eager] | 0.1924ms | 33.1722μs | 30.1457 KOps/s | 31.0852 KOps/s | |
test_mod_add[compile] | 0.3152ms | 78.7450μs | 12.6992 KOps/s | 12.1245 KOps/s | |
test_mod_add[compile-overhead] | 0.2997ms | 0.1489ms | 6.7144 KOps/s | 6.2073 KOps/s | |
test_mod_wrap[eager] | 0.3900ms | 0.2407ms | 4.1548 KOps/s | 3.9445 KOps/s | |
test_mod_wrap[compile] | 1.4589ms | 0.3075ms | 3.2522 KOps/s | 3.3941 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4154ms | 4.0179ms | 248.8889 Ops/s | 244.6695 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5238ms | 1.3514ms | 739.9906 Ops/s | 689.6074 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7413ms | 1.3333ms | 750.0291 Ops/s | 697.1200 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3158ms | 0.9010ms | 1.1099 KOps/s | 990.1441 Ops/s | |
test_seq_add[eager] | 0.5009ms | 98.6426μs | 10.1376 KOps/s | 10.3022 KOps/s | |
test_seq_add[compile] | 0.5100ms | 88.3195μs | 11.3225 KOps/s | 10.8011 KOps/s | |
test_seq_add[compile-overhead] | 0.3010ms | 0.1230ms | 8.1282 KOps/s | 8.1033 KOps/s | |
test_seq_wrap[eager] | 0.5825ms | 0.3919ms | 2.5516 KOps/s | 2.5484 KOps/s | |
test_seq_wrap[compile] | 0.5186ms | 0.3261ms | 3.0663 KOps/s | 3.0702 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3798ms | 0.2267ms | 4.4118 KOps/s | 4.5869 KOps/s | |
test_func_call_runtime[False-eager] | 0.9846ms | 0.7748ms | 1.2906 KOps/s | 1.3809 KOps/s | |
test_func_call_runtime[False-compile] | 1.0079ms | 0.8263ms | 1.2102 KOps/s | 1.2731 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5135ms | 0.3626ms | 2.7579 KOps/s | 2.7789 KOps/s | |
test_func_call_runtime[True-eager] | 1.2717ms | 0.8969ms | 1.1149 KOps/s | 1.1260 KOps/s | |
test_func_call_runtime[True-compile] | 1.0014ms | 0.7999ms | 1.2502 KOps/s | 1.2375 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5433ms | 0.3748ms | 2.6681 KOps/s | 2.6576 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8617ms | 0.7188ms | 1.3912 KOps/s | 1.3775 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1617ms | 0.7852ms | 1.2736 KOps/s | 1.2715 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7438ms | 0.3539ms | 2.8260 KOps/s | 2.8171 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3989ms | 0.9959ms | 1.0042 KOps/s | 991.2650 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.2312ms | 0.8275ms | 1.2085 KOps/s | 1.1987 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5316ms | 0.3961ms | 2.5248 KOps/s | 2.4963 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5386ms | 2.0997ms | 476.2654 Ops/s | 473.0720 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.2400ms | 0.8436ms | 1.1853 KOps/s | 1.1838 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8013ms | 0.4049ms | 2.4700 KOps/s | 2.4781 KOps/s | |
test_distributed | 4.4992ms | 0.2322ms | 4.3070 KOps/s | 8.8372 KOps/s | |
test_tdmodule | 28.1700μs | 15.2932μs | 65.3884 KOps/s | 69.8428 KOps/s | |
test_tdmodule_dispatch | 50.6710μs | 30.2274μs | 33.0825 KOps/s | 36.4342 KOps/s | |
test_tdseq | 35.2710μs | 16.5614μs | 60.3814 KOps/s | 65.0613 KOps/s | |
test_tdseq_dispatch | 62.1110μs | 33.2258μs | 30.0971 KOps/s | 33.1031 KOps/s | |
test_instantiation_functorch | 2.2336ms | 1.8225ms | 548.6827 Ops/s | 543.8202 Ops/s | |
test_exec_functorch | 0.3309ms | 0.2043ms | 4.8947 KOps/s | 4.8317 KOps/s | |
test_exec_functional_call | 0.6012ms | 0.2047ms | 4.8843 KOps/s | 4.8198 KOps/s | |
test_exec_td_decorator | 0.6779ms | 0.2563ms | 3.9020 KOps/s | 3.8390 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0621ms | 0.6836ms | 1.4628 KOps/s | 1.4680 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1051ms | 0.6862ms | 1.4572 KOps/s | 1.4403 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7516ms | 0.5996ms | 1.6678 KOps/s | 1.6434 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0052ms | 0.6001ms | 1.6665 KOps/s | 1.6015 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.2125ms | 19.7356ms | 50.6698 Ops/s | 51.1531 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.1049ms | 19.7525ms | 50.6264 Ops/s | 51.2101 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.9612ms | 19.6167ms | 50.9770 Ops/s | 51.6202 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.7731ms | 19.6319ms | 50.9374 Ops/s | 51.3444 Ops/s | |
test_to_module_speed[True] | 1.4206ms | 0.9893ms | 1.0109 KOps/s | 1.0187 KOps/s | |
test_to_module_speed[False] | 1.3863ms | 0.9720ms | 1.0288 KOps/s | 1.0126 KOps/s | |
test_tc_init | 62.4410μs | 39.4328μs | 25.3596 KOps/s | 29.1692 KOps/s | |
test_tc_init_nested | 0.4629ms | 79.2509μs | 12.6182 KOps/s | 13.9666 KOps/s | |
test_tc_first_layer_tensor | 53.9066μs | 0.6744μs | 1.4827 MOps/s | 1.4884 MOps/s | |
test_tc_first_layer_nontensor | 23.2100μs | 2.1946μs | 455.6650 KOps/s | 456.4518 KOps/s | |
test_tc_second_layer_tensor | 95.5618μs | 1.3520μs | 739.6205 KOps/s | 732.3245 KOps/s | |
test_tc_second_layer_nontensor | 30.9800μs | 2.8984μs | 345.0211 KOps/s | 345.0787 KOps/s | |
test_unbind | 0.1941s | 9.5396ms | 104.8259 Ops/s | 92.9299 Ops/s | |
test_full_like | 0.7717ms | 0.5737ms | 1.7431 KOps/s | 1.7434 KOps/s | |
test_zeros_like | 0.3322ms | 0.1983ms | 5.0431 KOps/s | 5.0487 KOps/s | |
test_ones_like | 0.5528ms | 0.1980ms | 5.0505 KOps/s | 5.0471 KOps/s | |
test_clone | 0.7770ms | 0.4151ms | 2.4090 KOps/s | 2.4099 KOps/s | |
test_squeeze | 0.3784ms | 9.5273μs | 104.9615 KOps/s | 105.6544 KOps/s | |
test_unsqueeze | 0.2716ms | 72.2251μs | 13.8456 KOps/s | 13.8902 KOps/s | |
test_split | 0.5278ms | 0.1545ms | 6.4740 KOps/s | 6.5180 KOps/s | |
test_permute | 0.2349ms | 0.1768ms | 5.6562 KOps/s | 5.7346 KOps/s | |
test_stack | 1.3477ms | 0.8700ms | 1.1494 KOps/s | 1.1459 KOps/s | |
test_cat | 1.3843ms | 1.2317ms | 811.8998 Ops/s | 811.8645 Ops/s |
Co-authored-by: Shagun Sodhani
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 25f5d9a2802f68755da022607d2f937cd89d8eed Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: ddd498c547f0f1ff8aee01d0990061bfff5502eb Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 0f016257c2c3ad24f71bb8b6340b150d631f339d Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: c0f61164cb144e5bbc750a697a3920dfce461dc9 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 317a4b9ae62ab8722e2c62157f2045bc7fa293b7 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: c69db5f76f5a73f87ff67c949b5fba32ce6cdffd Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: e6b2c507cbe38357d56ab71457be43ecc53ee57f Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: b924c0d94db3e1b59f48b9fa22b98f4cfe89d6b9 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: 6e1adc6fb6cea6a5759d1bd6d81939ef91012dd7 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: bb7f342eee12ef61224c32aa0df6e93efaa1b117 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 17, 2024
ghstack-source-id: 42cfea2f3c838fb020932ab0439795a8f8d55354 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: f511f9113302c6bc9cc5602db5aab08157e0b52a Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 36af01f9a65c27d21d7478748132780b86bd2983 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: dd7514e601b26611cec0859144a874ca67018437 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 88eadcc7364bbbcb22180d3906c71fcc005b9908 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 8162b55c624471f1e4182b62382edd8a2a74afab Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: dea67cba6a5e6b33a22a225cf48bd35afb439fac Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: da7f3e485de8b00d997e742a0aeba290e0b84f4f Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 27457c6c4bf04316c795260b65784109bded92b3 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: b2152121983460bcc14289af8dacf475832794c8 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 1363a4d8974f598f41a2b1495924e14d56ec1b21 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: 805fc59a7df68b2fc15fa99ba141373cd983a21a Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: f1f6ed823acf899a2c45b391063ca8b483147256 Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 18, 2024
ghstack-source-id: e453e6e14c252bd271d89e1a86748dc96622aa7d Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 23, 2024
ghstack-source-id: b4a897a2df48b274ffab02096a48610a074635ac Pull Request resolved: #1041
vmoens added a commit that referenced this pull request on Oct 24, 2024
ghstack-source-id: 55de7c9301c0d22b39e22b44dff553d4fac5adfe Pull Request resolved: #1041
Labels
CLA Signed
Performance
Stack from ghstack (oldest at bottom):