Generate GPU vs CPU usage metrics per pytest file in pandas testsuite for `cudf.pandas` #16739

galipremsagar · 2024-09-04T15:19:42Z

Description

This PR adds GPU and CPU usage reporting to the `cudf.pandas` pytest suite. The generated metrics are available in the existing pandas pytest summary page:
https://github.com/rapidsai/cudf/actions/runs/10886370333/attempts/1#summary-30220192117

Note: I'm aware of cases where both GPU and CPU usage show 0%. This happens for several reasons, which I'm working on addressing in a follow-up PR.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

galipremsagar

Note: I'm aware of cases where both GPU and CPU usage show 0%. This happens for several reasons, which I'm working on addressing in a follow-up PR.

Matt711

I noticed that some of the tests have 0% reported for both CPU and GPU usage. I guess we aren't catching some of the dummy function calls (e.g. pr_df['_slow_function_call'] = 0)?

Edit: My apologies, you're aware of this already.

mroeschke · 2024-09-17T21:28:52Z

ci/cudf_pandas_scripts/pandas-tests/job-summary.py

+pr_df['CPU Usage'] = pr_df['CPU Usage'].astype(str) + '%'
+pr_df['GPU Usage'] = pr_df['GPU Usage'].astype(str) + '%'
+
+pr_df['CPU Usage'] = pr_df['CPU Usage'].replace('nan%', '0%')


Suggestion: IMO would be better to fillna(0) before the astype(str) so we don't have to do this replace
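
A minimal sketch of this suggestion, using a small stand-in `pr_df` (the file names and values are made up for illustration and are not taken from the CI run). Filling missing values first means the string conversion never produces `nan%`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real pr_df; values are illustrative only.
pr_df = pd.DataFrame(
    {"CPU Usage": [12.5, np.nan], "GPU Usage": [87.5, np.nan]},
    index=["test_a.py", "test_b.py"],
)

for col in ["CPU Usage", "GPU Usage"]:
    # Fill missing percentages with 0 before converting to strings,
    # so no follow-up replace('nan%', '0%') is needed.
    pr_df[col] = pr_df[col].fillna(0).astype(str) + "%"

print(pr_df)
```

One difference to be aware of: on a float column the filled value renders as `0.0%` rather than `0%`.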

mroeschke · 2024-09-17T21:29:32Z

ci/cudf_pandas_scripts/pandas-tests/job-summary.py

@@ -68,8 +68,20 @@ def emoji_failed(x):
 pr_df = pd.DataFrame.from_dict(pr_results, orient="index").sort_index()
 main_df = pd.DataFrame.from_dict(main_results, orient="index").sort_index()
 diff_df = pr_df - main_df
+pr_df['CPU Usage'] = ((pr_df['_slow_function_call']/(pr_df['_slow_function_call'] + pr_df['_fast_function_call']))*100.0).round(1)


Nit: Could you put the denominator calculation on its own line?
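
A sketch of what the requested change might look like, with hypothetical call counts standing in for the real `pr_df`. The name `total_calls` and the symmetric `GPU Usage` line are assumptions for illustration; only the `CPU Usage` expression appears in the diff above.

```python
import pandas as pd

# Hypothetical per-file call counts; the real pr_df is built from CI results.
pr_df = pd.DataFrame(
    {"_slow_function_call": [1, 0, 3], "_fast_function_call": [3, 0, 0]},
    index=["test_a.py", "test_b.py", "test_c.py"],
)

# Denominator on its own line, as requested in the review.
total_calls = pr_df["_slow_function_call"] + pr_df["_fast_function_call"]

pr_df["CPU Usage"] = ((pr_df["_slow_function_call"] / total_calls) * 100.0).round(1)
# Assumption: GPU usage is computed symmetrically from the fast-call count.
pr_df["GPU Usage"] = ((pr_df["_fast_function_call"] / total_calls) * 100.0).round(1)

print(pr_df[["CPU Usage", "GPU Usage"]])
```

In this sketch, a file with no recorded calls (`test_b.py`) divides 0 by 0, which pandas evaluates to NaN. That is one way a missing percentage can arise, though it may not be the only one in the real data.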

galipremsagar · 2024-09-19T12:50:00Z

> I noticed that some of the tests have 0% reported for both CPU and GPU usage. I guess we aren't catching some of the dummy function calls (e.g. pr_df['_slow_function_call'] = 0)?
>
> Edit: My apologies, you're aware of this already.

@Matt711 I pushed a fix to this PR that improves the missing reports; there should be only a few such cases now.

Matt711

Thanks! Everything LGTM. Before I approve, I'd like to see the final table.

galipremsagar · 2024-09-19T16:56:08Z

> Thanks! Everything LGTM. Before I approve, I'd like to see the final table.

Final table is here: https://github.com/rapidsai/cudf/actions/runs/10941284785/attempts/1#summary-30389636693

galipremsagar · 2024-09-19T17:06:49Z

/merge

working conftest

28bf38e

github-actions bot added labels Python (Affects Python cuDF API.) and cudf.pandas (Issues specific to cudf.pandas) Sep 4, 2024

galipremsagar added 2 commits September 9, 2024 20:51

Merge remote-tracking branch 'upstream/branch-24.10' into gpu_cpu_metrics

a2e25e2

enable logging

a6b3de5

galipremsagar added labels non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) Sep 9, 2024

galipremsagar added 23 commits September 9, 2024 21:58

test

0206872

test

f4364f8

Merge branch 'branch-24.10' into gpu_cpu_metrics

77731bb

test

c6a44a1

Merge branch 'gpu_cpu_metrics' of https://github.com/galipremsagar/cudf into gpu_cpu_metrics

80f628b

test

2cc6e0d

test

ab5ba4e

test

3b7d740

Merge remote-tracking branch 'upstream/branch-24.10' into gpu_cpu_metrics

695bf30

test

264a444

test

337cef8

test

5efca92

test

5e6ec98

test

2200ec2

Merge branch 'branch-24.10' into gpu_cpu_metrics

3ac06df

test

1b7b5a9

Merge branch 'gpu_cpu_metrics' of https://github.com/galipremsagar/cudf into gpu_cpu_metrics

3702b4c

test

b0e4955

test

d2344dc

cleanup

32d3a30

update and cleanup

84c58e1

revert

c4f4cbf

cleanup

23545de

galipremsagar changed the title ~~test pr~~ Generate GPU vs CPU usage metrics per pytest file in pandas testsuite for cudf.pandas Sep 16, 2024

galipremsagar marked this pull request as ready for review September 16, 2024 19:43

galipremsagar requested review from a team as code owners September 16, 2024 19:43

galipremsagar requested review from raydouglass, isVoid and charlesbluca September 16, 2024 19:43

galipremsagar mentioned this pull request Sep 16, 2024

Split pandas pytests to prepare for GPU vs CPU metrics reporting #16743

Closed

3 tasks

Merge remote-tracking branch 'upstream/branch-24.10' into gpu_cpu_metrics

cf4a3f4

galipremsagar self-assigned this Sep 16, 2024

galipremsagar commented Sep 16, 2024

Matt711 reviewed Sep 16, 2024

galipremsagar added 2 commits September 17, 2024 12:36

Merge remote-tracking branch 'upstream/branch-24.10' into gpu_cpu_metrics

ed2bea6

improve

2a86326

galipremsagar requested a review from mroeschke September 17, 2024 17:24

mroeschke reviewed Sep 17, 2024

python/cudf/cudf/pandas/scripts/conftest-patch.py (outdated, resolved)

mroeschke reviewed Sep 17, 2024

python/cudf/cudf/pandas/scripts/conftest-patch.py (resolved)

galipremsagar added 2 commits September 19, 2024 12:45

accurately extract metrics

ffdd4d3

Merge remote-tracking branch 'upstream/branch-24.10' into gpu_cpu_metrics

15be6ff

galipremsagar requested review from mroeschke and Matt711 September 19, 2024 12:50

Matt711 reviewed Sep 19, 2024

python/cudf/cudf/pandas/scripts/conftest-patch.py (resolved)

Matt711 approved these changes Sep 19, 2024

raydouglass approved these changes Sep 19, 2024

rapids-bot bot merged commit dafb3e7 into rapidsai:branch-24.10 Sep 19, 2024
100 checks passed
