Try using BitArray for cloud masks #451

Open
wants to merge 1 commit into
base: main

Conversation

charleskawczynski
Member

I'm curious if this works on GPUs

@sriharshakandala
Member

Does this improve performance?

@charleskawczynski
Member Author

> Does this improve performance?

🤷🏻
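
For context on why a `BitArray` could matter here, a minimal sketch of how its storage differs from a plain `Bool` array. The 1-D mask shape is an assumption for illustration (only `ncols` comes from the benchmark log), and `chunks` is an internal field of `BitArray`, read here only to inspect its size:

```julia
# Column count taken from the benchmark log; the 1-D shape is an assumption.
n = 131658

mask_bool = fill(true, n)   # Vector{Bool}: one byte per element
mask_bits = trues(n)        # BitVector: one bit per element, packed into UInt64 chunks

# Storage in bytes for each representation.
# `chunks` is an internal field, accessed here for illustration only.
bytes_bool = sizeof(mask_bool)
bytes_bits = sizeof(mask_bits.chunks)
println("Vector{Bool}: ", bytes_bool, " bytes")
println("BitVector:    ", bytes_bits, " bytes")

# A BitArray is a mutable struct wrapping a Vector{UInt64}, so it is not an isbits type.
println("isbits: ", isbitstype(typeof(mask_bits)))
```

The packed form uses roughly one eighth of the memory, at the cost of bit shifting and masking on every element access. Because it is not an isbits type, it may need some conversion before it can be used inside a GPU kernel.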

@sriharshakandala
Copy link
Member

Current main:

julia --project=gpuenv test/all_sky_tuning.jl 
device = ClimaComms.CUDADevice(); FT = Float64, ncols = 131658; size per field = 0.04119899868965149 GB
"timing longwave solver" = "timing longwave solver"
  1.159210 seconds (66 CPU allocations: 14.969 KiB)
  1.158549 seconds (65 CPU allocations: 14.891 KiB)
  1.158072 seconds (66 CPU allocations: 15.000 KiB)
  1.157427 seconds (45 CPU allocations: 13.094 KiB)
  1.157513 seconds (45 CPU allocations: 13.094 KiB)
"timing shortwave solver" = "timing shortwave solver"
  0.863498 seconds (51 CPU allocations: 13.469 KiB)
  0.862782 seconds (51 CPU allocations: 13.469 KiB)
  0.863073 seconds (51 CPU allocations: 13.469 KiB)
  0.862254 seconds (51 CPU allocations: 13.469 KiB)
  0.864140 seconds (51 CPU allocations: 13.469 KiB)
 39.751985 seconds (97.94 M allocations: 5.623 GiB, 4.70% gc time, 54.58% compilation time: 1% of which was recompilation)

This branch:

julia --project=gpuenv test/all_sky_tuning.jl 
device = ClimaComms.CUDADevice(); FT = Float64, ncols = 131658; size per field = 0.04119899868965149 GB
"timing longwave solver" = "timing longwave solver"
  1.160132 seconds (66 CPU allocations: 14.969 KiB)
  1.157305 seconds (65 CPU allocations: 14.891 KiB)
  1.156251 seconds (66 CPU allocations: 15.000 KiB)
  1.156213 seconds (45 CPU allocations: 13.094 KiB)
  1.157813 seconds (45 CPU allocations: 13.094 KiB)
"timing shortwave solver" = "timing shortwave solver"
  0.863435 seconds (51 CPU allocations: 13.469 KiB)
  0.861900 seconds (51 CPU allocations: 13.469 KiB)
  0.863441 seconds (51 CPU allocations: 13.469 KiB)
  0.861509 seconds (51 CPU allocations: 13.469 KiB)
  0.863258 seconds (51 CPU allocations: 13.469 KiB)
 39.466558 seconds (97.92 M allocations: 5.624 GiB, 4.58% gc time, 53.51% compilation time: 1% of which was recompilation)
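
To make the comparison explicit, a small sketch that averages the five timed runs per solver and reports the relative change of the mean. The timings are copied from the two logs above; the helper name `rel_change` is hypothetical:

```julia
using Statistics

# Timings in seconds, copied from the logs above.
longwave_main    = [1.159210, 1.158549, 1.158072, 1.157427, 1.157513]
longwave_branch  = [1.160132, 1.157305, 1.156251, 1.156213, 1.157813]
shortwave_main   = [0.863498, 0.862782, 0.863073, 0.862254, 0.864140]
shortwave_branch = [0.863435, 0.861900, 0.863441, 0.861509, 0.863258]

# Relative change of the mean, branch vs. main (negative means the branch is faster).
rel_change(main, branch) = (mean(branch) - mean(main)) / mean(main)

lw = rel_change(longwave_main, longwave_branch)
sw = rel_change(shortwave_main, shortwave_branch)

println("longwave:  ", round(100 * lw; digits = 3), " %")
println("shortwave: ", round(100 * sw; digits = 3), " %")
```

For both solvers the means differ by less than 0.1%, which is about the same size as the spread between individual runs, so these numbers show no measurable speedup or slowdown on this device.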
