Wrong function call in model_parallel_v2 #4799

Open
juna962 opened this issue Dec 17, 2024 · 0 comments


juna962 commented Dec 17, 2024

Hello everyone,

There is a bug in the following Python file:
training/distributed_training/pytorch/model_parallel_v2/shared-scripts/logging_utils.py

At line 151, the code reads:

    avg_tflops = compute_tflops(avg_throughput, num_params, world_size, batch_seqlen)

However, the function definition in:
training/distributed_training/pytorch/model_parallel_v2/shared-scripts/train_utils.py
at line 36 is as follows:

    def compute_tflops(args, global_batch_size, step_time, world_size):

The call and the definition do not match: the call passes avg_throughput, num_params, world_size, and batch_seqlen, while the definition expects args, global_batch_size, step_time, and world_size. The call should be updated to match the definition, with args passed as the first argument. Alternatively, a new function could be defined that accepts the arguments currently being passed.
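To make the mismatch concrete, here is a minimal sketch of how the four positional values would bind to the parameters. The stub only reuses the signature quoted above; its body is a placeholder, and the helper name positional_binding is hypothetical. The call-site names are passed as strings purely for illustration.

```python
import inspect


# Stub with the signature quoted from train_utils.py. The body is a
# placeholder (assumption), not the real implementation.
def compute_tflops(args, global_batch_size, step_time, world_size):
    return None


def positional_binding(func, *call_values):
    """Map each parameter name of func to the value it receives positionally."""
    bound = inspect.signature(func).bind(*call_values)
    return dict(bound.arguments)


# Argument names from the call in logging_utils.py, given as strings so the
# binding can be printed without needing the real values.
binding = positional_binding(
    compute_tflops, "avg_throughput", "num_params", "world_size", "batch_seqlen"
)

for param, value in binding.items():
    print(f"{param} <- {value}")
```

Because both the call and the definition take four positional arguments, Python raises no TypeError at the call itself. Every value simply lands in a parameter with a different meaning, for example world_size ends up in step_time.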
