Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Closes #1431) Initial changes to add the canonical loop trans abstract class #2190

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

LonelyCat124
Copy link
Collaborator

@LonelyCat124 LonelyCat124 commented Jun 23, 2023

Having some issues with the testing here, so I wanted to make this PR for potential discussion on why tests are failing if I can't work it out.

The canonical loop validation testing is not yet implemented either.

@LonelyCat124 LonelyCat124 changed the title Initial changes to add the canonical loop trans abstract class (Closes #1431) Initial changes to add the canonical loop trans abstract class Jun 23, 2023
@LonelyCat124
Copy link
Collaborator Author

Ok, I missed some cases I didn't think about.
In one of the Nemo examples, somewhere we get a NemoLoop with a start variable that is formed of:
BinaryOperation(LBOUND(zwx, 2))

The loop is created from:

zwx(:,:,jpk) = 0.e0

and transformed by PSyclone to:

do idx = LBOUND(zwx, 2), UBOUND(zwx, 2), 1
  do idx_1 = LBOUND(zwx, 1), UBOUND(zwx, 1), 1
    zwx(idx_1,idx,jpk) = 0.e0
  enddo
enddo

This looks to be ok w.r.t canonical loop form, but is currently not handled correctly.
I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case.
I could ignore BinaryOperations containing UBOUND/LBOUND explicitly, but I'm not sure if that is the correct approach.

@arporter
Copy link
Member

arporter commented Jun 23, 2023

I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case.

This is something we've hit elsewhere. Call nodes now have the is_pure and is_elemental properties. @sergisiso has suggested we add another one, something like enquiry_fn that tells us that the routine doesn't actually access/care about the data in its argument. That could then naturally be handled in the dependence analysis rather than having to special case.
EDIT: in fact, I think the dependence analysis has a flag that can control what it does about these statements. @hiker will know if he's awake at the moment.

See e.g. https://psyclone-ref.readthedocs.io/en/latest/_static/html/classpsyclone_1_1psyir_1_1tools_1_1dependency__tools_1_1DependencyTools.html#a619bd6530f6be092fc4c873b017e332c

node.children[0].operator == BinaryOperation.Operator.MUL and
(node.children[0].children[0].symbol == loop_var or
node.children[0].children[1].symbol == loop_var)):
return True
Copy link
Collaborator

@hiker hiker Jun 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests still don't feel right. Too many very similar tests, and I am not even sure that they are complete (what about -a1\*var-a2 or +a1\*var+a2, which from memory adds another UnaryOperation node).
SymPy offers a polynomial class, which gives you access to the coefficients, and for the example above at least it will convert them all to a similar structure:

>>> sp.Poly(a1 * var - a2, var).all_coeffs()
[a1, -a2]
>>> sp.Poly(-a1 * var - a2, var).all_coeffs()
[-a1, -a2]
>>> sp.Poly(a2 - a1 * var,var).all_coeffs()
[-a1, a2]

etc. Unfortunately, it does still first simplify the expressions, so

>>> sp.Poly(a2 - a1 * var + 2*var,var).all_coeffs()
[2 - a1, a2]

Could we perhaps in a case like this rewrite the PSyIR with the equivalent expression (psyad is already doing this when expanding terms). Or you could additionally count the number of references and symbols in the original PSyIR, which should be less than or equal to 4? I checked, even in the SymPy internal representation the -a1 is a special node (multiplication of a1 by -1, so that doesn't really help).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked again and my interpretation of the OpenMP standard was also wrong, in that a1 and a2 simply need to be "loop invariant integer expressions", and not simply variables.
I think I should have already checked that any Reference objects in these terms are loop invariant in validate before reaching this function, so what actually needs to be done is:

  1. Check we have some structure that could be handled as one of the allowed structures.
  2. Check that the a1 and a2 expressions (if present) are integer expressions.
    I wonder if 2 would have been handled already during input/tree construction or if we can just ignore it.
    I don't have a good idea how to check the structure conforms though, maybe the sp.Poly class you recommend will work straight out the box? But I'm not clear how to use this - could you provide a quick example?

I'm trying to work out what this actually disallows, and I think any operation involving all integer values/references, and only +-* operations is allowed (since you could transform x*y*var_outer into a1*var_outer since a1=x*y. Equally division (maybe any integer-safe) operations should be ok provided they are not performed on var_outer. Does that make sense or have I missed something?

@hiker
Copy link
Collaborator

hiker commented Jun 24, 2023

I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case.
The loop is created from:

zwx(:,:,jpk) = 0.e0

and transformed by PSyclone to:

do idx = LBOUND(zwx, 2), UBOUND(zwx, 2), 1
  do idx_1 = LBOUND(zwx, 1), UBOUND(zwx, 1), 1
    zwx(idx_1,idx,jpk) = 0.e0
  enddo
enddo

This looks to be ok w.r.t canonical loop form, but is currently not handled correctly. I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case. I could ignore BinaryOperations containing UBOUND/LBOUND explicitly, but I'm not sure if that is the correct approach.

Not sure if I understand the exact problem - in the example eg1 this loop does not even reach your new code, since the loop type is unknown, it is skipped because the user script only omp-parallelises levels loops:

       for loop in invoke.schedule.walk(Loop):
            if loop.loop_type == "levels":

I somehow feel that I missed something ;)

@hiker
Copy link
Collaborator

hiker commented Jun 24, 2023

This is something we've hit elsewhere. Call nodes now have the is_pure and is_elemental properties. @sergisiso has suggested we add another one, something like enquiry_fn that tells us that the routine doesn't actually access/care about the data in its argument. That could then naturally be handled in the dependence analysis rather than having to special case. EDIT: in fact, I think the dependence analysis has a flag that can control what it does about these statements. @hiker will know if he's awake at the moment.

See e.g. https://psyclone-ref.readthedocs.io/en/latest/_static/html/classpsyclone_1_1psyir_1_1tools_1_1dependency__tools_1_1DependencyTools.html#a619bd6530f6be092fc4c873b017e332c

Yes, anything that uses VariablesAccessInfo will by default ignore shape-related accesses, and so will not be reported (you have to explicitly specify a flag if you want these accesses reported).

@LonelyCat124
Copy link
Collaborator Author

LonelyCat124 commented Jun 28, 2023

I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case.
The loop is created from:

zwx(:,:,jpk) = 0.e0

and transformed by PSyclone to:

do idx = LBOUND(zwx, 2), UBOUND(zwx, 2), 1
  do idx_1 = LBOUND(zwx, 1), UBOUND(zwx, 1), 1
    zwx(idx_1,idx,jpk) = 0.e0
  enddo
enddo

This looks to be ok w.r.t canonical loop form, but is currently not handled correctly. I think I need to be smarter checking invariance somehow, since the LBOUND and UBOUND statements should not need to worry about the array they act upon being written to, and is a common use-case. I could ignore BinaryOperations containing UBOUND/LBOUND explicitly, but I'm not sure if that is the correct approach.

Not sure if I understand the exact problem - in the example eg1 this loop does not even reach your new code, since the loop type is unknown, it is skipped because the user script only omp-parallelises levels loops:

       for loop in invoke.schedule.walk(Loop):
            if loop.loop_type == "levels":

I somehow feel that I missed something ;)

So from examples/nemo/eg1 the test psyclone -l output --config /home/aidan/LFRIC/PSyclone/config/psyclone.cfg -api nemo -s ../scripts/omp_cpu_trans.py ../code/tra_adv.F90 is failing as it hits the assertion inside the validate function:
for ref in start_refs when
if (all_ref_accesses is not None and all_ref_accesses.is_written())

I have added some code to output the node using FortranWriter and my output is:

do idx = LBOUND(zwx, 2), UBOUND(zwx, 2), 1
  do idx_1 = LBOUND(zwx, 1), UBOUND(zwx, 1), 1
    zwx(idx_1,idx,jpk) = 0.e0
  enddo
enddo

I think without the canonical loop validation check this loop will parallelise with OpenMP - and I think it should. I don't believe there is anything in this script that would mean this isn't attempted to be parallelised? The rough structure of this PSyclone (ignoring a few safety steps) is:

for loop in schedule.walk(Loop):
    loop_directive_trans.apply(loop)
    region_directive_trans.apply(loop.parent.parent)

If you do make from the root examples directory it should hit this error on this branch.

The error is in my code, as I incorrectly detect the "read" of zwx to not be loop invariant, as zwx is written to in the loop.
How does the VariablesAccessInfo report info for something like UBOUND(zwx, 2)? What I could try to do is also compute the variables access info for the start/stop/step children independently as well, and only things that are read from in those and written to in the loop schedule would be non-loop invariant, but it depends how we treat UBOUND/LBOUND. Otherwise I'll have to do something a bit more messy.

Edit: I think your response suggests this should be ok so I'll try that.

@LonelyCat124
Copy link
Collaborator Author

The changes fix the known errors in the examples, so now the tests I had failing locally are also failing on github.
I'm unclear on the cause at all, they're all lfric tests which shouldn't have their behaviour changed as far as I'm aware?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants