Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utilities for running modifications inside nested IR nodes #4940

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

vlstill
Copy link
Contributor

@vlstill vlstill commented Oct 3, 2024

Normally when a programmer wants to modify a component of an IR node, unless the component is an inline member, they need to manually clone all the components in path to what they want to modify. For example, if one wants to replace an argument of a method call statement, one has to clone the arguments, then clone the argument in question, then assign expression in the argument, and then assign the cloned values back to the original once. This can get tedious very fast and it also hides the important part (modification) in a bunch of boilerplate (cloning, assignments).

Of course, this concerns modifications that are based on "location" (in C++ object), not on type (for that case we have visitors :-)).

So, I've created a templated function P4::IR::Traversal::modify that can automate such deeply-nested modifications. This function takes an object (usually an IR node, but theoretically it can be a different kind of object) and a bunch of selectors, which specify what path through the object to take and how to modify the finally selected sub-object. There are also helper classes in the P4::IR::Traversal namespace and there can be more in future.

Example is in the documentation:

p4c/ir/ir-traversal.h

Lines 154 to 171 in 2e4ce5e

/// For example, given an extern call, you can modify the type of the first argument by adding a
/// cast to it like this:
/// @code
/// modify(call, &IR::MethodCallStatement::methodCall, &IR::MethodCallExpression::arguments,
/// Index(0), &IR::Argument::expression, [&](const IR::Expression *expr)
/// {
/// return new IR::Cast(expr->srcInfo, idxType, expr);
/// });
/// @endcode
///
/// Similarly, you can set the type of the first argument in the type of the extern like this:
/// @code
/// modify(call, &IR::MethodCallStatement::methodCall, &IR::MethodCallExpression::method,
/// &IR::Expression::type, RTTI::to<IR::Type_Method>, &IR::Type_Method::parameters,
/// &IR::ParameterList::parameters, Index(0), &IR::Parameter::type, Assign(idxType));
/// @endcode
///
/// Any time a cast fails or an index is out of range the modification triggers a `BUG`.

Signed-off-by: Vladimír Štill <[email protected]>
Signed-off-by: Vladimír Štill <[email protected]>
Signed-off-by: Vladimír Štill <[email protected]>
Signed-off-by: Vladimír Štill <[email protected]>
Signed-off-by: Vladimír Štill <[email protected]>
@vlstill vlstill self-assigned this Oct 3, 2024
@vlstill vlstill added the core Topics concerning the core segments of the compiler (frontend, midend, parser) label Oct 3, 2024
@vlstill vlstill marked this pull request as ready for review October 3, 2024 13:53
Signed-off-by: Vladimír Štill <[email protected]>
} // namespace Detail

/// @brief Given an object @p obj and a series of selector (ending with a modifier), modify the @p
/// obj's sub object at the selecte path. %This is useful for deeper modification of objects that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// obj's sub object at the selecte path. %This is useful for deeper modification of objects that
/// obj's sub object at the selected path. This is useful for deeper modification of objects that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The % is there actually for a reason, although it is an annoying one. Otherwise Doxygen picks IR::This and adds links to it, which is quite annoying.

modify(stmt, RTTI::to<IR::AssignmentStatement>, &IR::AssignmentStatement::right,
RTTI::to<IR::Operation_Binary>, &IR::Operation_Binary::left,
RTTI::to<IR::PathExpression>, &IR::PathExpression::path, &IR::Path::name, &IR::ID::name,
IR::Traversal::Assign(P4::cstring("bar")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very ugly/hard to use, and seems like it could benefit from somthing like the IR::pattern stuff in ir/pattern.h. Not sure how that could work, but if you could write this more like

modify(stmt, Assignment(AnyExpression, placeholder + AnyExpression), something to set placeholder....

it would be much cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it is not nice, but it is nicer than the alternative:

  • you could write the clones and assignments manually, which is very ugly, much uglier than this, and it hard to see if it is correct;
  • you could visit the argument and then match on its context, which would be more expensive and if you need to modify multiple sub-expressions, you would end up with very spread code which would also run the checks on context many times.

Of course, I would like it very much if I could just pass the "field chain" (e.g. obj->left->type) to a function like that, but that is not possible. The member pointers are clunky.

As for pattern-like style -- the disadvantage of Pattern is that it is a separate partial mirror image of the IR. Also, I think that the pattern could get quite complicated if we tried to allow casts in them. If we wanted to have something like patterns, it would make sense to study what was proposed for the C++ pattern matching (because they tried to be quite general) and to try to create some way to generate the patters in IR generator. But that would be a major undertaking, while this one was quite quick to do (and unlike patter.h it is documented).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note, that the pattern syntax gets poorly readable quite fast -- already in your example, one must guess that the two parameters of Assignment are left and right and that type, srcInfo and annotations are not shown in the pattern -- but what if there is need to modify these? This quill get quite tricky once more nested patterns come into play (also because one has to somehow represent them in C++ which is not meant for this).

vlstill and others added 2 commits October 4, 2024 10:25
Signed-off-by: Vladimír Štill <[email protected]>
Signed-off-by: Vladimir Still <[email protected]>
@vlstill vlstill requested review from asl and ChrisDodd and removed request for asl October 7, 2024 10:57
@vlstill
Copy link
Contributor Author

vlstill commented Oct 8, 2024

@fruffy, @asl, @ChrisDodd, is there interest for this in P4C? The PR seems a little stuck now...

If there is an existing good way to do this, let me know, but I don't know about any.

@fruffy
Copy link
Collaborator

fruffy commented Oct 8, 2024

@fruffy, @asl, @ChrisDodd, is there interest for this in P4C? The PR seems a little stuck now...

If this works like I think it works it could be useful for modifications of tables (for example keys or properties), which are often deeply nested.

Maybe we can replace some existing code with this helper to show its utility?

@asl
Copy link
Contributor

asl commented Oct 8, 2024

@fruffy, @asl, @ChrisDodd, is there interest for this in P4C? The PR seems a little stuck now...

If there is an existing good way to do this, let me know, but I don't know about any.

Sorry, was a bit distracted. I will take a look soon

@vlstill
Copy link
Contributor Author

vlstill commented Oct 8, 2024

Maybe we can replace some existing code with this helper to show its utility?

Interestingly, it seems the pattern mainly shows in P4Testgen, in the various steppers that replace parts of expressions. I've replaced a few places where the nesting was deeper than 1 so you can try to have a look.

@vlstill
Copy link
Contributor Author

vlstill commented Oct 8, 2024

If I have some free cycles, we could later get rid of the member pointers by generating "universal accessors", something like e.g. Traversal::expression which would be able to get expression from any IR node that contains it (even from unrelated nodes). That would allow a bit nicer formulation like T::apply(subexpr, T::components, T::Index(nonValueComponentIdx), T::expression, T::Assign(v->param)) (where `namespace T = IR::Traversal) instead of

auto *result = IR::Traversal::apply(subexpr, &IR::StructExpression::components,
IR::Traversal::Index(nonValueComponentIdx),
&IR::NamedExpression::expression,
IR::Traversal::Assign(v->param));

... but that requires generating a new header from ir-generator so I would need to take a deep look at that. Also, it would work much better with concepts (so C++20), although I'm not sure I'd squeeze into the intersection of GCC 9 and standardized C++20.

(*arguments)[0] = arg;
clonedCall->arguments = arguments;
const auto *clonedCall =
replaceCallArg(&externInfo.originalCall, 0, v->param);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely an improvement and helps avoiding accidental bugs by getting an index or variable wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Topics concerning the core segments of the compiler (frontend, midend, parser)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants