Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add library module for converting between erlfmt and syntax tools #237

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

richcarl
Copy link
Contributor

Allows roundtrip conversion from erlfmt:read_nodes() to syntax tools (or erl_parse) and back to erlfmt:format_nodes().

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 14, 2021
@richcarl richcarl force-pushed the syntaxtools-compat branch 2 times, most recently from 5930f24 to db87bdf Compare January 15, 2021 08:00
@awalterschulze
Copy link
Contributor

This certainly looks useful for writing refactoring tools in erlang itself.

We are having a hard time deciding if this belongs in erlfmt or a separate library:
If we merge it in erlfmt, we would need to maintain it and if we don't merge it, then we would need to document our abstract_forms types better. To merge in erlfmt in would need a barrage of tests, that would make us confident that we can maintain it. Are you willing to do this work?

I only skimmed the code, but I saw a lot of TODOs are they intended to be fixed before merge or ...?

@richcarl
Copy link
Contributor Author

No, the TODOs are meant for doing later, they're not blocking anything. But this code needs to find a home first, and I'm intending to do the work on it. Also hoping to work on reducing the internal differences between erlfmt and syntax tools so the conversion becomes less complicated in places.

@awalterschulze
Copy link
Contributor

Do you also intend to add a barrage of tests, something in the order of https://github.com/WhatsApp/erlfmt/blob/master/test/erlfmt_format_SUITE.erl ?

Knowing this would help us to make a decision.

@richcarl
Copy link
Contributor Author

That would probably be relatively easy to do, just adapting the existing tests and checking that the round-tripped code looks the same, so yes.

@awalterschulze
Copy link
Contributor

That would be one way to get quite a large of coverage, but feel free to take some liberty on what you think is sensible.

Do you also maybe have thoughts on why you think this is the right home for this code?

@richcarl
Copy link
Contributor Author

I think this is the right home, because the erlfmt format doesn't officially exist outside this app. By having the conversion functions here, nobody should need to know about it or depend on it. Also, it becomes easy to make synchronized changes in erlfmt internals and the conversion functions with a single commit.

@awalterschulze
Copy link
Contributor

We have talked and you make a compelling argument :D
Lets consider this the future home of this code, but lets also get some tests before merging.

@richcarl
Copy link
Contributor Author

I think this is ready now.

Copy link
Contributor

@awalterschulze awalterschulze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall have to say it looks amazing. Really great work <3
The comments are mostly nitpicks and I am wondering what the strategy is for the TODOs.
Some of them seem related to syntaxtools and others to erlfmt.
Should we open issues to tackle them or what do you think?

GroupProps = ?config(tc_group_properties, Config),
case proplists:get_bool(syntax_tools, GroupProps) of
true ->
put('$syntax_tools$', true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow I didn't know about this magic.
I know this is a minimal diff, but thinking of the next person reading this, maybe it would be better to use a parameter, since each test already has access to Config. Something like:

is_syntax_tools(Config) ->
    proplists:get_bool(syntax_tools, ?config(tc_group_properties, Config)).
    
 some_test(Config) ->
   RoundTripWithSyntaxTools = is_syntax_tools(Config)
   ...
   parse_form(Form, RoundTripWithSyntaxTools)

But I think this was a great way to get test coverage and it is really impressive that everything is passing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to pass things as parameters whenever I can, but all the test code would have to get the extra Options parameter passed all the way down to the parse_form/parse_forms functions, and that seemed too intrusive. I also thought about using meck to modify the behaviour of the test code, but that's also a bit yucky and causes a dep on meck. What do you prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer a parameter, since I don't think it needs to pass too far down.
But let's also hear what @michalmuskala has to say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. It's not pretty, but it's practical. I actually wonder if we should try doing something in erlfmt_format_SUITE as well to leverage all the examples we have in there, though I'm not 100% sure how that would look like.

src/erlfmt_ast.erl Outdated Show resolved Hide resolved
end.

st_to_erlfmt_1(Node) ->
%% TODO: should we convert full erl_syntax pos+annotation to erlfmt anno?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you are doing Anno conversions here, so maybe this TODO is already done or I don't understand it, sorry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erl_syntax nodes have a separate annotation field (because the old annotation field used to be only for line numbers), and that could contain arbitrary "user data", so the question is whether to pick up any such annotations and inject them into the erlfmt annotation when going from st to erlfmt, or just abandon them (as done now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, well I can say that only the relative line numbers are taken into account by the formatting algorithm.
So then the question would be, what would be the use case to preserve them and is that worth the effort.

src/erlfmt_ast.erl Outdated Show resolved Hide resolved
src/erlfmt_ast.erl Show resolved Hide resolved
@richcarl
Copy link
Contributor Author

I'm currently pondering what changes I can make in syntax tools that will preserve compatibility with existing code that's using it, but being better at representing the same things under the hood that erlfmt does. Some of those TODOs require that sort of change first. Some other things may be possible to just change on the erlfmt side (after consulting you) and for those I could open tickets if you like.

@awalterschulze
Copy link
Contributor

Is it possible to label the TODOs in the code as syntax_tools TODOs and actual TODOs for this code, to try and make it clear what is do-able by someone only looking at this code? It would at least help me to discern between them.

-export([erlfmt_to_st/1, st_to_erlfmt/1]).

% dialyzer hates erlfmt_parse:abstract_node()
-type erlfmt() :: term().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably do a bit better with tuple, but I'm not sure it matters that much

%% `macro_call` nodes. Additionally it is less strict - it does not
%% enforce all clauses have the same name and arity.
{function, Pos, Clauses} ->
case get_function_name(Clauses) of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there's mixed elements? Something like:

function(1) -> ok;
?COMMON_HANDLER(function).

Comment on lines +291 to +294
erl_syntax:named_fun_expr(
erlfmt_to_st(Name),
Clauses1
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the clauses weren't consistently named? Erlfmt parses this fine:

fun 
    ?FOO(1) -> ok;
    Foo(2) -> error;
end

Comment on lines +279 to +280
%% * `{type, Anno, Args, Res}` for the anonymous function type
%% `fun((...Args) -> Res)` where `Args` is a `args` node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the fun types handled anywhere?

Comment on lines +378 to +383
{guard_or, Pos, Exprs} ->
AAnno = dummy_anno(),
erlfmt_to_st_1({tuple, Pos, [{atom, AAnno, '*guard_or*'} | Exprs]});
{guard_and, Pos, Exprs} ->
AAnno = dummy_anno(),
erlfmt_to_st_1({tuple, Pos, [{atom, AAnno, '*guard_and*'} | Exprs]});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't those be represented as erl_syntax:conjuction/disjunction? http://erlang.org/doc/man/erl_syntax.html#conjunction-1

%% note that erlfmt only accepts raw_string as a form
"\n<<<<\n" ++ RText = lists:reverse(Text0),
case lists:reverse(RText) of
"[[reparse]]" ++ Rest ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the nodes tagged with [[reparse]] are not wrapped in >>>> markers

[[Name], Args] = st_subtrees_to_erlfmt(Node),
case Name of
{remote, NPos, M, N} ->
exit(remote_type),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the exits here for type conversions?

{atom, _, ok}
]}
]},
[]},
parse_expr("try ok of _ -> ok catch _ -> ok; _:_ -> ok; _:_:_ -> ok end")
%% Note: formatting may drop the Trace part if it's just an underscore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean in erlfmt or during the rountrip through erl_syntax?

GroupProps = ?config(tc_group_properties, Config),
case proplists:get_bool(syntax_tools, GroupProps) of
true ->
put('$syntax_tools$', true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. It's not pretty, but it's practical. I actually wonder if we should try doing something in erlfmt_format_SUITE as well to leverage all the examples we have in there, though I'm not 100% sure how that would look like.

gomoripeti added a commit to gomoripeti/erlang_ls that referenced this pull request Apr 17, 2021
Use erlfmt instead of els_dodger for parsing. Uses modified version of
erlfmt_ast from PR WhatsApp/erlfmt#237 to convert from erlfmt parse tree to
erl_syntax:syntaxTree. Expected benefits are more precise locations and parsing
exotic macros.
gomoripeti added a commit to gomoripeti/erlang_ls that referenced this pull request Apr 22, 2021
Use erlfmt instead of els_dodger for parsing. Uses modified version of
erlfmt_ast from PR WhatsApp/erlfmt#237 to convert from erlfmt parse tree to
erl_syntax:syntaxTree. Expected benefits are more precise locations and parsing
exotic macros.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants