-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add library module for converting between erlfmt and syntax tools #237
base: main
Are you sure you want to change the base?
Conversation
5930f24
to
db87bdf
Compare
This certainly looks useful for writing refactoring tools in erlang itself. We are having a hard time deciding if this belongs in erlfmt or a separate library: I only skimmed the code, but I saw a lot of TODOs are they intended to be fixed before merge or ...? |
No, the TODOs are meant for doing later, they're not blocking anything. But this code needs to find a home first, and I'm intending to do the work on it. Also hoping to work on reducing the internal differences between erlfmt and syntax tools so the conversion becomes less complicated in places. |
Do you also intend to add a barrage of tests, something in the order of https://github.com/WhatsApp/erlfmt/blob/master/test/erlfmt_format_SUITE.erl ? Knowing this would help us to make a decision. |
That would probably be relatively easy to do, just adapting the existing tests and checking that the round-tripped code looks the same, so yes. |
That would be one way to get quite a large of coverage, but feel free to take some liberty on what you think is sensible. Do you also maybe have thoughts on why you think this is the right home for this code? |
I think this is the right home, because the erlfmt format doesn't officially exist outside this app. By having the conversion functions here, nobody should need to know about it or depend on it. Also, it becomes easy to make synchronized changes in erlfmt internals and the conversion functions with a single commit. |
We have talked and you make a compelling argument :D |
db87bdf
to
0bdb897
Compare
0bdb897
to
7158644
Compare
I think this is ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall have to say it looks amazing. Really great work <3
The comments are mostly nitpicks and I am wondering what the strategy is for the TODOs.
Some of them seem related to syntaxtools and others to erlfmt.
Should we open issues to tackle them or what do you think?
GroupProps = ?config(tc_group_properties, Config), | ||
case proplists:get_bool(syntax_tools, GroupProps) of | ||
true -> | ||
put('$syntax_tools$', true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow I didn't know about this magic.
I know this is a minimal diff, but thinking of the next person reading this, maybe it would be better to use a parameter, since each test already has access to Config. Something like:
is_syntax_tools(Config) ->
proplists:get_bool(syntax_tools, ?config(tc_group_properties, Config)).
some_test(Config) ->
RoundTripWithSyntaxTools = is_syntax_tools(Config)
...
parse_form(Form, RoundTripWithSyntaxTools)
But I think this was a great way to get test coverage and it is really impressive that everything is passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to pass things as parameters whenever I can, but all the test code would have to get the extra Options parameter passed all the way down to the parse_form/parse_forms functions, and that seemed too intrusive. I also thought about using meck to modify the behaviour of the test code, but that's also a bit yucky and causes a dep on meck. What do you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer a parameter, since I don't think it needs to pass too far down.
But let's also hear what @michalmuskala has to say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. It's not pretty, but it's practical. I actually wonder if we should try doing something in erlfmt_format_SUITE
as well to leverage all the examples we have in there, though I'm not 100% sure how that would look like.
end. | ||
|
||
st_to_erlfmt_1(Node) -> | ||
%% TODO: should we convert full erl_syntax pos+annotation to erlfmt anno? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you are doing Anno conversions here, so maybe this TODO is already done or I don't understand it, sorry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
erl_syntax nodes have a separate annotation field (because the old annotation field used to be only for line numbers), and that could contain arbitrary "user data", so the question is whether to pick up any such annotations and inject them into the erlfmt annotation when going from st to erlfmt, or just abandon them (as done now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, well I can say that only the relative line numbers are taken into account by the formatting algorithm.
So then the question would be, what would be the use case to preserve them and is that worth the effort.
I'm currently pondering what changes I can make in syntax tools that will preserve compatibility with existing code that's using it, but being better at representing the same things under the hood that erlfmt does. Some of those TODOs require that sort of change first. Some other things may be possible to just change on the erlfmt side (after consulting you) and for those I could open tickets if you like. |
Is it possible to label the TODOs in the code as syntax_tools TODOs and actual TODOs for this code, to try and make it clear what is do-able by someone only looking at this code? It would at least help me to discern between them. |
-export([erlfmt_to_st/1, st_to_erlfmt/1]). | ||
|
||
% dialyzer hates erlfmt_parse:abstract_node() | ||
-type erlfmt() :: term(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably do a bit better with tuple
, but I'm not sure it matters that much
%% `macro_call` nodes. Additionally it is less strict - it does not | ||
%% enforce all clauses have the same name and arity. | ||
{function, Pos, Clauses} -> | ||
case get_function_name(Clauses) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if there's mixed elements? Something like:
function(1) -> ok;
?COMMON_HANDLER(function).
erl_syntax:named_fun_expr( | ||
erlfmt_to_st(Name), | ||
Clauses1 | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the clauses weren't consistently named? Erlfmt parses this fine:
fun
?FOO(1) -> ok;
Foo(2) -> error;
end
%% * `{type, Anno, Args, Res}` for the anonymous function type | ||
%% `fun((...Args) -> Res)` where `Args` is a `args` node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the fun types handled anywhere?
{guard_or, Pos, Exprs} -> | ||
AAnno = dummy_anno(), | ||
erlfmt_to_st_1({tuple, Pos, [{atom, AAnno, '*guard_or*'} | Exprs]}); | ||
{guard_and, Pos, Exprs} -> | ||
AAnno = dummy_anno(), | ||
erlfmt_to_st_1({tuple, Pos, [{atom, AAnno, '*guard_and*'} | Exprs]}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't those be represented as erl_syntax:conjuction/disjunction
? http://erlang.org/doc/man/erl_syntax.html#conjunction-1
%% note that erlfmt only accepts raw_string as a form | ||
"\n<<<<\n" ++ RText = lists:reverse(Text0), | ||
case lists:reverse(RText) of | ||
"[[reparse]]" ++ Rest -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the nodes tagged with [[reparse]]
are not wrapped in >>>>
markers
[[Name], Args] = st_subtrees_to_erlfmt(Node), | ||
case Name of | ||
{remote, NPos, M, N} -> | ||
exit(remote_type), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the exits here for type conversions?
{atom, _, ok} | ||
]} | ||
]}, | ||
[]}, | ||
parse_expr("try ok of _ -> ok catch _ -> ok; _:_ -> ok; _:_:_ -> ok end") | ||
%% Note: formatting may drop the Trace part if it's just an underscore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean in erlfmt or during the rountrip through erl_syntax
?
GroupProps = ?config(tc_group_properties, Config), | ||
case proplists:get_bool(syntax_tools, GroupProps) of | ||
true -> | ||
put('$syntax_tools$', true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine. It's not pretty, but it's practical. I actually wonder if we should try doing something in erlfmt_format_SUITE
as well to leverage all the examples we have in there, though I'm not 100% sure how that would look like.
Use erlfmt instead of els_dodger for parsing. Uses modified version of erlfmt_ast from PR WhatsApp/erlfmt#237 to convert from erlfmt parse tree to erl_syntax:syntaxTree. Expected benefits are more precise locations and parsing exotic macros.
Use erlfmt instead of els_dodger for parsing. Uses modified version of erlfmt_ast from PR WhatsApp/erlfmt#237 to convert from erlfmt parse tree to erl_syntax:syntaxTree. Expected benefits are more precise locations and parsing exotic macros.
Allows roundtrip conversion from erlfmt:read_nodes() to syntax tools (or erl_parse) and back to erlfmt:format_nodes().