Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(v1/nodejs): add modules nodejs-node-modules and nodejs-devshell #540

Merged
merged 3 commits into from
Jul 4, 2023

Conversation

DavHau
Copy link
Member

@DavHau DavHau commented Jun 29, 2023

This adds two new modules for nodejs:

  • nodejs-node-modules (has 2x node in name. is just nodejs-modules better?)
  • nodejs-devshell

The shellHook logic for the dev-shell was largely copied from the current strict-nodejs builder.
It currently uses a derivation to create a full copy of the nodejs dependency tree, similar to how npm would do it.
The shellHook then rsyncs that store-path (if changed) to ./node_modules and creates ./node_modules/.bin.

This is not necessarily the final solution, but it delivers a writable ./node_modules allowing the user to execute npm for simple operations like npm install <package name> for example.

We have been discussing a read-only, symlinked node_modules which would be even quicker to install and potentially cleaner in some scenarios, but this would require some more engineering in order to retain a working npm command or to provide an alternative command. We can still improve the mechanism in the future.

Also an example is added for testing the new modules on prettier.

@DavHau DavHau requested review from phaer and hsjobeki June 29, 2023 14:35
Copy link
Contributor

@hsjobeki hsjobeki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think nodejs-node-modules is fine.
nodejs is the ecosystem
node-modules is the sub-part of that ecosystem.
Otherwise you could name it deps ^^ that term is generic and often used in other package frameworks. I let it up to you to decide.

Edit: Maybe nodejs-deps is better for the reasons: shorter terms express more generic modules, longer names are more specific.

then
${config.deps.rsync}/bin/rsync -a --chmod=ug+w --delete ${nodeModulesDir}/ ./node_modules/
mkdir -p ./node_modules/.bin
for executablePath in ${binDir}/*; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels overengineered. Why isn't the .bin folder already rsynced?
If the node_modules (id) is changed you also re-create every .bin/* entry. Have you checked if ln succeeds when a link with the same name is already present? (I think it fails and then you have to do many conditional stuff here)

In the strict-builder I didn't do this kind of magic because the bins are part of the node_modules subfolder

https://github.com/nix-community/dream2nix/blob/main/src/subsystems/nodejs/builders/strict-builder/lib/devShell.nix

I am not up-to what the granular builder is doing currently but maybe it should already include .bin entries such that you can avoid this loop here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and don't understand why it's necessary here, if it is?

checked if ln succeeds when a link with the same name is already present? (I think it fails and then you have to do many conditional stuff here)

That one alone could be solved with ln -sf :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Providing some background, the granular builder currently installs the top-level dependency inside a top-level node_modules. In a way you could say that the builder treats the package as a dependency of itself. Therefore it creates the .bin entry one layer up (to high for our use case here).

If this was a smart choice or not can be debated, but changing this requires larger refactoring.

Also I think that it's hard to find a correct generic way of building the .bin directory, because it's content depends on multiple variables like: where the package is located in the tree, how the tree is built. This information is not yet known when building an individual dependency, therefore harder to pre-build/cache it.

Therefore, depending on which settings we choose for the builder, like installMethod fore example, the logic for generating .bin has to change.

Via 67cf11a I now improved the nodejs-node-modules module to already build the .bin in the way we want to have it for a dev-shell, and therefore removed this logic from the devshell module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... this allows relocating the node_modules via a single rsync

@hsjobeki
Copy link
Contributor

hsjobeki commented Jun 29, 2023

With the module's approach, it feels very clean so far. 😀

dontInstall = true;
dontFixup = true;
preBuildPhases = l.mkForce [];
preInstallPhases = l.mkForce [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused: We import the more complex modules, nodejs-package-lock and nodejs-granular here, just to disable most of their functionality?

Wouldn't that be cleaner the other way round? i.e. basing the imported modules on this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally yes, refactoring the design like that sounds like it would make sense. But currently I'm unsure how much work that implies.

The granular builder does involve more than just defining phases, like, for example computing the datastructure for dependencies correctly. We'd have to factor out that logic first.

If we want to proceed bulding ontop of the granular builder logic, we should probably split up the builder module into several smaller ones. For example we would probably need a nodejs-dependencies module to share dependency logic between different other modules.

};

nodejs-granular = {
installMethod = "copy";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ftr: I still don't believe that's a good default. It's a necessary workaround for some popular broken packages like webpack, so we need the switch. But default still seems wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends I'd say. Do you want the default setting to just work for all packages. Then copy is your choice.
I'd estimate (without knowing) that 50 % of node packages would fail building with symlink strategy.

Maye we should just ty and then decide based on the amount of failing packages?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where that 50% number comes from? I think it heavily depends on the set of packages you use. Symlinks aren't only an essential component of Posix but also of nix specifically as you all know.

So i think it's fair to consider software which doesn't handle them as broken and just document a setting to override where needed.

Forget wasted disk IO and space, th idea of keeping node_modules writeable to appease broken node_modules means that you can't be sure the node_modules you use is the one you've installed from your lockfile; breaking one of the core-assumptions of dream2nix.

Copy link
Contributor

@hsjobeki hsjobeki Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay from dream2nix perspective I agree.
That implies using npm install --package-lock-only with that flag always. Because npm cannot add new dependencies to the node_modules folder it must write to the lockfile first, which then updates the devShell. (sound reasonable)

We should clearly point it out in the docs then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in the long term, as @DavHau already mentioned we should provide a d2n command line utility that provide a shortcut for such general tasks. (and also don't need to remember)

Copy link
Member

@phaer phaer Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a better cli entrypoint than run scripts like "update-locks" would be a good QoL improvement in the medium term.

Regarding the npm flag: I think that would be ideal if possible and i think floco implies that it's possible? Might be missing something, you are definitely more experienced here!
Otherwise: would it be possible to keep ./node_modules itself as directory, but symlink everything by default (and copy specific packages via override) inside it? Or would it make things worse?

Might need more discussion. In the context of this PR is still don't think we should just set it to "copy" and forget about it. Lets err on the side of loudly breaking builds instead of slower builds which might silently diverge.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with symlinks is that they introduce a duality in interpretation. Software can either ignore the fact that a symlink is a symlink and interpret it as file or directory, or alternatively interpret the symlink as such, resolve it, and look at the target, ignoring the fact where it was linked from.

So i think it's fair to consider software which doesn't handle them as broken

The way I see it, a software cannot not handle symlinks, it has to pick one of the interpretations. Not sure if one of them is wrong. It might depend on the use-case.

Both ways of interpreting symlinks result in different perspectives, and both are supported by linux, so I'm not sure if we can declare software as broken just because they picked one of the two interpretations. They might have had their reasons which we don't understand.

Forget wasted disk IO and space, th idea of keeping node_modules writeable to appease broken node_modules means that you can't be sure the node_modules you use is the one you've installed from your lockfile

Two different problems are getting mixed up here:

    1. Resolution errors in nodejs build tools. -> installMethod=copy is the solution (if writable or not doesn't matter)
    1. npm in a dev-shell wants a writable node_modules for some operations.

It's not that we keep node_modules writable because they are broken. They are not. It's just to allow npm commands to work.
We can get rid of a writable node_modules by ensuring that we implement an alternative for the npm commands that would break or decide to not support using npm partially or fully.
Maybe this is easy in the end but requires some studying to be sure that we don't break stuff, so I'd prefer to do this in as a separate project.


In a nutshell, within the node ecosystem most tools have chosen to interpret symlinks not in favor of how we'd like to install node_modules with dream2nix.

I propose we call it not optimal instead of broken.

One of the core ideas of dream2nix is to reduce user intervention to a minimum. I think the best way to achieve this is by prioritizing ecosystem compatibility over performance.

Therefore our defaults should be chosen to maximize compatibility, not performance. Performance is important, but not as much as compatibility.

If someone wants to pick a more performant strategy like installMethod=symlink with the cost of increasing the potential for build failures, then they can do this, but it doesn't seem to be a good default considering our goals.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically installMethod=symlink has been the default for dream2nix. My idea was that we'd patch all problematic tools to deal with it. It resulted in too much overhead for my taste. We'd have to patch several different versions of each problematic tooling etc. I think we could go back to this approach in the future, but right now I don't have the capacity for maintaining all the patches.

Copy link
Member

@phaer phaer Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with symlinks is that they introduce a duality in interpretation. Software can either ignore the fact that a symlink is a symlink and interpret it as file or directory, or alternatively interpret the symlink as such, resolve it, and look at the target, ignoring the fact where it was linked from.

Not sure I understand this correctly, because i think if you "ignore the fact that a symlink is a symlink", then it's always a file - never a directory? In my understanding your latter alternative is favorable in pretty much all cases (exceptions include software which is explicitly designed to manage symlinks). I'd assume the main problem with that in the npm ecosystem would be that ".." wouldn't refer to node_modules anymore after a link is resolved?

Anyway, I believe the thing described above should eventually be possible in dream2nix as floco seems to achieve it already.
But i don't mean to block anything here, feel free to merge ofc. (would have requested changes via Github review in other cases)

Therefore our defaults should be chosen to maximize compatibility, not performance. Performance is important, but not as much as compatibility.

I'd argue, correctness, opt-in compatibality, performance ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh damn symlinks. Even talking about them is confusing. what I meant with "ignore the fact that a symlink is a symlink" is jut looking at it it as it wasn't a symlink and directly reading the content of the file or the directory.

I'd assume the main problem with that in the npm ecosystem would be that ".." wouldn't refer to node_modules anymore after a link is resolved?

Exactly. This is what most build tools do by default and therefore fail.

I'd argue, correctness, opt-in compatibality, performance ;)

Correctness is a difficult term I think.

Compared to what most nodejs tools expect from the node_modules tree, building it using symlinks, is incorrect. Compared to what npm does, it's also incorrect. It depends on the perspective.

In another sense, everything that is not natively defined in nix is not correct as it doesn't specify what exactly is needed to build it.
With dream2nix we build a bridge from the incorrect world to the correct world, so in a sense we directly support incorrectness.

then
${config.deps.rsync}/bin/rsync -a --chmod=ug+w --delete ${nodeModulesDir}/ ./node_modules/
mkdir -p ./node_modules/.bin
for executablePath in ${binDir}/*; do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and don't understand why it's necessary here, if it is?

checked if ln succeeds when a link with the same name is already present? (I think it fails and then you have to do many conditional stuff here)

That one alone could be solved with ln -sf :)

…ode_modules

This change allows to re-locate the node_modules directory via a single copy operation.

The current builder installs the .bin directory one level above the top-level package which isn't what we want for the node-modules output.

This change ensures that the .bin also exists within the local node_modules directory of the current package.

It also ensures that all symlinks are relative and don't point to store paths
@DavHau DavHau merged commit ff71cd8 into main Jul 4, 2023
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants