Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler-versions script: Compute supported compiler versions for a single package #632

Merged
merged 11 commits into from
Jul 21, 2023

Conversation

colinwahl
Copy link
Collaborator

@colinwahl colinwahl commented Jul 20, 2023

This PR is a first step towards #255. It adds a new script, compiler-versions, which will be used to compute the supported compiler versions for packages.

Currently the script can only work on a single package@version that has no dependencies. No attempt is made to modify the Metadata type.

In follow up PRs, I will start to introduce more sophisticated routines to get the supported compiler ranges for all existing packages, and eventually get to modifying the Manifest type.

Note: This currently computes a Range of supported compiler versions. In particular, since we use simple ranges, this means that once the package@version previously compiled for a lower compiler version and stops compiling for a later compiler, we stop checking to see if it becomes supported by a different compiler in the future. In general, I think this is fine, but wanted to call it out explicitly.

I've run this on various versions of prelude - here's an example of the final output:

Found supported compiler versions for prelude@6.0.0: >=0.15.0 <0.15.11

scripts/default.nix Outdated Show resolved Hide resolved
Copy link
Member

@thomashoneyman thomashoneyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits, otherwise looks great! Feel free to merge once addressed.

@f-f
Copy link
Member

f-f commented Jul 21, 2023

Note: This currently computes a Range of supported compiler versions. In particular, since we use simple ranges, this means that once the package@version previously compiled for a lower compiler version and stops compiling for a later compiler, we stop checking to see if it becomes supported by a different compiler in the future. In general, I think this is fine, but wanted to call it out explicitly.

Is this the behaviour we wish for? If a generally useful package works at some point, then it's broken by some compiler regression that is then fixed, do we not want to consider it compatible with the new series of compilers?

I would say we should do it the other way around (start checking from the most recent compiler versions rather than the older ones) if we want to keep it a single range, or just use a list of Ranges so we can cover everything.

@thomashoneyman
Copy link
Member

if we want to keep it a single range, or just use a list of ranges so we can cover everything

This is the big design question for the feature. Do we use single versions, like [ "0.13.0", "0.13.1" ], where every supported compiler is listed? Do we use ranges instead, like ">=0.13.0 <0.13.8"? Do we follow your suggestion and use a list of ranges, like [ ">=0.13.0 <0.13.8", ">=0.13.10 <0.13.12" ]?

The most correct thing to do is to list out every supported version, either via single versions or via ranges. The benefit is that we have accurately captured every supported compiler for every package. But there are downsides:

  1. This isn't compatible with using the solver to figure out what versions of dependencies to download based on what compilers they support. Right now we insert the compiler as a fake dependency into every package version based on what it's known to compile with (for example, prelude@6.0.0 gets a new dependency "purs": ">=0.15.0 <0.15.9"). Then, when solving a new package version, we can find all of its dependencies that are also valid with the indicated compiler. We need to do this to ensure we download dependencies that are compatible with the given compiler too, or else the solver will just download the latest versions regardless of compatibility and then we might get a false compilation failure.

  2. We'd no longer be able to set a cutoff for packages (such as "known not to compile from 0.14.0 onward), so when we release a new compiler we would have to compile every package version in the registry again (~13,000 versions). Realistically, most packages with no dependencies would fail right away (like prelude). Packages with dependencies only get tried if their dependencies indicate they support the compiler in question, so they'd be filtered out by the solver. But we're still looking at and attempting to solve 13,000 package versions, the vast majority of which is redundant work since we know that if a package version stopped compiling at 0.14.0 it's almost certainly not going to start working again at 0.15.9.

  3. (Weak) If we have a list of ranges or versions, then making the metadata files human readable via pretty-printing will cause these arrays to break over multiple lines, adding multiple lines to every version in a metadata file and ballooning the repo size. We might wish then to make metadata files not formatted, and just stringified instead, like how the manifest index is. Or switch to using dodo-printer so we can control this better.

In contrast, using a simple range is less correct — as you said, a compiler regression can cause a package version to falsely record that it doesn't work with some compilers that it does work with. But it is simple, and it allows us to be correct when compiling with the package version's dependencies, and in the vast majority of cases it will be correct too: when a package version stops compiling it almost certainly won't begin compiling again with new releases.

I am happy to use a list of ranges if we can figure out the solver issue. Also, we can migrate this discussion over to #255, as it affects more than this PR.

@thomashoneyman
Copy link
Member

thomashoneyman commented Jul 21, 2023

Oh — and I agree we should prioritize either a) producing the newest range of supported compilers or b) producing the largest range of supported compilers. For example, if you work with >=0.14.0 <0.15.0 and also >=0.15.4 <0.15.5 I feel like we should select the former range as your more "natural" range.

Maybe we could tweak things when we do the bulk version of this script so that we can see, in the registry today, how many packages would end up with multiple ranges if we chose to go that way. If there are quite a few then maybe we prioritize supporting a list of ranges. If there are very few, then maybe we just use a simple range.

@f-f
Copy link
Member

f-f commented Jul 21, 2023

Good analysis - note that I'm not advocating for a list of ranges, I'm just saying that in all the options that we have available to implement this feature (oldest working range, newest working range, list of ranges), then the current one (oldest working range) is the least correct, so we should really do something else.

While #255 is the general ticket for tracking this change, I do think that this is the right place for this discussion, as this simple assumption is the basis for further work.

@colinwahl
Copy link
Collaborator Author

Thank for the discussion - yeah, after reading through it, oldest working range is probably not what we want.

Ideally down the line we'd be able to compute whether or not each compiler version works for each package version - maybe for now I can change this script to just determine the full list of supported compiler versions for a single package, and as we look further into implementing the bulk check we can see if that's viable or if we want to go for a different approach with ranges.

When I first was implementing this script I was thinking it'd be nice to cutoff at some point - but the side effect is the range isn't really "correct". I suppose I was optimizing too early - at least for this script, computing the whole list is fine.

@thomashoneyman
Copy link
Member

I think that's the best approach for now, and we can always switch to list-of-ranges or simple range later if need be. At least this gives us the exact versions that are usable.

@colinwahl
Copy link
Collaborator Author

The latest commit changes the script to compute all supported compiler ranges for a single package version. Here's an example of the output:

Found supported compiler versions for prelude@6.0.0: 0.15.0, 0.15.2, 0.15.3, 0.15.4, 0.15.5, 0.15.6, 0.15.7, 0.15.8, 0.15.9, 0.15.10
Found supported compiler versions for prelude@3.0.0: 0.13.0, 0.13.2, 0.13.3, 0.13.4, 0.13.5, 0.13.6, 0.13.8, 0.14.0, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.14.5, 0.14.6, 0.14.7, 0.14.8, 0.14.9

@thomashoneyman thomashoneyman merged commit 8839369 into master Jul 21, 2023
15 checks passed
@thomashoneyman thomashoneyman deleted the colin/compiler-versions-in-metadata branch July 21, 2023 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants