-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: validate: don't fail-fast #10291
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's turn it on and see how it goes.
I do have a bit of a worry about this: while odd GitHub Actions failures are common, so are (especially, early in development) actual problems where, if one job fails, the others will as well (possibly after another 45 minutes, as with the Mac jobs). |
I think this PR implements
rather than
Perhaps the commit can be updated with the description about what the commit is intended to achieve? |
fe11422
to
44950ea
Compare
Indeed, thank you for catching it. I updated the commit message to say:
Now, the sequential failures need another approach. Quick googling revealed that we'll have to add several if: success() || failure() to every step that we want to run irrespective of the success of the previous steps. @geekosaur I don't understand your concern, could you, perhaps, elaborate? You can always cancel the job manually if you feel like it. Or push an update, and GitHub restarts the whole thing for you. I fail to see how that 45-minutes MacOS job hurts anything. It may be pointless, I agree, but if it doesn't hurt, we have a net gain, because of the other type of failures (spurious Windows failures, as you say). |
I think the point is that there will be a lot of redundant work performed on each MR if there is a failure on one of the jobs. I don't know if this will cause any issues or not, I don't know how much compute time each project is allowed. Overall the CI workload for cabal seems quite high to me (compared to GHC). On GHC there are 5 validation jobs (5 different platforms) which run on each MR, then there is a more stringent pipeline which runs on each batch of MRs before they are merged together. I imagine this might be quite difficult to engineer with github. @ulysses4ever Perhaps you can combine all the different steps together and just have one "test" step, which works by one call to the validation script? |
I don't think it's a good idea. There's a reason they were split. Maybe several ones. For instance, some steps have to be blocked on specific platforms / compiler versions occasionally: this happened several times during my involvement with this (2-3 years) with MacOS and Windows. Another reason is pure UI. I think it's much easier to see where the problem lies when you have some level of granularity in your test process. We have a bunch of ugly bash to print headers and stuff but I like GitHUb's UI for "steps" much more. Granted, this is subjective.
AFAIK we use GitHub cloud and nothing else. I don't know why we should care about their resources. But if you think that this change is not helpful, we can close this PR — after all, you started this discussion, so if this PR doesn't implement what you intended, perhaps, it doesn't achieve anything. The reasons I did it are:
|
If nothing else, because it slows down any other jobs we (and anyone else) have running: GitHub doesn't give unlimited resources to free users. |
@geekosaur can I get a reference to GitHub documentation explaining the alleged showdown? |
The basics are here. The rest should be obvious: GitHub does not have an infinite number of machines, therefore the more concurrent actions that are running, the more load is on those machines and the longer it takes for actions to finish. |
@geekosaur we were granted a higher limit than that, I think it's enough to sustain this change |
Which means that if a Windows job fails, all other jobs in the matrix will be allowed to finish (other platforms, as well as other compilers on Windows, etc.) Inspired by the discussion at #10263
44950ea
to
f0e0985
Compare
Oh, I didn't notice that a merge label was applied earlier... I didn't mean to merge it before discussions are fully resolved. Sorry everyone! |
I think we determined that Mergify doesn't consider discussion to be changes to the PR, likely because GitHub considers PR comments to be distinct from the PR itself. We need to be careful about that. |
Negative reviews will block instead, so don't hesitate to write one if necessary. |
@mergify backport 3.12 |
✅ Backports have been created
|
CI: validate: don't fail-fast (backport #10291)
See discussion at #10263
@geekosaur do we want to backport to 3.12?
Template B: This PR does not modify behaviour or interface
E.g. the PR only touches documentation or tests, does refactorings, etc.
Include the following checklist in your PR: