Automating releases for currently un-automated feedstock with custom recipe updates

I recently got involved with maintaining the git-annex feedstock, and we are looking to enable automatic creation of release PRs whenever an upstream release is made, something that’s currently lacking for the feedstock. The feedstock is unusual in the following ways:

  • The recipe builds two variants of git-annex: a “standard”/“alldeps” build that declares a number of dynamically-linked dependencies, and a “nodeps” build that simply redistributes a standalone statically-compiled binary for Linux. The resulting packages are distinguished by different build strings and by build numbers 100 apart.

  • The two variants are sourced from different URLs; the standard build is built from source files downloaded from Hackage (the Haskell package repository), while the nodeps build uses a pre-compiled binary distributed on archive.org.

    • The standard build additionally downloads GHC in parts from three further URLs listed under “source”. I don’t know why it does that.
  • As far as I can tell from looking at its source, while the regro-cf-autotick-bot can create PRs of new releases of software sourced from generic URLs, this only works when the upstream version is bumped by incrementing a single component by 1 and setting all components after it to 0. Unfortunately, this is not how git-annex versions work. Git-annex versions are of the form “<repository version>.<release date>”; e.g., version 10.20230828, which was followed by version 10.20230926.

  • Currently, whenever an upstream release is made, the feedstock is updated by manually running a Bash script that extracts git-annex’s version from its Hackage page, downloads & hashes the source files, and then updates recipe/meta.yaml to use the new version and hashes. Updating the values for the “nodeps” variant is more involved, as the archive.org URL is parameterized by file size and hash rather than by git-annex version, so the script downloads the standalone build from a fixed URL in order to compute these values; it also runs the standalone executable in order to get the exact version string the build declares itself as, for use in a test.
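To make the last point concrete, here is a rough Python sketch of the two mechanical steps the update involves: hashing and sizing a downloaded file, and rewriting a Jinja `{% set ... %}` line in recipe/meta.yaml. This is not the feedstock's actual Bash script, and the variable names used in the usage note (`version`, `nodeps_sha256`, `nodeps_size`) are hypothetical placeholders; the real recipe may name them differently.

```python
import hashlib
import re
from pathlib import Path


def sha256_and_size(path: Path) -> tuple[str, int]:
    """Return the hex SHA-256 digest and the size in bytes of a file."""
    digest = hashlib.sha256()
    size = 0
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def set_jinja_var(meta_text: str, name: str, value: str) -> str:
    """Rewrite a line of the form {% set name = "..." %} in recipe text.

    Assumes the recipe defines the variable exactly once, with a
    double-quoted string value (an assumption about the recipe layout).
    """
    pattern = re.compile(
        r'(\{%\s*set\s+' + re.escape(name) + r'\s*=\s*")[^"]*("\s*%\})'
    )
    # A function replacement avoids backslash/group-reference surprises in `value`.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), meta_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one 'set {name}' line, found {count}")
    return new_text
```

With the standalone tarball already downloaded to some local path, the update would then be along the lines of `digest, size = sha256_and_size(path)` followed by `set_jinja_var(text, "nodeps_sha256", digest)` and `set_jinja_var(text, "nodeps_size", str(size))`. Running the standalone executable to get its self-reported version string is left out here.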

How can we get regro-cf-autotick-bot (or possibly another bot) to automatically create PRs in the feedstock whenever a new version of git-annex is released, and how can we ensure that all recipe parameters are updated appropriately in these PRs?

Thank you for the detailed context! I think your best chance here is submitting a PR to cf-scripts to add compatibility for this versioning scheme, and maybe opting in via a new bot setting in conda-forge.yml?

What sort of behavior do you recommend for the versioning scheme? If the scheme were to just check both <current major version>.<today in YYYYMMDD> and <current major version plus 1>.<today>, that would fail if the bot didn’t run by the end of the day after a release (to say nothing of timezone issues). There’s also the fact that the major version once jumped from 8 to 10, and looking through historical versions, I see some occurrences of two releases with the same date, the latter with .1 appended. Alternatively, I suppose the scheme could just query Hackage for the latest version, but I don’t feel familiar enough with Haskell packaging to make something suitable for all Haskell feedstocks.
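For what it's worth, if the list of published versions is available from somewhere (how to fetch it from Hackage is left out here), ordering them is straightforward. Below is a minimal Python sketch that assumes every version has the form `<major>.<YYYYMMDD>` with an optional trailing `.<n>` for same-day re-releases; the function names are my own and not part of cf-scripts.

```python
import re

# <major>.<YYYYMMDD> with an optional ".<n>" suffix for same-day re-releases.
_VERSION_RE = re.compile(r"^(\d+)\.(\d{8})(?:\.(\d+))?$")


def version_key(version: str) -> tuple[int, int, int]:
    """Turn a git-annex style version into a tuple that sorts correctly."""
    m = _VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"unrecognized version format: {version!r}")
    major, date, extra = m.groups()
    # A missing suffix sorts before ".1", so treat it as 0.
    return int(major), int(date), int(extra or 0)


def is_newer(candidate: str, current: str) -> bool:
    """True if `candidate` is strictly newer than `current`."""
    return version_key(candidate) > version_key(current)


def latest_version(versions: list[str]) -> str:
    """Pick the newest version, ignoring strings that do not match the scheme."""
    parseable = [v for v in versions if _VERSION_RE.match(v)]
    if not parseable:
        raise ValueError("no parseable versions given")
    return max(parseable, key=version_key)
```

Comparing integer tuples rather than raw strings matters for the 8 to 10 jump: as plain strings, "8.…" sorts after "10.…". This approach also does not depend on when the bot runs, since it never guesses at today's date.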

Even if we got new version detection working, there’s still the fact that properly updating recipe/meta.yaml requires hashes of two different files, plus other information. Would adding code to regro-cf-autotick-bot to fill in all the details be an option, or would we need some sort of automation localized to our feedstock repository?

We don’t really support/allow localized automation (per-feedstock workflows, essentially) because it could get out of hand pretty easily. We only have so many GHA runners for the whole org and we already have like… 20k feedstocks?

That versioning scheme seems a bit too involved to be generalized, but maybe there are ways to simplify it. For example, you could just discard the date info when doing comparisons. It doesn’t seem to convey useful information, update-wise.

If this is too specific for regro, maybe a workaround is to have an external workflow doing cron’d checks (every hour?) and handling the automated PR submission. I don’t know how easy it is to reuse the regro bot code, but maybe it’s not that much work?