I tend to agree here – but for now, I’d like to discuss what best practices are with the current technology.
Is “We” in this case conda-forge? Is that written down somewhere?
Even if so, what “batteries” should be included – I see two distinct kinds of extra dependencies:
1. Optional features: e.g. dask.distributed – you can use dask without it, but there are some features you can’t use. I can certainly see an argument for including this kind of thing.
2. Other packages used for demos, tutorials, etc. – e.g. matplotlib or Jupyter for a package that is not about plotting – of course those are useful for demoing, etc., but not required for the core functionality.
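To make the first kind concrete, here is a rough sketch of how a package can keep an optional feature out of its hard requirements and only complain when the feature is actually used. The helper name `require_optional` and the error wording are made up for illustration; this is not how dask or any particular package does it.

```python
import importlib
import importlib.util


def require_optional(module_name, feature):
    """Import an optional top-level module, or fail with a clear message.

    `module_name` is assumed to be a top-level name (no dots), since
    find_spec on a dotted name would try to import the parent package.
    """
    # find_spec returns None when the top-level module is not installed
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{feature} needs the optional package '{module_name}'; "
            "install it separately to use this feature."
        )
    return importlib.import_module(module_name)
```

Called at the point of use (inside the function that needs the extra), this lets the core package import and run in a minimal environment.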
Personally, it’s (2) that I object to – that tends to bring in a sometimes very large dependency stack that is utterly irrelevant to actually using the package. The assumption that everyone who wants some computation feature is doing interactive data analysis with the full SciPy stack is just plain wrong.
And if this package is not at the top of the requirements stack, it’s VERY hard to avoid bringing that stuff in – you’d essentially have to turn off conda’s dependency resolution. And if the extra requirements are pinned at all, that could really create a challenge.
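A toy illustration of why this is hard to avoid from further up the stack: hard requirements are transitive, so anything a low-level package declares reaches everyone above it. The package names and the graph below are hypothetical, and this is only a plain graph walk, not a model of conda's actual solver.

```python
def transitive_deps(package, graph):
    """Return every package pulled in by installing `package`."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# Hypothetical graph: "compute_lib" lists plotting and notebook packages
# as hard requirements even though only its demos use them.
GRAPH = {
    "my_app": ["mid_layer"],
    "mid_layer": ["compute_lib"],
    "compute_lib": ["numpy", "matplotlib", "jupyter"],
}

pulled_in = transitive_deps("my_app", GRAPH)
print(sorted(pulled_in))
# ['compute_lib', 'jupyter', 'matplotlib', 'mid_layer', 'numpy']
```

Here `my_app` never asked for matplotlib or jupyter, but it gets both because something two levels down declared them.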
NOTE: MPL eventually addressed some of this within its stack by creating a matplotlib-base package, so you wouldn’t have to bring in all the back-ends if you didn’t want to. I think that should be recommended policy.
Actually, I think recommended policy should be for the “standard” package to be minimal, and either:
- Document that if you want to run the demos, you are going to need these packages. You can provide a demos_requirements.txt file if you want.
- Offer another package that provides every likely related dependency.
- @dhirschfeld is right – this shouldn’t be necessary, but it is right now.
Honestly, I think folks are including everything not because they really think it’s best, but because they haven’t thought carefully about it. In particular, the full stack is what developers need to work on the documentation and demos, etc. – why wouldn’t they include them?
Anyway, at the end of the day, I’ll have to live with whatever the conda-forge community’s consensus on this is, but I think that should be documented.
In some of my work, I have three (or four!) sets of requirements:
requirements_run
requirements_develop
requirements_test
requirements_docs
(I could probably merge dev and docs without anyone complaining, but …)
It can be a bit annoying, but it does let us build lean environments.
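For what it’s worth, here is a minimal sketch of how separate requirement sets can be layered into a lean runtime environment and fuller CI and development ones. The package lists are placeholders and the parsing is deliberately simplistic (it treats any `#` as the start of a comment, which would break URL-style requirements), so take it as an illustration rather than a tool.

```python
def parse_requirements(text):
    """Return requirement lines, skipping blanks and '#' comments."""
    reqs = []
    for line in text.splitlines():
        # Simplification: everything after '#' is treated as a comment
        line = line.split("#", 1)[0].strip()
        if line:
            reqs.append(line)
    return reqs


def combine(*requirement_sets):
    """Merge requirement lists, keeping first-seen order without duplicates."""
    merged = []
    for reqs in requirement_sets:
        for req in reqs:
            if req not in merged:
                merged.append(req)
    return merged


# Placeholder contents; real files would be read from disk.
run = parse_requirements("numpy\nscipy  # core numerics\n")
test = parse_requirements("pytest\n")
docs = parse_requirements("sphinx\nmatplotlib\nnumpy\n")

lean_env = combine(run)             # what end users need
ci_env = combine(run, test)         # what the test suite needs
dev_env = combine(run, test, docs)  # everything, for developers
```

The point is that only `dev_env` ever sees the documentation and plotting packages; users installing the package get `lean_env`.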