New cache format

We want to add a <hash>.state.json file to conda’s cache, to replace the current “prepend three fields to the beginning of a 200MB json” format. This lets the cached repodata.json stay identical to upstream. <hash> would be the same as in the previous cache format, since a few programs expect this. mtime is an addition: a client checks that the mtime of <hash>.json matches the one recorded inside <hash>.state.json, to detect that an older client has rewritten the cache. The file may also contain arbitrary extra data to support incremental repodata. It would look like this, e.g.

{
 "_url": "",
 "_etag": "W/\"cdee7221e6860fafe36bc78789d636be\"",
 "_mod": "Fri, 11 Nov 2022 23:28:04 GMT",
 "_cache_control": "public, max-age=30",
 "mtime": 1668259353.941853
}
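
To make the mtime check concrete, here is a minimal sketch of how a client might decide whether to trust the state file. The function name `load_state` and the `name` argument (standing in for <hash>) are hypothetical, and comparing the float mtime for exact equality is an assumption on my part; the CEP may settle on something different.

```python
import json
from pathlib import Path


def load_state(cache_dir, name):
    """Return the parsed state dict, or None if it cannot be trusted.

    `name` stands in for <hash>. This is an illustrative helper,
    not conda's actual implementation.
    """
    cache_dir = Path(cache_dir)
    json_path = cache_dir / f"{name}.json"
    state_path = cache_dir / f"{name}.state.json"
    try:
        state = json.loads(state_path.read_text())
        actual_mtime = json_path.stat().st_mtime
    except (OSError, ValueError):
        # Missing or unreadable file, or invalid JSON: treat as no state.
        return None
    if not isinstance(state, dict):
        return None
    # Assumption: the recorded mtime must equal the file's mtime exactly.
    # A mismatch suggests another client rewrote <hash>.json.
    if state.get("mtime") != actual_mtime:
        return None
    return state
```

If this returns None, the client would ignore the stored `_etag` and `_mod` and fall back to an unconditional download.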

Medium-term we should add a command to show the cache, for interested applications, so that no program depends on conda’s specific cache format.


That looks very good to me and much better than the current “insert data into json” hack!

We will have the problem of incremental/non-incremental cache clients overwriting each other. When you switch to using the jlap file to download deltas from the last complete repodata.json, the etag / mod will have to come from the jlap file (since no new request was made for the full json, and we will want to replace the unmodified repodata.json with a patched version). A client that doesn’t expect the state file, and expects _mod and _etag to be in-line, will also think the cache is outdated and download repodata.json again. These clients could be run in “offline - don’t re-download repodata.json” mode to avoid this.
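
As a rough sketch of the write side for an incremental client: after patching, it would replace <hash>.json and record the etag / mod it got from the delta source together with the new mtime. Everything here is hypothetical (the name `save_patched`, its parameters, and the write-to-temp-then-rename step, which is my own assumption and not something the proposal specifies).

```python
import json
import os
from pathlib import Path


def save_patched(cache_dir, name, repodata_text, etag, mod):
    """Write patched repodata and a matching state file.

    Hypothetical helper: `etag` and `mod` are whatever the client
    obtained from the delta source, passed in by the caller.
    """
    cache_dir = Path(cache_dir)
    json_path = cache_dir / f"{name}.json"
    state_path = cache_dir / f"{name}.state.json"

    # Assumption: write to a temporary file and rename, so a concurrent
    # reader never sees a half-written repodata file.
    tmp_json = json_path.with_name(json_path.name + ".tmp")
    tmp_json.write_text(repodata_text)
    os.replace(tmp_json, json_path)

    # Record the mtime of the file we just wrote, so a later rewrite
    # by a client that ignores the state file shows up as a mismatch.
    state = {"_etag": etag, "_mod": mod, "mtime": json_path.stat().st_mtime}
    tmp_state = state_path.with_name(state_path.name + ".tmp")
    tmp_state.write_text(json.dumps(state))
    os.replace(tmp_state, state_path)
    return state
```

A client that only knows the in-line format would still see no `_etag` / `_mod` inside <hash>.json and download again; when it rewrites the file, the recorded mtime no longer matches, which is what lets the newer client notice.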

I assume that the problem will not be noticeable if most users type conda install less frequently than the remote repodata.json updates, since they would have had to download a fresh one either way.

@wolv did I hear correctly that mamba has added locking on its caches for concurrent runs? Where does that happen? Oh, it looks like the commit “Changed LockFile to be a non-throwing checkable type, no pointers use.” (mamba-org/mamba@2b7b230 on GitHub) is a good starting point. It is also exposed to Python in mamba/__init__.pyi on main in mamba-org/mamba.

Thanks for writing this CEP, where we are working out the details of the format: “initial cep for repodata state” by wolfv (Pull Request #46 in conda-incubator/ceps on GitHub).