The SolrClient library is unmaintained, which is starting to cause
problems due to the moving Python ecosystem. Switching to requests
does not change my code in any meaningful way and makes maintenance
easier.
SolrClient works, but hasn't been updated in some time and this is
starting to cause issues with some of its dependencies (kazoo). We
can probably get by with using Python requests library and getting
JSON directly from Solr.
When building on sr.ht the default environment is the home directory
so we need to change to the source directory before trying to import
the SQL file.
I'm not sure how this will affect us, especially if we want to keep
support for DSpace 4, 5, and 6 in the same code base. At least the
REST API endpoint will have to change from an integer, our database
schema will have to change depending on whether the repository is
using IDs or UUIDs, and maybe even the Solr queries will change.
DSpace's stats-util script splits the Solr statistics core into yearly
shards. We need to use Solr's `shards` query parameter in order to get
the statistics for previous years. This commit adds a helper function
to enumerate the active Solr cores to find yearly shards matching the
statistics-YYYY pattern and add them to the query.
I think it might be possible to compute community and collection
statistics from Solr and make them available at new endpoints:
- /communities
- /community/id
- /collections
- /collection/id
When I originally created the pipenv environment I used the standard
pip requirements.txt that I already had, which captured all the mod-
ules and their exact versions at the time. This makes it hard to se-
parate the project's actual dependencies from the dependencies' dep-
endencies, complicating the Pipfile and making it hard to update mo-
dule versions later.
I've re-created the environment with the following commands:
$ pipenv install gunicorn falcon psycopg2-binary git+https://github.com/alanorth/SolrClient.git@kazoo-2.5.0#egg=SolrClient
$ pipenv install --dev ipython flake8 pytest