Mirror of https://github.com/ilri/dspace-statistics-api.git, synced 2024-11-22 06:15:02 +01:00
A simple REST API to expose Solr view and download statistics for items in a DSpace repository.
Alan Orth
385a34e5d0
Batch inserts are much faster than a series of individual inserts because they drastically reduce the overhead caused by round-trip communication with the server. My tests in development confirm:
- cursor.execute(): 19 seconds
- execute_values(): 14 seconds
I'm currently only working with 4,500 rows, but I will experiment with larger data sets, as well as larger batches. For example, on the PostgreSQL mailing list a user reports doing 10,000 rows with a page size of 100.
See: http://initd.org/psycopg/docs/extras.html#psycopg2.extras.execute_values
See: https://github.com/psycopg/psycopg2/issues/491#issuecomment-276551038
contrib
.gitignore
.travis.yml
app.py
CHANGELOG.md
config.py
database.py
indexer.py
LICENSE.txt
README.md
requirements.txt
solr.py
DSpace Statistics API
A quick and dirty REST API to expose Solr view and download statistics for items in a DSpace repository.
Written and tested in Python 3.5, 3.6, and 3.7. Requires PostgreSQL version 9.5 or greater for UPSERT support.
Installation
Create a virtual environment, activate it, install the dependencies, and run the API:
$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
$ gunicorn app:api
Using the API
The API exposes the following endpoints:
- GET /items: return views and downloads for all items that Solr knows about¹. Accepts limit and page query parameters for pagination of results.
- GET /item/id: return views and downloads for a single item (id must be a positive integer). Returns HTTP 404 if an item id is not found.
¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
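The two endpoints above can be exercised with a small client. The following is a minimal sketch using only the Python standard library, not part of this project. It assumes the API is reachable at gunicorn's default address (http://localhost:8000) and that responses are JSON; the helper names items_url, item_url, and fetch_json are hypothetical. It uses f-strings, so it needs Python 3.6 or newer.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumption: gunicorn's default bind address; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def items_url(base_url, limit=None, page=None):
    """Build the /items URL, adding limit and page only when given."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if page is not None:
        params["page"] = page
    query = urllib.parse.urlencode(params)
    return f"{base_url}/items" + (f"?{query}" if query else "")


def item_url(base_url, item_id):
    """Build the /item/id URL; the API expects a positive integer id."""
    is_int = isinstance(item_id, int) and not isinstance(item_id, bool)
    if not is_int or item_id <= 0:
        raise ValueError("item id must be a positive integer")
    return f"{base_url}/item/{item_id}"


def fetch_json(url):
    """GET a URL and decode the body as JSON (assumed response format).

    Returns None when the server answers HTTP 404, which the API uses
    for an item id that is not found.
    """
    try:
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise
```

With the API running, fetch_json(item_url(BASE_URL, 17)) would return the decoded statistics for item 17, or None if that item is not found. Validating the id on the client side avoids a request that the API would reject anyway.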
Todo
- Add API documentation
- Close up DB connection when gunicorn shuts down gracefully
- Better logging
- Tests
- Use batch inserts to the DB
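The batch-insert item could be approached with psycopg2's execute_values, which sends many rows per statement instead of one statement per row. The sketch below is an illustration under stated assumptions, not the project's code: the table items(id, views, downloads) with id as its primary key is hypothetical, as are the function names. The ON CONFLICT clause is the UPSERT feature that requires PostgreSQL 9.5 or greater.

```python
import math


def statements_needed(row_count, page_size):
    """Number of INSERT statements sent when rows are batched by page_size."""
    return math.ceil(row_count / page_size)


def upsert_statistics(connection, rows, page_size=100):
    """Upsert an iterable of (id, views, downloads) tuples in batches.

    Assumptions: `connection` is an open psycopg2 connection, and a
    hypothetical table items(id PRIMARY KEY, views, downloads) exists.
    """
    # Imported here so this module loads even where psycopg2 is absent.
    from psycopg2.extras import execute_values

    sql = (
        "INSERT INTO items (id, views, downloads) VALUES %s "
        "ON CONFLICT (id) DO UPDATE SET "
        "views = excluded.views, downloads = excluded.downloads"
    )
    with connection.cursor() as cursor:
        # execute_values expands the single %s into up to page_size
        # row tuples per statement, reducing round trips to the server.
        execute_values(cursor, sql, rows, page_size=page_size)
    connection.commit()
```

As a rough guide to the saving, statements_needed(4500, 100) gives 45 statements for 4,500 rows, compared with 4,500 statements when each row is inserted individually. The best page size depends on the data and the server, so it is worth measuring.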
License
This work is licensed under the GPLv3.