mirror of https://github.com/ilri/dspace-statistics-api.git synced 2024-12-22 12:42:19 +01:00
A simple REST API to expose Solr view and download statistics for items in a DSpace repository.
Alan Orth 385a34e5d0
indexer.py: Use psycopg2's execute_values to batch inserts
Batch inserts are much faster than a series of individual inserts
because they drastically reduce the overhead caused by round-trip
communication with the server. My tests in development confirm:

  - cursor.execute(): 19 seconds
  - execute_values(): 14 seconds

I'm currently only working with 4,500 rows, but I will experiment
with larger data sets, as well as larger batches. For example, on
the PostgreSQL mailing list a user reports doing 10,000 rows with
a page size of 100.

See: http://initd.org/psycopg/docs/extras.html#psycopg2.extras.execute_values
See: https://github.com/psycopg/psycopg2/issues/491#issuecomment-276551038
2018-09-26 23:10:29 +03:00
contrib contrib/dspace-statistics-indexer.timer: Fix syntax 2018-09-25 23:07:03 +03:00
.gitignore Update docs to remove SQLite stuff 2018-09-25 00:56:01 +03:00
.travis.yml .travis.yml: Add Python 3.7 2018-09-25 12:17:20 +03:00
app.py Return HTTP 404 when an item id is not found 2018-09-25 13:12:53 +03:00
CHANGELOG.md CHANGELOG.md: Move unreleased changes to version 0.4.0 2018-09-26 02:51:27 +03:00
config.py Use PostgreSQL instead of SQLite 2018-09-25 00:49:47 +03:00
database.py database.py: Use one line for psycopg2 imports 2018-09-26 22:23:24 +03:00
indexer.py indexer.py: Use psycopg2's execute_values to batch inserts 2018-09-26 23:10:29 +03:00
LICENSE.txt Add GPLv3 license 2018-09-18 14:16:07 +03:00
README.md README.md: Add link to psycopg2 issue about batch inserts 2018-09-26 22:23:08 +03:00
requirements.txt requirements.txt: Use kazoo 2.5.0 2018-09-25 12:08:28 +03:00
solr.py Refactor Solr components 2018-09-23 13:24:30 +03:00

DSpace Statistics API

A quick and dirty REST API to expose Solr view and download statistics for items in a DSpace repository.

Written and tested in Python 3.5, 3.6, and 3.7. Requires PostgreSQL version 9.5 or greater for UPSERT support.
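The UPSERT requirement refers to PostgreSQL's INSERT ... ON CONFLICT clause, which was introduced in version 9.5. The following is a minimal sketch of how a batched upsert with psycopg2's execute_values might look. The table name (items), its columns (id, views, downloads), and the helper name upsert_items are assumptions made for illustration and are not taken from indexer.py.

```python
# Hypothetical schema: a table "items" with a unique "id" column plus
# "views" and "downloads" counters. These names are assumptions.
UPSERT_SQL = """
    INSERT INTO items (id, views, downloads) VALUES %s
    ON CONFLICT (id) DO UPDATE
    SET views = EXCLUDED.views, downloads = EXCLUDED.downloads
"""


def upsert_items(cursor, rows, page_size=100):
    """Insert or update many (id, views, downloads) tuples in batches.

    execute_values expands the single %s placeholder into a multi-row
    VALUES list and sends at most page_size rows per statement.
    """
    # Imported here so the sketch can be loaded without the driver installed.
    from psycopg2.extras import execute_values

    execute_values(cursor, UPSERT_SQL, rows, page_size=page_size)
```

In use, the cursor would come from a psycopg2 connection, and the caller would commit the transaction after upsert_items returns.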

Installation

Create and activate a virtual environment, install the dependencies, and start the API with gunicorn:

$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
$ gunicorn app:api

Using the API

The API exposes the following endpoints:

  • GET /items: return views and downloads for all items that Solr knows about¹. Accepts limit and page query parameters for pagination of results.
  • GET /item/id: return views and downloads for a single item (id must be a positive integer). Returns HTTP 404 if an item id is not found.

¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
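As a rough illustration of the endpoints above, here is a small client sketch that uses only the Python standard library. It assumes the API is reachable at http://localhost:8000 (gunicorn's default bind address) and that responses are JSON. The helper names are hypothetical, and the structure of the response body is not shown because it is not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: gunicorn is running with its default bind address.
BASE_URL = "http://localhost:8000"


def items_url(base_url, limit=None, page=None):
    """Build the URL for GET /items with optional pagination parameters."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if page is not None:
        params["page"] = page
    query = urlencode(params)
    return "{}/items{}".format(base_url, "?" + query if query else "")


def item_url(base_url, item_id):
    """Build the URL for GET /item/id; id must be a positive integer."""
    if not isinstance(item_id, int) or item_id < 1:
        raise ValueError("item id must be a positive integer")
    return "{}/item/{}".format(base_url, item_id)


def fetch_json(url):
    """Fetch a URL and decode the body as JSON (assumed response format).

    urlopen raises urllib.error.HTTPError for an HTTP 404 response,
    for example when an item id is not found.
    """
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, fetch_json(items_url(BASE_URL, limit=100, page=2)) would request one page of results, and fetch_json(item_url(BASE_URL, 17)) would request statistics for a single item.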

Todo

License

This work is licensed under the GPLv3.