Alan Orth d36be5ee50 contrib: Update systemd unit files for refactor 2018-10-28 11:14:21 +02:00

DSpace Statistics API

A simple REST API to expose Solr view and download statistics for items in a DSpace repository. This project contains a standalone indexing component and a WSGI application.

Requirements

Installation and Testing

Create a Python virtual environment and install the dependencies:

$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt

Set up the environment variables for Solr and PostgreSQL:

$ export SOLR_SERVER=http://localhost:8080/solr
$ export DATABASE_NAME=dspacestatistics
$ export DATABASE_USER=dspacestatistics
$ export DATABASE_PASS=dspacestatistics
$ export DATABASE_HOST=localhost
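
As a rough illustration of how a Python process can pick up these settings, here is a minimal sketch. It is not the project's actual configuration code: the function name `load_config` and the fallback defaults (copied from the example values above) are assumptions made for illustration.

```python
import os


def load_config(env=os.environ):
    """Read the Solr and PostgreSQL settings from an environment mapping.

    The defaults simply mirror the example values in this README; they are
    assumptions, not values guaranteed by the project.
    """
    return {
        "solr_server": env.get("SOLR_SERVER", "http://localhost:8080/solr"),
        "database_name": env.get("DATABASE_NAME", "dspacestatistics"),
        "database_user": env.get("DATABASE_USER", "dspacestatistics"),
        "database_pass": env.get("DATABASE_PASS", "dspacestatistics"),
        "database_host": env.get("DATABASE_HOST", "localhost"),
    }


if __name__ == "__main__":
    config = load_config()
    # Avoid printing the password when inspecting the configuration.
    for key, value in config.items():
        print(key, "=", "***" if key == "database_pass" else value)
```

Passing the environment in as a parameter makes the lookup easy to exercise with a plain dictionary instead of the real process environment.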

Index the Solr statistics core to populate the PostgreSQL database:

$ python -m dspace_statistics_api.indexer

Run the REST API:

$ gunicorn dspace_statistics_api.app

Test to see if there are any statistics:

$ curl 'http://localhost:8000/items?limit=1'

Deployment

There are example systemd service and timer units in the contrib directory. The API service listens on localhost by default so you will need to expose it publicly using a web server like nginx.

An example nginx configuration is:

server {
    #...

    location ~ /rest/statistics/?(.*) {
        access_log /var/log/nginx/statistics.log;
        proxy_pass http://statistics_api/$1$is_args$args;
    }
}

upstream statistics_api {
    server 127.0.0.1:5000;
}

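This would expose the API at /rest/statistics. The upstream address must match the address the API service is actually bound to: this example assumes port 5000 (for example, set with gunicorn's --bind option), whereas gunicorn binds to 127.0.0.1:8000 when no address is given.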
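
To make the rewrite in the location block easier to follow, here is a small Python sketch that mimics how the regular expression, `$1`, `$is_args` and `$args` combine into the upstream URL. It only imitates the nginx behavior for illustration; the function name `upstream_url` is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Same pattern as the nginx location above. nginx matches it against the
# request path only (without the query string) and it is not anchored.
LOCATION = re.compile(r"/rest/statistics/?(.*)")


def upstream_url(public_url, upstream="http://statistics_api"):
    """Return the URL nginx would proxy to, or None if the location does not match."""
    parts = urlsplit(public_url)
    match = LOCATION.search(parts.path)
    if match is None:
        return None
    # $is_args is "?" when the request has a query string, otherwise empty.
    is_args = "?" if parts.query else ""
    return f"{upstream}/{match.group(1)}{is_args}{parts.query}"


if __name__ == "__main__":
    print(upstream_url("https://example.org/rest/statistics/items?limit=1"))
    print(upstream_url("https://example.org/rest/statistics/item/17"))
```

Under these assumptions, a request for /rest/statistics/items?limit=1 is forwarded to the API as /items?limit=1.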

Using the API

The API exposes the following endpoints:

  • GET /items: return views and downloads for all items that Solr knows about¹. Accepts limit and page query parameters for pagination of results.
  • GET /item/id: return views and downloads for a single item (id must be a positive integer). Returns HTTP 404 if an item id is not found.

¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
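
The two endpoints above could be called from Python roughly as follows. This is a sketch using only the standard library. The base URL (gunicorn's default bind address), the helper names `items_url`, `item_url` and `fetch`, and the assumption that responses are JSON are not taken from this README.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the API is running locally on gunicorn's default address.
BASE_URL = "http://localhost:8000"


def items_url(limit=None, page=None, base_url=BASE_URL):
    """Build the URL for GET /items with optional pagination parameters."""
    params = {k: v for k, v in (("limit", limit), ("page", page)) if v is not None}
    query = f"?{urlencode(params)}" if params else ""
    return f"{base_url}/items{query}"


def item_url(item_id, base_url=BASE_URL):
    """Build the URL for GET /item/id; the id must be a positive integer."""
    if isinstance(item_id, bool) or not isinstance(item_id, int) or item_id <= 0:
        raise ValueError("item id must be a positive integer")
    return f"{base_url}/item/{item_id}"


def fetch(url):
    """Fetch a URL and decode the body, assuming the API responds with JSON.

    urlopen raises urllib.error.HTTPError for error statuses, so an unknown
    item id (HTTP 404) surfaces as an exception here.
    """
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    print(fetch(items_url(limit=1)))
```

Validating the id on the client side mirrors the positive-integer rule stated above and avoids a round trip for requests that cannot succeed.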

Todo

  • Add API documentation
  • Close DB connection when gunicorn shuts down gracefully
  • Better logging
  • Tests
  • Check if database exists (try/except)
  • Version API
  • Use JSON in PostgreSQL
  • Switch to Python 3.6+ f-string syntax

License

This work is licensed under the GPLv3.