A simple REST API to expose Solr view and download statistics for items in a DSpace repository.
Latest commit a016916995 by Alan Orth (2018-10-24 14:15:03 +03:00): CHANGELOD.md: Add note about ujson

DSpace Statistics API

A simple REST API to expose Solr view and download statistics for items in a DSpace repository. This project contains a standalone indexing component and a WSGI application.

Requirements

Installation and Testing

Create a Python virtual environment and install the dependencies:

$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt

Set up the environment variables for Solr and PostgreSQL:

$ export SOLR_SERVER=http://localhost:8080/solr
$ export DATABASE_NAME=dspacestatistics
$ export DATABASE_USER=dspacestatistics
$ export DATABASE_PASS=dspacestatistics
$ export DATABASE_HOST=localhost
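
As an illustration of how a Python program could pick up these variables, here is a minimal sketch. The `DEFAULTS` table and the `load_settings` helper are hypothetical names introduced for this example, and the fallback values simply repeat the ones shown above; the project's own config.py may well read its settings differently.

```python
import os

# Fallbacks repeat the example values from this README. They are assumptions
# for illustration, not necessarily what the project's config.py uses.
DEFAULTS = {
    "SOLR_SERVER": "http://localhost:8080/solr",
    "DATABASE_NAME": "dspacestatistics",
    "DATABASE_USER": "dspacestatistics",
    "DATABASE_PASS": "dspacestatistics",
    "DATABASE_HOST": "localhost",
}


def load_settings(environ=None):
    """Return a dict of settings, preferring values found in the environment."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in load_settings().items():
        # Avoid echoing the password to the terminal.
        shown = "***" if name == "DATABASE_PASS" else value
        print(f"{name}={shown}")
```

Running the sketch after the `export` commands prints the values that a process started from the same shell would see.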

Index the Solr statistics core to populate the PostgreSQL database:

$ ./indexer.py

Run the REST API:

$ gunicorn app:api
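
The `app:api` argument tells gunicorn to import the module `app` and serve the WSGI callable named `api` inside it. The sketch below shows that convention with a hypothetical stand-in module (call it `app_sketch.py`); it is not the project's app.py, which may be built on a web framework rather than a bare callable.

```python
# app_sketch.py: a hypothetical stand-in used only to illustrate the
# "module:callable" convention. It is not the project's app.py.
import json


def api(environ, start_response):
    """Minimal WSGI callable that answers every request with a small JSON body."""
    body = json.dumps({"path": environ.get("PATH_INFO", "/")}).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

Saved as `app_sketch.py`, this could be served the same way with `gunicorn app_sketch:api`.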

Test to see if there are any statistics:

$ curl 'http://localhost:8000/items?limit=1'

Deployment

There are example systemd service and timer units in the contrib directory. The API service listens on localhost by default, so to expose it publicly you will need to put a web server such as nginx in front of it.

An example nginx configuration is:

server {
    #...

    location ~ /rest/statistics/?(.*) {
        access_log /var/log/nginx/statistics.log;
        proxy_pass http://statistics_api/$1$is_args$args;
    }
}

upstream statistics_api {
    server 127.0.0.1:5000;
}

This would expose the API at /rest/statistics.
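
To make the rewrite concrete, the following sketch mimics in Python what the `location` block does to a request: the regular expression captures everything after `/rest/statistics/` as `$1`, and `$is_args$args` re-appends the query string. The `upstream_uri` helper is a hypothetical name for this illustration, and the sketch only models the capture and substitution, not the rest of nginx's request processing.

```python
import re

# Same pattern as the nginx "location ~" block (case-sensitive, unanchored).
LOCATION = re.compile(r"/rest/statistics/?(.*)")


def upstream_uri(path, query=""):
    """Return the URI that would be passed upstream, or None if the location does not match."""
    match = LOCATION.search(path)
    if match is None:
        return None
    is_args = "?" if query else ""  # mirrors nginx's $is_args
    return "/" + match.group(1) + is_args + query


if __name__ == "__main__":
    print(upstream_uri("/rest/statistics/items", "limit=1"))
    print(upstream_uri("/rest/statistics/item/17"))
```

Under these assumptions a request for `/rest/statistics/items?limit=1` reaches the API as `/items?limit=1`. Because the pattern is unanchored and the trailing slash is optional, a path such as `/rest/statisticsfoo` would match as well.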

Using the API

The API exposes the following endpoints:

  • GET /items: return views and downloads for all items that Solr knows about¹. Accepts limit and page query parameters for paginating results.
  • GET /item/id: return views and downloads for a single item (id must be a positive integer). Returns HTTP 404 if the item id is not found.

¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
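
A small client for the two endpoints above could look like the sketch below. It assumes the API is reachable at `http://localhost:8000` (gunicorn's default bind address, as in the curl test earlier) and that responses are JSON; the helper names `items_url`, `item_url`, and `fetch_json` are hypothetical, and nothing is assumed about the fields in the response body.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumption: gunicorn's default bind address


def items_url(limit=None, page=None, base=BASE_URL):
    """Build the URL for GET /items, adding only the parameters that were given."""
    params = {k: v for k, v in (("limit", limit), ("page", page)) if v is not None}
    query = "?" + urlencode(params) if params else ""
    return base + "/items" + query


def item_url(item_id, base=BASE_URL):
    """Build the URL for GET /item/id; the id must be a positive integer."""
    if isinstance(item_id, bool) or not isinstance(item_id, int) or item_id < 1:
        raise ValueError("item id must be a positive integer")
    return f"{base}/item/{item_id}"


def fetch_json(url):
    """Return the decoded JSON body, or None when the API answers HTTP 404."""
    try:
        with urllib.request.urlopen(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise
```

For example, `fetch_json(items_url(limit=1))` performs the same request as the curl test, and `fetch_json(item_url(17))` returns `None` if item 17 has no statistics.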

Todo

  • Add API documentation
  • Close DB connection when gunicorn shuts down gracefully
  • Better logging
  • Tests
  • Check if database exists (try/except)
  • Version API
  • Use JSON in PostgreSQL
  • Switch to Python 3.6+ f-string syntax

License

This work is licensed under the GPLv3.