mirror of https://github.com/ilri/dspace-statistics-api.git synced 2025-02-03 17:02:53 +01:00
Alan Orth 385a34e5d0
indexer.py: Use psycopg2's execute_values to batch inserts
Batch inserts are much faster than a series of individual inserts
because they drastically reduce the overhead caused by round-trip
communication with the server. My tests in development confirm:

  - cursor.execute(): 19 seconds
  - execute_values(): 14 seconds

I'm currently only working with 4,500 rows, but I will experiment
with larger data sets, as well as larger batches. For example, on
the PostgreSQL mailing list a user reports doing 10,000 rows with
a page size of 100.

See: http://initd.org/psycopg/docs/extras.html#psycopg2.extras.execute_values
See: https://github.com/psycopg/psycopg2/issues/491#issuecomment-276551038
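The batching described above can be sketched roughly as follows. This is a minimal illustration, not the project's indexer.py: the table name `items`, its `id` and `views` columns, and the helper names `statements_needed` and `upsert_views` are assumptions made for the example. Only `psycopg2.extras.execute_values` and its `page_size` parameter come from the psycopg2 documentation linked above.

```python
from typing import Iterable, Tuple

# Assumption: an "items" table with "id" and "views" columns. The real
# schema used by indexer.py may differ.
# execute_values() expands the single %s placeholder into a VALUES list.
UPSERT_SQL = (
    "INSERT INTO items (id, views) VALUES %s "
    "ON CONFLICT (id) DO UPDATE SET views = excluded.views"
)


def statements_needed(row_count: int, page_size: int) -> int:
    """Number of INSERT statements sent for row_count rows.

    execute_values() puts at most page_size rows into each statement,
    so this is a ceiling division.
    """
    if row_count < 0 or page_size < 1:
        raise ValueError("row_count must be >= 0 and page_size >= 1")
    return -(-row_count // page_size)


def upsert_views(cursor, rows: Iterable[Tuple[int, int]], page_size: int = 100) -> None:
    """Upsert (id, views) pairs in batches of page_size rows.

    Each id should appear only once per batch: PostgreSQL rejects an
    ON CONFLICT DO UPDATE statement that touches the same row twice.
    """
    # Imported here so the helper above works without psycopg2 installed.
    from psycopg2.extras import execute_values

    execute_values(cursor, UPSERT_SQL, rows, page_size=page_size)
```

With 4,500 rows and a page size of 100, `statements_needed(4500, 100)` gives 45 statements instead of 4,500 individual `cursor.execute()` calls, which is where the reduction in round trips comes from.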
2018-09-26 23:10:29 +03:00

DSpace Statistics API

A quick and dirty REST API to expose Solr view and download statistics for items in a DSpace repository.

Written and tested in Python 3.5, 3.6, and 3.7. Requires PostgreSQL version 9.5 or greater for UPSERT support.

Installation

Create a virtual environment, install the dependencies, and start the API:

$ python -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
$ gunicorn app:api

Using the API

The API exposes the following endpoints:

  • GET /items: return views and downloads for all items that Solr knows about¹. Accepts limit and page query parameters for pagination of results.
  • GET /item/id: return views and downloads for a single item (id must be a positive integer). Returns HTTP 404 if an item id is not found.

¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
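
As a rough sketch, a client could call the two endpoints like this using only the Python standard library. The base URL assumes gunicorn's default bind address (127.0.0.1:8000), the response body is assumed to be JSON, and the helper names `items_url`, `item_url`, and `fetch_json` are made up for this example.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: gunicorn is running with its default bind address.
BASE_URL = "http://127.0.0.1:8000"


def items_url(limit, page, base_url=BASE_URL):
    """Build the URL for one page of GET /items."""
    query = urlencode([("limit", limit), ("page", page)])
    return "{}/items?{}".format(base_url, query)


def item_url(item_id, base_url=BASE_URL):
    """Build the URL for GET /item/id; id must be a positive integer."""
    if not isinstance(item_id, int) or item_id <= 0:
        raise ValueError("item id must be a positive integer")
    return "{}/item/{}".format(base_url, item_id)


def fetch_json(url):
    """Fetch a URL and decode the body as JSON (assumed response format).

    Returns None when the API answers HTTP 404, as it does for an
    unknown item id.
    """
    try:
        with urlopen(url) as response:
            return json.loads(response.read().decode("utf-8"))
    except HTTPError as error:
        if error.code == 404:
            return None
        raise
```

For example, `fetch_json(items_url(limit=10, page=2))` would request `/items?limit=10&page=2`, and `fetch_json(item_url(17))` would request `/item/17`, returning None if that item is not found.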

Todo

License

This work is licensed under the GPLv3.
