1
0
mirror of https://github.com/ilri/dspace-statistics-api.git synced 2025-05-10 15:16:02 +02:00

Compare commits

...

26 Commits

Author SHA1 Message Date
46cfc3ffbc CHANGELOG.md: Release version 0.3.2 2018-09-25 13:14:08 +03:00
2850035a4c Return HTTP 404 when an item id is not found 2018-09-25 13:12:53 +03:00
c0b550109a README.md: Improve wording 2018-09-25 12:24:52 +03:00
bfceffd84d indexer.py: Improve inline documentation 2018-09-25 12:23:31 +03:00
d0552f5047 CHANGELOG.md: Move unreleased changes to version 0.3.1 2018-09-25 12:18:26 +03:00
c3a0bf7f44 CHANGELOG.md: Add Python 3.7 to Travis CI config 2018-09-25 12:17:49 +03:00
6e47e9c9ee .travis.yml: Add Python 3.7 2018-09-25 12:17:20 +03:00
cd90d618d6 CHANGELOG.md: Fix error in old release 2018-09-25 12:17:01 +03:00
280d211d56 CHANGELOG.md: Add note about kazoo 2.5.0 2018-09-25 12:12:10 +03:00
806d63137f requirements.txt: Use kazoo 2.5.0
SolrClient 0.2.1 currently depends on kazoo 2.2.1, but there is an
issue with Python 3.7 in kazoo <= 2.5.0. Kazoo 2.5.0 fixes the is-
sue with Python 3.7, and for my limited usage of SolrClient it se-
ems to work fine.

See: https://github.com/moonlitesolutions/SolrClient/issues/79
2018-09-25 12:08:28 +03:00
f7c7390e4f README.md: Add note about Python 3.7 2018-09-25 12:07:58 +03:00
702724e8a4 CHANGELOG.md: Move unreleased changes to version 0.3.0 2018-09-25 11:38:36 +03:00
36818d03ef CHANGELOG.md: Update unreleased changes 2018-09-25 11:37:56 +03:00
4cf8656b35 Change / route to /items
I think it's more obvious if the "all items" route is plural. Also,
this will allow me to eventually put documentation at the root.
2018-09-25 11:34:07 +03:00
f30a464cd1 README.md: Add notes about API endpoints 2018-09-25 11:28:12 +03:00
93ae12e313 README.md: Update introduction 2018-09-25 11:15:12 +03:00
dc978e9333 CHANGELOG.md: Add note about requirements.txt and Travis CI 2018-09-25 11:09:02 +03:00
295436fea0 Add .travis.yml 2018-09-25 11:08:01 +03:00
46a1476ab0 Add requirements.txt
Generated with `pip freeze`. This is so I can pin the versions of
packages that I've tested with as well as to allow Travis to test
whether the project runs on various Pythons and to let GitHub in-
form me of vulnerabilities in some libraries.
2018-09-25 11:02:50 +03:00
87dbb6c4df CHANGELOG.md: Release version 0.2.1 2018-09-25 02:21:44 +03:00
3160c44566 app.py: Remove comment
This comment was added when I first began the application and the
testing status is documented in the README now.
2018-09-25 02:20:51 +03:00
4b72f626d9 Update string substitution format
Instead of doing numbered strings I will just depend on the order,
at least to be consistent.
2018-09-25 02:19:29 +03:00
2d3b7620e3 CHANGELOG.md: Add note about psycopg2.extras.DictCursor 2018-09-25 02:08:54 +03:00
6e4bc630f7 database.py: Use psycopg2.extras.DictCursor
This allows us to access records using their column name. I didn't
notice that this was not working, as I had been testing the wrong
server!

See: http://initd.org/psycopg/docs/extras.html
2018-09-25 02:06:29 +03:00
44884140e5 CHANGELOG.md: Add new unreleased changes 2018-09-25 01:11:37 +03:00
74ff86ee3b contrib: Update environment settings in system units 2018-09-25 01:10:14 +03:00
9 changed files with 91 additions and 30 deletions

9
.travis.yml Normal file
View File

@ -0,0 +1,9 @@
language: python
python:
- "3.5"
- "3.6"
- "3.7"
install:
- pip install -r requirements.txt
# vim: ts=2 sw=2 et

View File

@ -4,6 +4,29 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.3.2] - 2018-09-25
## Changed
- /item/id route now returns HTTP 404 if an item is not found
## [0.3.1] - 2018-09-25
### Changed
- Force SolrClient's kazoo dependency to version 2.5.0 to work with Python 3.7
- Add Python 3.7 to Travis CI configuration
## [0.3.0] - 2018-09-25
### Added
- requirements.txt for pip
- Travis CI build configuration for Python 3.5 and 3.6
- Documentation on using the API
### Changed
- The "all items" route from / to /items
## [0.2.1] - 2018-09-24
### Changed
- Environment settings in example systemd unit files
- Use psycopg2.extras.DictCursor for PostgreSQL connection
## [0.2.0] - 2018-09-24
### Changed
- Use PostgreSQL instead of SQLite because UPSERT support needs a very new libsqlite3 whereas it's already in PostgreSQL 9.5+

View File

@ -1,22 +1,30 @@
# DSpace Statistics API
A quick and dirty REST API to expose Solr view and download statistics for items in a DSpace repository.
Written and tested in Python 3.6. SolrClient (0.2.1) does not currently run in Python 3.7.0. Requires PostgreSQL version 9.5 or greater for [`UPSERT` support](https://wiki.postgresql.org/wiki/UPSERT).
Written and tested in Python 3.5, 3.6, and 3.7. Requires PostgreSQL version 9.5 or greater for [`UPSERT` support](https://wiki.postgresql.org/wiki/UPSERT).
## Installation
Create a virtual environment and run it:
$ virtualenv -p /usr/bin/python3.6 venv
$ python -m venv venv
$ . venv/bin/activate
$ pip install falcon gunicorn SolrClient psycopg2-binary
$ pip install -r requirements.txt
$ gunicorn app:api
## Using the API
The API exposes the following endpoints:
- GET `/items`return views and downloads for all items that Solr knows about¹. Accepts `limit` and `page` query parameters for pagination of results.
- GET `/item/id`return views and downloads for a single item (*id* must be a positive integer). Returns HTTP 404 if an item id is not found.
¹ We are querying the Solr statistics core, which technically only knows about items that have either views or downloads.
## Todo
- Add API documentation
- Close up DB connection when gunicorn shuts down gracefully
- Better logging
- Return HTTP 404 when item_id is nonexistent
- Tests
## License
This work is licensed under the [GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html).

35
app.py
View File

@ -1,7 +1,3 @@
# Tested with Python 3.6
# See DSpace Solr docs for tips about parameters
# https://wiki.duraspace.org/display/DSPACE/Solr
from database import database_connection
import falcon
from solr import solr_connection
@ -25,7 +21,7 @@ class AllItemsResource:
pages = round(cursor.fetchone()[0] / limit)
# get statistics, ordered by id, and use limit and offset to page through results
cursor.execute('SELECT id, views, downloads FROM items ORDER BY id ASC LIMIT {0} OFFSET {1}'.format(limit, offset))
cursor.execute('SELECT id, views, downloads FROM items ORDER BY id ASC LIMIT {} OFFSET {}'.format(limit, offset))
results = cursor.fetchmany(limit)
cursor.close()
@ -50,20 +46,27 @@ class ItemResource:
"""Handles GET requests"""
cursor = db.cursor()
cursor.execute('SELECT views, downloads FROM items WHERE id={0}'.format(item_id))
results = cursor.fetchone()
cursor.execute('SELECT views, downloads FROM items WHERE id={}'.format(item_id))
if cursor.rowcount == 0:
raise falcon.HTTPNotFound(
title='Item not found',
description='The item with id "{}" was not found.'.format(item_id)
)
else:
results = cursor.fetchone()
statistics = {
'id': item_id,
'views': results['views'],
'downloads': results['downloads']
}
resp.media = statistics
cursor.close()
statistics = {
'id': item_id,
'views': results['views'],
'downloads': results['downloads']
}
resp.media = statistics
api = falcon.API()
api.add_route('/', AllItemsResource())
api.add_route('/items', AllItemsResource())
api.add_route('/item/{item_id:int}', ItemResource())
# vim: set sw=4 ts=4 expandtab:

View File

@ -3,7 +3,10 @@ Description=DSpace Statistics API
After=network.target
[Service]
Environment=SOLR_SERVER=http://localhost:8081/solr
Environment=DATABASE_NAME=dspacestatistics
Environment=DATABASE_USER=dspacestatistics
Environment=DATABASE_PASS=dspacestatistics
Environment=DATABASE_HOST=localhost
User=nobody
Group=nogroup
WorkingDirectory=/opt/ilri/dspace-statistics-api

View File

@ -4,6 +4,10 @@ After=tomcat7.target
[Service]
Environment=SOLR_SERVER=http://localhost:8081/solr
Environment=DATABASE_NAME=dspacestatistics
Environment=DATABASE_USER=dspacestatistics
Environment=DATABASE_PASS=dspacestatistics
Environment=DATABASE_HOST=localhost
User=nobody
Group=nogroup
WorkingDirectory=/opt/ilri/dspace-statistics-api

View File

@ -3,9 +3,10 @@ from config import DATABASE_USER
from config import DATABASE_PASS
from config import DATABASE_HOST
import psycopg2
import psycopg2.extras
def database_connection():
connection = psycopg2.connect("dbname={} user={} password={} host='{}'".format(DATABASE_NAME, DATABASE_USER, DATABASE_PASS, DATABASE_HOST))
connection = psycopg2.connect("dbname={} user={} password={} host='{}'".format(DATABASE_NAME, DATABASE_USER, DATABASE_PASS, DATABASE_HOST), cursor_factory=psycopg2.extras.DictCursor)
return connection

View File

@ -20,17 +20,15 @@
# ---
#
# Connects to a DSpace Solr statistics core and ingests item views and downloads
# into a Postgres database for use with other applications (an API, for example).
# into a PostgreSQL database for use by other applications (like an API).
#
# This script is written for Python 3 and requires several modules that you can
# install with pip (I recommend setting up a Python virtual environment first):
# This script is written for Python 3.5+ and requires several modules that you
# can install with pip (I recommend using a Python virtual environment):
#
# $ pip install SolrClient
# $ pip install SolrClient psycopg2-binary
#
# See: https://solrclient.readthedocs.io/en/latest/SolrClient.html
# See: https://wiki.duraspace.org/display/DSPACE/Solr
#
# Tested with Python 3.5 and 3.6.
from database import database_connection
from solr import solr_connection
@ -55,7 +53,7 @@ def index_views():
cursor = db.cursor()
while results_current_page <= results_num_pages:
print('Page {0} of {1}.'.format(results_current_page, results_num_pages))
print('Page {} of {}.'.format(results_current_page, results_num_pages))
res = solr.query('statistics', {
'q':'type:2',
@ -102,7 +100,7 @@ def index_downloads():
cursor = db.cursor()
while results_current_page <= results_num_pages:
print('Page {0} of {1}.'.format(results_current_page, results_num_pages))
print('Page {} of {}.'.format(results_current_page, results_num_pages))
res = solr.query('statistics', {
'q':'type:0',

12
requirements.txt Normal file
View File

@ -0,0 +1,12 @@
certifi==2018.8.24
chardet==3.0.4
falcon==1.4.1
gunicorn==19.9.0
idna==2.7
kazoo==2.5.0
psycopg2-binary==2.7.5
python-mimeparse==1.6.0
requests==2.19.1
six==1.11.0
SolrClient==0.2.1
urllib3==1.23