dspace-statistics-api

mirror of https://github.com/ilri/dspace-statistics-api.git synced 2024-11-15 19:07:03 +01:00

Author	SHA1	Message	Date
Alan Orth	a35ecf2394	Add Swagger UI on /swagger This includes a Swagger UI with an OpenAPI 3.0 JSON schema for easy interactive demonstration and testing of the API. The JSON schema was created with the standalone swagger-editor. Includes tests to make sure that the /swagger and /docs/openapi.json paths are accessible.	2020-12-22 11:18:47 +02:00
Alan Orth	830e4415f5	dspace_statistics_api/app.py: Run isort	2020-12-20 16:29:35 +02:00
Alan Orth	47b4eb3df7	Rename items.py to stats.py It is no longer used only for item-related statistics functions.	2020-12-20 16:28:56 +02:00
Alan Orth	3339bf8d9c	Add communities and collections support to API The basic logic is similar to items, where you can request single item statistics with a UUID, all item statistics, and item statistics for a list of items (optionally with a date range). Most of the item code was re-purposed to work on "elements", which can be items, communities, or collections depending on the request, with the use of Falcon's `before` hooks to set the statistics scope so we know how to behave for the current request. Other than the minor difference in facet fields, another issue I had with communities and collections is that the owningComm and owningColl fields are multi-valued (unlike items' id field). This means that, when you facet the results of your query, Solr returns ids that seem unrelated, but are actually present in the field, so I had to make sure I checked all returned ids to see if they were in the user's POSTed elements list. TODO: - Add tests - Revise docstrings - Refactor items.py as it is now generic	2020-12-20 16:14:46 +02:00
Alan Orth	4dbf734a4b	Move all imports to top of file A few months ago I had an issue setting up mocking because I was trying to be clever importing these libraries only when I needed them rather than at the global scope. Someone pointed out to me that if the imports are at the top of the file Falcon will load them once when the WSGI server starts, whereas if they are in the on_get() or on_post() they will load for every request! Also, it seems that PEP8 recommends keeping imports at the top of the file anyways, so I will just do that. Imports sorted with isort. See: https://www.python.org/dev/peps/pep-0008/#imports	2020-12-18 22:42:06 +02:00
Alan Orth	4590fc8708	dspace_statistics_api/app.py: Use ORDER BY in /items Since we are paging through the results by limit/offset we need to be sure that we are returning results deterministically.	2020-12-17 10:10:40 +02:00
Alan Orth	ca1582a8b6	Make sure limit is between 1 and 100 We were not properly checking whether the limit was greater than 0 in all cases.	2020-11-02 21:59:20 +02:00
Alan Orth	5acd927210	dspace_statistics_api: Sort imports with isort	2020-10-06 15:12:13 +03:00
Alan Orth	d4518d62ad	dspace_statistics_api/app.py: Refactor for testability I thought it was clever to only import these in the on_post handler because they aren't needed elsewhere, but it turns out that this is not a common pattern and even causes problems with testability. First, if the imports are at the top of the file as PEP8 recommends, then the WSGI server will import them once when it loads the app and they remain in memory for the lifecycle of the app. If the imports are in the on_post handler they would be re-imported on every request! Second, this pattern of importing in a method makes it tricky to use object patching in mocks. See: https://www.python.org/dev/peps/pep-0008/#imports	2020-10-05 20:43:50 +03:00
Alan Orth	5a53b57b3b	Refactor `/items` POST handler to use a before hook This allows us to do the dirty work of parsing, validating, and setting local variables from the POST parameters outside of the on_post function. We then share the parameters via the req.context object. Functionally it is the same, but readability is better and it's a neat trick that I could use elsewhere. See: https://falcon.readthedocs.io/en/stable/user/faq.html#how-can-i-pass-data-from-a-hook-to-a-responder-and-between-hooks	2020-09-26 18:40:52 +03:00
Alan Orth	946f0749e2	dspace_statistics_api/app.py: Use bounded_stream in on_post For reasons I don't quite understand, we need to use bounded_stream in the on_post request handler in order to use simulate_post() with the testing client in Falcon 2.0.0. Normal runtime operation via gunicorn does not have any issues with stream. See: https://github.com/falconry/falcon/issues/1720 See: https://github.com/falconry/falcon/issues/1554	2020-09-26 11:50:57 +03:00
Alan Orth	73c71fa8a0	dspace_statistics_api: Add support for date ranges to /items You can now POST a JSON request to /items with a list of items and a date range. This allows the possibility to get view and download statistics for arbitrary items and arbitrary date ranges. The JSON request should be in the following format: { "limit": 100, "page": 0, "dateFrom": "2020-01-01T00:00:00Z", "dateTo": "2020-09-09T00:00:00Z", "items": [ "f44cf173-2344-4eb2-8f00-ee55df32c76f", "2324aa41-e9de-4a2b-bc36-16241464683e", "8542f9da-9ce1-4614-abf4-f2e3fdb4b305", "0fe573e7-042a-4240-a4d9-753b61233908" ] } The limit, page, and date parameters are all optional. By default it will use a limit of 100, page 0, and [* TO *] Solr date range.	2020-09-25 12:21:11 +03:00
Alan Orth	0ef071a91d	dspace_statistics_api: Use f-strings instead of format() We had previously been avoiding the f-strings because we needed to run on Python 3.5 and they were only available in Python 3.6+, but now the black formatter requires Python 3.6 and all our systems are running Python 3.6+ anyways.	2020-03-02 11:24:29 +02:00
Alan Orth	9e7dd28156	dspace_statistics_api/app.py: Use parameterized SQL queries This is a better way to run SQL queries because psycopg2 takes care of the quoting for us.	2020-03-02 11:16:05 +02:00
Alan Orth	5955868b9a	dspace_statistics_api/app.py: Use UUID DSpace 6+ uses a UUID for item identifiers instead of an integer so we need to adapt our PostgreSQL queries to use those. Note that we can no longer sort results in the "all items" endpoint by ID. Also, we need to use parameterized psycopg2 queries instead of strings to support queries with UUIDs properly. To use the Python UUID objects elsewhere in the code we need to make sure that we cast them to str.	2020-03-02 11:06:48 +02:00
Alan Orth	cb3c3d37fa	Sort imports with isort	2019-11-27 12:31:04 +02:00
Alan Orth	4ff1fd4a22	Format code with black	2019-11-27 12:30:06 +02:00
Alan Orth	5a3b392a1d	dspace_statistics_api/app.py: Fix Falcon 2.0 syntax See: dspace_statistics_api/app.py	2019-04-18 09:57:18 +03:00
Alan Orth	2f342be948	Refactor database code to use a context manager Instead of opening one global persistent database connection when the application starts, I am now abstracting it to a class that I can use in combination with Python's "with" context. Both connections and cursors are kept for the context of each "with" block and closed automatically when exiting. See: https://alysivji.github.io/managing-resources-with-context-managers-pythonic.html See: http://initd.org/psycopg/docs/connection.html#connection.close	2018-11-07 17:41:21 +02:00
Alan Orth	cc5ce3ab98	Correct issues highlighted by Flake8 Flake8 validates code style against PEP 8 in order to encourage the writing of idiomatic Python. For reference, I am currently ignoring errors about line length (E501) because I feel it makes code harder to read. This is the invocation I am using: $ flake8 --ignore E501 dspace_statistics_api	2018-11-04 00:04:27 +02:00
Alan Orth	30dc7f1939	Add basic API documentation on root (/) I had imagined plugging in an interactive Swagger or OpenAPI instance here, but that's actually much more involved in Falcon than I want to deal with right now.	2018-11-01 00:19:39 +02:00
Alan Orth	2f45d27554	dspace_statistics_api/app.py: remove unused code This was added accidentally when I refactored. I was trying to see if I could use Falcon's on_exit() hook.	2018-10-28 11:14:21 +02:00
Alan Orth	b8356f7a87	Add "application" alias to API object By default gunicorn looks for an "application" object to run, so this saves us having to type api:app.	2018-10-28 11:14:21 +02:00
Alan Orth	c027f01b48	Refactor project structure This follows guidance from several well-known Python best practices guides. Basically, the idea is create a package for the application that is comprised of several re-usable modules. See: https://docs.python-guide.org/writing/structure/ See: https://realpython.com/python-application-layouts/	2018-10-28 11:14:21 +02:00

24 Commits
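
The request handling described in commits 73c71fa8a0 and ca1582a8b6 (optional limit, page, and date parameters, with a limit between 1 and 100) could look roughly like the sketch below. It is not the project's code: the function name `parse_items_request`, the returned dictionary shape, and the rule that page must be 0 or greater are assumptions made for illustration. The defaults (limit 100, page 0, `[* TO *]`) are the ones stated in the commit message.

```python
import json

DEFAULT_LIMIT = 100  # default stated in commit 73c71fa8a0
DEFAULT_PAGE = 0     # default stated in commit 73c71fa8a0


def parse_items_request(body: str) -> dict:
    """Parse a JSON request body and apply defaults (hypothetical helper)."""
    doc = json.loads(body)

    # The limit must be an integer between 1 and 100 (commit ca1582a8b6).
    # bool is excluded explicitly because it is a subclass of int in Python.
    limit = doc.get("limit", DEFAULT_LIMIT)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        raise ValueError("limit must be an integer between 1 and 100")

    # Assumption: pages are zero-based and cannot be negative.
    page = doc.get("page", DEFAULT_PAGE)
    if isinstance(page, bool) or not isinstance(page, int) or page < 0:
        raise ValueError("page must be an integer greater than or equal to 0")

    # A missing date falls back to "*", which yields the [* TO *] Solr range.
    date_from = doc.get("dateFrom", "*")
    date_to = doc.get("dateTo", "*")

    return {
        "limit": limit,
        "page": page,
        "date_range": f"[{date_from} TO {date_to}]",
        "items": list(doc.get("items", [])),
    }
```

In the project this kind of validation is done in a Falcon `before` hook and shared through `req.context` (commit 5a53b57b3b); the sketch keeps it framework-free so it can be run on its own.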
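
Commits 2f342be948, 9e7dd28156, and 4590fc8708 describe three related database habits: a connection class used with `with`, parameterized queries, and `ORDER BY` when paging by limit/offset. The sketch below combines them under loud assumptions: it uses the standard library's `sqlite3` as a stand-in for psycopg2 (so the placeholder is `?` rather than psycopg2's `%s`), and the class name `DatabaseManager`, the `items` table, its columns, the sort column, and the `limit * page` offset are all hypothetical.

```python
import sqlite3


class DatabaseManager:
    """Open a connection and cursor on entry and close both on exit."""

    def __init__(self, dsn: str = ":memory:"):
        self._dsn = dsn

    def __enter__(self):
        self._connection = sqlite3.connect(self._dsn)
        self._cursor = self._connection.cursor()
        return self._cursor

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even when the "with" body raises, so nothing is leaked.
        self._cursor.close()
        self._connection.close()
        return False  # do not swallow exceptions


def fetch_page(cursor, limit: int, page: int):
    """Return one page of rows in a deterministic order."""
    # Values are passed separately from the SQL text, so the driver does the
    # quoting. Without ORDER BY, limit/offset paging has no guaranteed order.
    cursor.execute(
        "SELECT id, views, downloads FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit, limit * page),
    )
    return cursor.fetchall()


def demo():
    """Build a throwaway in-memory table and read it back in two pages."""
    with DatabaseManager() as cursor:
        cursor.execute(
            "CREATE TABLE items (id TEXT PRIMARY KEY, views INT, downloads INT)"
        )
        cursor.executemany(
            "INSERT INTO items VALUES (?, ?, ?)",
            [("b", 2, 1), ("a", 5, 0), ("c", 1, 1)],
        )
        return fetch_page(cursor, 2, 0), fetch_page(cursor, 2, 1)
```

Calling `demo()` returns the rows sorted by `id` and split across two pages, regardless of insertion order.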
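
Commit 3339bf8d9c notes that faceting on the multi-valued owningComm and owningColl fields makes Solr return ids the user never asked for, so every returned id has to be checked against the POSTed elements list. A minimal sketch of that check follows. The function name `filter_facet_counts`, the plain-dictionary input, and the choice to report 0 for requested ids that Solr did not return are assumptions, not the project's actual data structures.

```python
def filter_facet_counts(facet_counts: dict, requested_ids: list) -> dict:
    """Keep only the facet counts for ids that the user requested."""
    requested = set(requested_ids)

    # Assumption: requested ids missing from the facet results count as 0.
    counts = {element_id: 0 for element_id in requested_ids}

    for element_id, count in facet_counts.items():
        # Ids that only appear because the field is multi-valued are skipped.
        if element_id in requested:
            counts[element_id] = count

    return counts
```

For example, if the facet results contain an unrequested parent community alongside a requested one, only the requested id survives the filter.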