dspace-statistics-api

mirror of https://github.com/ilri/dspace-statistics-api.git synced 2025-09-09 13:21:48 +02:00

Author	SHA1	Message	Date
Alan Orth	b15afc9f39	CHANGELOG.md: Add note about UUIDs	2021-01-05 12:41:21 +02:00
Alan Orth	2bc18ef719	README.md: Make a note about migrating UUIDs	2021-01-05 12:35:23 +02:00
Alan Orth	49751b53f0	dspace_statistics_api/indexer.py: Limit to UUIDs We need to make sure that the indexer only tries to index UUIDs, as opposed to legacy IDs that may have been left over from a migration from earlier DSpace versions. For example, "98110-unmigrated", "-1" etc. For matching the UUIDs in Solr I decided that it is sufficient for our use case to simply match thirty-six characters, where a UUID is composed of thirty-two hexadecimal characters and four dashes. We don't need to do any verification of "real" UUIDs because it would be needlessly complex in our case. See: https://github.com/ilri/dspace-statistics-api/issues/12	2021-01-05 12:30:27 +02:00
Alan Orth	d1c177e146	.drone.yml: Add git to python container Now that I am installing my own fork of falcon-swagger-ui we need to have git so we can install it with pip.	2020-12-27 14:22:23 +02:00
Alan Orth	33dc210452	dspace_statistics_api/docs/openapi.json: Minor edit Better to leave the version in there because Swagger Editor doesn't like it without. Also, change the example page parameter for POSTing to /items and /collections, as it doesn't make sense to start on a later page if we have less items than our limit. v1.4.0	2020-12-27 13:53:59 +02:00
Alan Orth	282d5f644a	Move unreleased change to v1.4.0	2020-12-27 12:52:24 +02:00
Alan Orth	05e0e8bdca	openapi.json: Set the API version from config We don't need to hard code this in the JSON anymore since we are reading and modifying it now for the server config anyways.	2020-12-27 12:48:13 +02:00
Alan Orth	2567bb8604	dspace_statistics_api/app.py: Format with black	2020-12-27 12:27:01 +02:00
Alan Orth	4af3c656a3	CHANGELOG.md: Add note about totalPages	2020-12-27 12:26:32 +02:00
Alan Orth	4f8cd1097b	Rework paging The "totalPages" value in our response is calculated incorrectly. Instead of casting to int and rounding, we should rather round up to the next integer with math.ceil. This is a more correct way to get the value. Also update the indexer to use the same logic, although there the values are printed with +1 so they are more readable.	2020-12-27 12:22:07 +02:00
Alan Orth	a02211fd60	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt The `--without-hashes` is required to work around an issue with gunicorn pulling in a dependency on setuptools that poetry ignores. See: https://github.com/python-poetry/poetry/issues/1584 v1.4.0-dev	2020-12-25 13:03:32 +02:00
Alan Orth	fc814593c7	Use my fork of falcon-swagger-ui It has a newer Swagger UI (v3.38.0).	2020-12-25 12:57:58 +02:00
Alan Orth	7de1084f60	Add whitespace before vim modeline black wants this...	2020-12-24 13:12:06 +02:00
Alan Orth	6b78e82fe9	Add vim modeline to all tests	2020-12-24 13:11:12 +02:00
Alan Orth	4004515967	pyproject.toml: Update description	2020-12-23 16:15:46 +02:00
Alan Orth	d1229c2387	Adjust docs at root Don't use a static HTML file anymore. Now I simply print an XHTML page from the Falcon resource. This way I can use variables to add in the API version as well as a link to the Swagger UI. The list of API calls is still present on the README.md, though in the long run I might move them to some dedicated documentation or a GitHub wiki.	2020-12-23 16:12:50 +02:00
Alan Orth	be83514de1	Re-work Swagger UI configuration It turns out that Swagger UI mostly does the "right" thing for our use cases here, but it assumes that API paths are relative to the root of the host where it is being served. This works in the local development environment because we are serving on "/", but it does not work in production where the API is deployed beneath the DSpace REST API, for example at "/rest/statistics". The solution here is to allow configuration of the DSpace Statistics API path and use that when registering the Swagger UI as well as in a new "server" block in the OpenAPI JSON schema. By default it is configured to work out of the box in a development environment. Set the DSPACE_STATISTICS_API_URL environment variable to something like "/rest/statistics" when running in production.	2020-12-23 13:25:17 +02:00
Alan Orth	70b2ba83ba	Allow configuration of Swagger and OpenAPI JSON URL When running in production your statistics API might be deployed to a path like /rest/statistics instead of at the root.	2020-12-22 12:50:03 +02:00
Alan Orth	893039bc6a	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt The `--without-hashes` is required to work around an issue with gunicorn pulling in a dependency on setuptools that poetry ignores. See: https://github.com/python-poetry/poetry/issues/1584	2020-12-22 12:07:07 +02:00
Alan Orth	a4628dde4e	Update requirements Generated with poetry export: $ poetry export -f requirements.txt > requirements.txt $ poetry export --dev -f requirements.txt > requirements-dev.txt	2020-12-22 11:45:39 +02:00
Alan Orth	68418ea053	dspace_statistics_api/docs/openapi.json: Add /status Add a /status to the Swagger UI schema.	2020-12-22 11:41:47 +02:00
Alan Orth	6bbee7919e	Bump version to 1.4.0-dev	2020-12-22 11:31:46 +02:00
Alan Orth	8f0061ce29	CHANGELOG.md: Add note about the /status page	2020-12-22 11:30:50 +02:00
Alan Orth	4b1398c67f	Add /status route Currently this only prints the API version.	2020-12-22 11:30:09 +02:00
Alan Orth	a9d2a6d9be	CHANGELOG.md: Add note about Swagger UI	2020-12-22 11:21:46 +02:00
Alan Orth	a35ecf2394	Add Swagger UI on /swagger This includes a Swagger UI with an OpenAPI 3.0 JSON schema for easy interactive demonstration and testing of the API. The JSON schema was created with the standalone swagger-editor. Includes tests to make sure that the /swagger and /docs/openapi.json paths are accessible.	2020-12-22 11:18:47 +02:00
Alan Orth	3e271c7852	tests/dspacestatistics.sql: Update data Add a new database snapshot with communities and collections.	2020-12-20 22:31:41 +02:00
Alan Orth	d7ba14c590	tests: Add tests for communities and collections Also, separate tests for items, communities, and collections into their own files, leaving a single test for docs in its own file.	2020-12-20 22:12:13 +02:00
Alan Orth	ab82e90773	dspace_statistics_api/stats.py: Use -isBot:true Minor change to bot filtering. We should use a negated match for documents that have `isBot:true` rather than looking for documents that are tagged with `isBot:false` (the distinction is subtle, but important).	2020-12-20 16:56:03 +02:00
Alan Orth	8a1244d2d0	Update changelog and docs	2020-12-20 16:45:49 +02:00
Alan Orth	04f0756c7f	dspace_statistics_api/util.py: Add vim modeline	2020-12-20 16:31:52 +02:00
Alan Orth	830e4415f5	dspace_statistics_api/app.py: Run isort	2020-12-20 16:29:35 +02:00
Alan Orth	47b4eb3df7	Rename items.py to stats.py It is no longer used only for item-related statistics functions.	2020-12-20 16:28:56 +02:00
Alan Orth	3339bf8d9c	Add communities and collections support to API The basic logic is similar to items, where you can request single item statistics with a UUID, all item statistics, and item statistics for a list of items (optionally with a date range). Most of the item code was re-purposed to work on "elements", which can be items, communities, or collections depending on the request, with the use of Falcon's `before` hooks to set the statistics scope so we know how to behave for the current request. Other than the minor difference in facet fields, another issue I had with communities and collections is that the owningComm and owningColl fields are multi-valued (unlike items' id field). This means that, when you facet the results of your query, Solr returns ids that seem unrelated, but are actually present in the field, so I had to make sure I checked all returned ids to see if they were in the user's POSTed elements list. TODO: - Add tests - Revise docstrings - Refactor items.py as it is now generic	2020-12-20 16:14:46 +02:00
Alan Orth	fba6f1ead1	CHANGELOG.md: Update unreleased changes	2020-12-18 22:54:01 +02:00
Alan Orth	20c8ba0cf8	indexer.py: Add support for communities and collections The logic to get views and downloads is very similar to that used for items, but we facet by different fields. This uses a generic function for indexing that takes an "indexType" and a "facetField" parameter. The indexType parameter controls which database table to insert into, and the facetField parameter indicates which field to facet by in Solr.	2020-12-18 22:53:16 +02:00
Alan Orth	b486f51dd7	indexer.py: Rename index functions for items Start making plans for indexing communities and collections.	2020-12-18 22:53:16 +02:00
Alan Orth	787eec20ea	CHANGELOG.md: Add note about imports	2020-12-18 22:52:14 +02:00
Alan Orth	9e6fcf279b	dspace_statistics_api/items.py: Format with black	2020-12-18 22:45:39 +02:00
Alan Orth	4dbf734a4b	Move all imports to top of file A few months ago I had an issue setting up mocking because I was trying to be clever importing these libraries only when I needed them rather than at the global scope. Someone pointed out to me that if the imports are at the top of the file Falcon will load them once when the WSGI server starts, whereas if they are in the on_get() or on_post() they will load for every request! Also, it seems that PEP8 recommends keeping imports at the top of the file anyways, so I will just do that. Imports sorted with isort. See: https://www.python.org/dev/peps/pep-0008/#imports	2020-12-18 22:42:06 +02:00
Alan Orth	a0d0a47150	items.py: Add fl parameter to Solr queries I forgot to add the fl parameter here as well.	2020-12-18 16:12:34 +02:00
Alan Orth	01e9756cf2	Update requirements Generated with poetry export: $ poetry export -f requirements.txt > requirements.txt $ poetry export --dev -f requirements.txt > requirements-dev.txt	2020-12-18 11:20:17 +02:00
Alan Orth	b2b4eb2939	poetry.lock: Run poetry update	2020-12-18 11:19:16 +02:00
Alan Orth	4bbbaa4af3	dspace_statistics_api/indexer.py: Use `fl` parameter I forgot to add the fl parameter to the downloads function.	2020-12-18 10:44:02 +02:00
Alan Orth	7e4d5f4b13	README.md: Minor edit to intro	2020-12-18 10:42:48 +02:00
Alan Orth	428172854d	README.md: Add TODO	2020-12-17 20:44:25 +02:00
Alan Orth	2707cb37d5	CHANGELOG.md: Add note about fl parameter	2020-12-17 12:27:11 +02:00
Alan Orth	2407aeec70	dspace_statistics_api/indexer.py: Use `fl` parameter When indexing item views and downloads the only field we need is the id. The `fl` parameter tells Solr which fields to return in the search results. This should theoretically be more efficient, though I don't have any time to figure out how to measure it right now.	2020-12-17 12:25:28 +02:00
Alan Orth	f3a0e3a671	CHANGELOG.md: Add note about ORDER BY	2020-12-17 10:17:23 +02:00
Alan Orth	4590fc8708	dspace_statistics_api/app.py: Use ORDER BY in /items Since we are paging through the results by limit/offset we need to be sure that we are returning results deterministically.	2020-12-17 10:10:40 +02:00
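
Commit 4f8cd1097b replaces an int cast and rounding with math.ceil when computing "totalPages". The sketch below illustrates why rounding up matters. It is not the project's code: the function names and the `total_items` and `limit` parameters are hypothetical, chosen only for illustration.

```python
import math


def total_pages_ceil(total_items: int, limit: int) -> int:
    """Round up so that a partially filled last page is still counted."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return math.ceil(total_items / limit)


def total_pages_rounded(total_items: int, limit: int) -> int:
    """Assumed shape of the older approach: round to nearest, then cast to int."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return int(round(total_items / limit))


if __name__ == "__main__":
    # 101 items at 100 per page need two pages; rounding to nearest loses one.
    print(total_pages_ceil(101, 100), total_pages_rounded(101, 100))
```

With 101 items and a limit of 100, rounding to the nearest integer reports one page and the last item becomes unreachable, while math.ceil reports two.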

424 Commits
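
Commit 49751b53f0 describes limiting the indexer to UUIDs by matching thirty-six characters (thirty-two hexadecimal characters and four dashes) without validating "real" UUIDs. A minimal Python sketch of that loose check follows. The helper name `looks_like_uuid` and the regular expression are assumptions for illustration; the commit applies its match inside the Solr query, which is not reproduced here.

```python
import re

# Loose check: exactly 36 characters, each a hexadecimal digit or a dash.
# Like the approach described in the commit, this does not verify dash
# positions or UUID version bits.
UUID_LIKE = re.compile(r"[0-9a-fA-F-]{36}")


def looks_like_uuid(value: str) -> bool:
    """Return True if value has the overall shape of a UUID."""
    return UUID_LIKE.fullmatch(value) is not None


if __name__ == "__main__":
    # The first two are legacy IDs quoted in the commit message; the third
    # is a made-up UUID used only as sample input.
    for candidate in ("98110-unmigrated", "-1", "f44cf173-2344-4eb2-8f00-ee55df32c76f"):
        print(candidate, looks_like_uuid(candidate))
```

The legacy IDs quoted in the commit message ("98110-unmigrated", "-1") fail the check on length alone, which is the trade-off the commit accepts in exchange for simplicity.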