mirror of https://github.com/ilri/dspace-statistics-api.git synced 2024-10-18 06:44:16 +02:00

Compare commits

3 Commits

Author SHA1 Message Date
b15afc9f39
CHANGELOG.md: Add note about UUIDs
All checks were successful
continuous-integration/drone/push Build is passing
2021-01-05 12:41:21 +02:00
2bc18ef719
README.md: Make a note about migrating UUIDs
2021-01-05 12:35:23 +02:00
49751b53f0
dspace_statistics_api/indexer.py: Limit to UUIDs
We need to make sure that the indexer only tries to index UUIDs, as
opposed to legacy IDs that may have been left over from a migration
from earlier DSpace versions. For example, "98110-unmigrated", "-1"
etc.

For matching the UUIDs in Solr I decided that it is sufficient for
our use case to simply match thirty-six characters, where a UUID is
composed of thirty-two hexadecimal characters and four dashes. We
don't need to do any verification of "real" UUIDs because it would
be needlessly complex in our case.

See: https://github.com/ilri/dspace-statistics-api/issues/12
2021-01-05 12:30:27 +02:00
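
The length-based match described in the commit message above can be illustrated outside Solr. This is a minimal sketch, not code from the repository: the helper names `looks_like_uuid` and `is_real_uuid` are hypothetical, and `re.fullmatch` is used on the assumption that it is the closest Python analogue of a Lucene regular expression, which always matches the whole term.

```python
import re
import uuid

# Lucene regular expressions match the whole term, so re.fullmatch is the
# closest Python analogue of the Solr clause field:/.{36}/
THIRTY_SIX_CHARS = re.compile(r".{36}")


def looks_like_uuid(value: str) -> bool:
    """Return True if value is exactly thirty-six characters long (hypothetical helper)."""
    return THIRTY_SIX_CHARS.fullmatch(value) is not None


def is_real_uuid(value: str) -> bool:
    """Stricter check that the commit deliberately skips: parse with the uuid module."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


# The legacy IDs named in the commit message are rejected on length alone.
for sample in [str(uuid.uuid4()), "98110-unmigrated", "-1"]:
    print(sample, looks_like_uuid(sample))
```

The trade-off is that any thirty-six character string passes the loose check, for example `"x" * 36`, whereas `is_real_uuid` would reject it. The commit message accepts this because the leftover legacy IDs are much shorter.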
3 changed files with 9 additions and 4 deletions

CHANGELOG.md

@@ -4,6 +4,10 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+ ## Unreleased Changes
+ ### Changed
+ - Limit Solr query to UUIDs to avoid errors with unmigrated legacy stats (https://github.com/ilri/dspace-statistics-api/issues/12)
## [1.4.0] - 2020-12-27
### Added
- indexer.py now indexes views and downloads for communities and collections

README.md

@@ -3,6 +3,7 @@ DSpace stores item view and download events in a Solr "statistics" core. This in
- If your DSpace is version 4 or 5, use [dspace-statistics-api v1.1.1](https://github.com/ilri/dspace-statistics-api/releases/tag/v1.1.1)
- If your DSpace is version 6+, use [dspace-statistics-api v1.2.0 or greater](https://github.com/ilri/dspace-statistics-api/releases/tag/v1.2.0)
+ - Please make sure your statistics have been migrated from integers to UUIDs with the [solr-upgrade-statistics-6x](https://wiki.lyrasis.org/display/DSDOC6x/SOLR+Statistics+Maintenance) command
This project contains an indexer and a [Falcon-based](https://falcon.readthedocs.io/) web application to make the item, community, and collection statistics available via a simple REST API. You can read more about the Solr queries used to gather the item view and download statistics on the [DSpace wiki](https://wiki.lyrasis.org/display/DSPACE/Solr).

dspace_statistics_api/indexer.py

@@ -47,7 +47,7 @@ def index_views(indexType: str, facetField: str):
#
# see: https://lucene.apache.org/solr/guide/6_6/the-stats-component.html
solr_query_params = {
- "q": "type:2",
+ "q": f"type:2 AND {facetField}:/.{{36}}/",
"fq": "-isBot:true AND statistics_type:view",
"fl": facetField,
"facet": "true",
@@ -94,7 +94,7 @@ def index_views(indexType: str, facetField: str):
)
solr_query_params = {
- "q": "type:2",
+ "q": f"type:2 AND {facetField}:/.{{36}}/",
"fq": "-isBot:true AND statistics_type:view",
"fl": facetField,
"facet": "true",
@@ -130,7 +130,7 @@ def index_views(indexType: str, facetField: str):
def index_downloads(indexType: str, facetField: str):
# get the total number of distinct facets for items with at least 1 download
solr_query_params = {
- "q": "type:0",
+ "q": f"type:0 AND {facetField}:/.{{36}}/",
"fq": "-isBot:true AND statistics_type:view AND bundleName:ORIGINAL",
"fl": facetField,
"facet": "true",
@@ -176,7 +176,7 @@ def index_downloads(indexType: str, facetField: str):
)
solr_query_params = {
- "q": "type:0",
+ "q": f"type:0 AND {facetField}:/.{{36}}/",
"fq": "-isBot:true AND statistics_type:view AND bundleName:ORIGINAL",
"fl": facetField,
"facet": "true",
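
The changed `"q"` lines rely on f-string brace escaping, which is easy to misread in a diff. Here is a small sketch of how the string is rendered. The function name `build_uuid_query` and the example field names are assumptions for illustration, not part of `indexer.py`.

```python
def build_uuid_query(doc_type: int, facet_field: str) -> str:
    """Build a Solr "q" value like the ones in the diff (hypothetical helper)."""
    # Doubled braces are literal braces inside an f-string, so {{36}} renders
    # as {36}, while {facet_field} and {doc_type} are substituted as usual.
    return f"type:{doc_type} AND {facet_field}:/.{{36}}/"


# "id" is a placeholder field name chosen for this example.
print(build_uuid_query(2, "id"))  # type:2 AND id:/.{36}/
```

With a single pair of braces, Python would treat `{36}` as a replacement field and render the regular expression as `/.36/`, which is why the braces are doubled.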