mirror of https://github.com/ilri/dspace-statistics-api.git synced 2024-11-16 19:37:04 +01:00
Commit Graph

405 Commits

Author SHA1 Message Date
810508d038
dspace_statistics_api/indexer.py: Use -isBot:true
Minor change to bot filtering. We should use a negated match for
documents that have `isBot:true` rather than looking for documents
that are tagged with `isBot:false` (the distinction is subtle, but
important).
2020-11-17 17:40:08 +02:00
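The distinction described in 810508d038 can be sketched in plain Python. A positive match on `isBot:false` only finds documents that carry the field with that value, while the negated `-isBot:true` also keeps documents that were never tagged at all. The documents and helper names below are hypothetical stand-ins for Solr records and filter queries, not code from the indexer:

```python
# Hypothetical Solr-like documents; some records may lack the isBot field.
docs = [
    {"id": "a", "isBot": True},
    {"id": "b", "isBot": False},
    {"id": "c"},  # never tagged
]


def match_is_bot_false(documents):
    """Emulate isBot:false: the field must exist and be false."""
    return [d["id"] for d in documents if d.get("isBot") is False]


def match_not_is_bot_true(documents):
    """Emulate -isBot:true: keep everything not explicitly tagged as a bot."""
    return [d["id"] for d in documents if d.get("isBot") is not True]
```

With these sample documents the first filter silently drops the untagged record, which is the subtle difference the commit message refers to.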
ecafab57cb
README.md: Update DSpace version note 2020-11-16 16:16:21 +02:00
9c9431b58c
CHANGELOG.md: Add unreleased changes 2020-11-02 22:14:18 +02:00
2d6520fc97
Fix limit in docs 2020-11-02 22:14:08 +02:00
79a393d33f
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

The `--without-hashes` is required to work around an issue with
gunicorn pulling in a dependency on setuptools that poetry ignores.

See: https://github.com/python-poetry/poetry/issues/1584
2020-11-02 22:10:29 +02:00
149f6c418f
poetry.lock: Run poetry update 2020-11-02 22:00:29 +02:00
ca1582a8b6
Make sure limit is between 1 and 100
We were not properly checking whether the limit was greater than 0
in all cases.
2020-11-02 21:59:20 +02:00
1904c243a4
Revert ".travis.yml: Use Ubuntu 20.04 "Focal" environment"
This reverts commit 0baa07f70a.

Focal only has PostgreSQL 12 installed, and we are not quite there
yet (our production has 9.6, testing has 10).
2020-10-29 00:11:28 +03:00
0baa07f70a
.travis.yml: Use Ubuntu 20.04 "Focal" environment 2020-10-29 00:04:47 +03:00
59214ffcb6
.travis.yml: Bump Python versions
Test Python 3.9 now that it has been released, and allow tests to fail
on nightly builds.
2020-10-29 00:03:58 +03:00
549b8bf1a7
dspace_statistics_api/docs/index.html: Fix version
We need to print it in the body, not the title.
2020-10-06 22:22:11 +03:00
899a79b2e7
Version 1.3.1 2020-10-06 22:15:52 +03:00
4c59469055
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

The `--without-hashes` is required to work around an issue with
gunicorn pulling in a dependency on setuptools that poetry ignores.

See: https://github.com/python-poetry/poetry/issues/1584
2020-10-06 22:11:54 +03:00
4e9064329d
Bump version to 1.3.0 2020-10-06 21:33:38 +03:00
4958d5d2e9
pyproject.toml: Fix email 2020-10-06 18:46:31 +03:00
923ed0a434
tests/test_api.py: Add tests for /items POST handlers
This adds tests for the new /items POST handler, both with mocked
data and a live connection to a Solr statistics core. Tests that
only work when Solr is available are marked with XFAIL so that they
don't turn the whole test suite red.

In each test I try to assert as many parameters as we can know for
each response so that we cover all expectations. For example, when
we test a valid limit parameter we should test whether the response
not only has the same limit parameter, but that the number of items
has actually been limited and the number of pages has been adjusted
accordingly.

See: https://docs.pytest.org/en/stable/skipping.html
2020-10-06 16:51:53 +03:00
5acd927210
dspace_statistics_api: Sort imports with isort 2020-10-06 15:12:13 +03:00
630fa0d5fb
dspace_statistics_api/util.py: Fix f-strings
flake8 raised this warning:

    F541 f-string is missing placeholders
2020-10-06 15:11:12 +03:00
58d2b8d4ed
dspace_statistics_api/items.py: Move util import
Move util import from global scope because it causes tests to fail.
We don't need to set up the Solr connection unless we're actually
trying to use the get_views and get_downloads methods, either when
running the API in production or during tests where the connection
has been set up.
2020-10-06 15:07:00 +03:00
e6572d9469
.build.yml: Use poetry instead of pipenv 2020-10-05 22:37:42 +03:00
85fca81611
Update requirements
Generated with poetry export:

    $ poetry export -f requirements.txt > requirements.txt
    $ poetry export --dev -f requirements.txt > requirements-dev.txt
2020-10-05 22:33:27 +03:00
c81a8d03d7
Remove pipenv configuration
I was having issues with re-creating an environment from scratch:

    ModuleNotFoundError: No module named 'virtualenv.seed.via_app_data'

Switching to Poetry for now.
2020-10-05 22:32:01 +03:00
2923a3b325
Add Poetry configuration
I was having some problems with pipenv when trying to install a
clean environment:

    ModuleNotFoundError: No module named 'virtualenv.seed.via_app_data'
2020-10-05 22:30:35 +03:00
d4518d62ad
dspace_statistics_api/app.py: Refactor for testability
I thought it was clever to only import these in the on_post handler
because they aren't needed elsewhere, but it turns out that this is
not a common pattern and even causes problems with testability.

First, if the imports are at the top of the file as PEP8 recommends,
then the WSGI server will import them once when it loads the app and
they remain in memory for the lifecycle of the app. If the imports
are in the on_post handler they would be re-imported on every request!

Second, this pattern of importing in a method makes it tricky to use
object patching in mocks.

See: https://www.python.org/dev/peps/pep-0008/#imports
2020-10-05 20:43:50 +03:00
3a98de78e3
dspace_statistics_api/items.py: Remove executable bit
We don't need to execute this on the command line.
2020-10-05 14:33:36 +03:00
b26439daf3
CHANGELOG.md: Add note about Python dependencies 2020-09-27 11:16:37 +03:00
9e898ba54f
Update requirements
Generated from pipenv with:

  $ pipenv lock -r > requirements.txt
  $ pipenv lock -r -d > requirements-dev.txt
2020-09-27 11:13:38 +03:00
716d65030c
Pipfile.lock: Run pipenv update 2020-09-27 11:13:04 +03:00
5a53b57b3b
Refactor /items POST handler to use a before hook
This allows us to do the dirty work of parsing, validating, and
setting local variables from the POST parameters outside of the
on_post function. We then share the parameters via the req.context
object. Functionally it is the same, but readability is better
and it's a neat trick that I could use elsewhere.

See: https://falcon.readthedocs.io/en/stable/user/faq.html#how-can-i-pass-data-from-a-hook-to-a-responder-and-between-hooks
2020-09-26 18:40:52 +03:00
3ceb9a6eb0
dspace_statistics_api/items.py: Fix flake8 warning
According to flake8 we need to use a different syntax for strings
with backslash escape sequences:

> As of Python 3.6, a backslash-character pair that is not a valid
> escape sequence now generates a DeprecationWarning. This will
> eventually become a SyntaxError.

The warning was:

    W605 invalid escape sequence '\-'

See: https://www.flake8rules.com/rules/W605.html
2020-09-26 12:22:06 +03:00
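The usual fix for W605, as in 3ceb9a6eb0, is a raw string so that the backslash reaches the regex engine unchanged. The pattern below is a hypothetical example of a UUID-like check, not the expression used in items.py:

```python
import re

# Writing "\-" in a normal string literal triggers W605 because it is
# not a valid Python escape sequence. The r prefix makes it a raw
# string, so the regex engine sees the backslash as written.
UUID_LIKE = re.compile(r"^[0-9a-f\-]{36}$")


def looks_like_uuid(value):
    """Return True if value is 36 characters of lowercase hex and dashes."""
    return UUID_LIKE.match(value) is not None
```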
946f0749e2
dspace_statistics_api/app.py: Use bounded_stream in on_post
For reasons I don't quite understand, we need to use bounded_stream
in the on_post request handler in order to use simulate_post() with
the testing client in Falcon 2.0.0. Normal runtime operation via
gunicorn does not have any issues with stream.

See: https://github.com/falconry/falcon/issues/1720
See: https://github.com/falconry/falcon/issues/1554
2020-09-26 11:50:57 +03:00
b06651d1ec
dspace_statistics_api/indexer.py: Fix Python comment 2020-09-25 13:35:05 +03:00
a0ee181361
dspace_statistics_api/docs/index.html: Fix whitespace 2020-09-25 13:33:45 +03:00
f58c209609
dspace_statistics_api/indexer.py: Update comment
I don't remember why we needed the stats, but it seems that it was
because without them there is no way to know how many results were
returned and therefore no way to know how many pages we'll need to
iterate over. Having the total number allows us to use a limit and
offset to page through them deterministically.
2020-09-25 13:25:34 +03:00
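The paging arithmetic that f58c209609 alludes to can be sketched like this. The names `num_found` and `limit` are assumptions for illustration, and the indexer's actual loop may differ:

```python
import math


def page_offsets(num_found, limit):
    """Yield the offset of each page needed to cover num_found results."""
    pages = math.ceil(num_found / limit)
    for page in range(pages):
        yield page * limit
```

For example, 250 results with a limit of 100 need three requests, at offsets 0, 100, and 200. Without the total there is no way to compute the page count up front.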
6dbff1e78f
README.md: Capitalize UUID 2020-09-25 13:03:15 +03:00
731226ec15
README.md: Update for POST /items functionality 2020-09-25 13:01:09 +03:00
2201d3df4e
dspace_statistics_api/docs/index.html: Minor HTML syntax issue 2020-09-25 12:55:39 +03:00
2c0436f845
Update API docs HTML for /items POST functionality 2020-09-25 12:53:30 +03:00
f1e939481b
dspace_statistics_api/items.py: Remove shebang
This was originally a standalone script I was testing interactively.
2020-09-25 12:39:00 +03:00
4d8026a3d0
Add missing dspace_statistics_api/items.py
This was meant to be added with the new /items POST changes.
2020-09-25 12:30:06 +03:00
de1f462ad2
CHANGELOG.md: Add more notes about /items 2020-09-25 12:29:19 +03:00
8a6bbfd527
CHANGELOG.md: Add note about /items 2020-09-25 12:27:33 +03:00
73c71fa8a0
dspace_statistics_api: Add support for date ranges to /items
You can now POST a JSON request to /items with a list of items and
a date range. This makes it possible to get view and download
statistics for arbitrary items and arbitrary date ranges.

The JSON request should be in the following format:

    {
        "limit": 100,
        "page": 0,
        "dateFrom": "2020-01-01T00:00:00Z",
        "dateTo": "2020-09-09T00:00:00Z",
        "items": [
            "f44cf173-2344-4eb2-8f00-ee55df32c76f",
            "2324aa41-e9de-4a2b-bc36-16241464683e",
            "8542f9da-9ce1-4614-abf4-f2e3fdb4b305",
            "0fe573e7-042a-4240-a4d9-753b61233908"
        ]
    }

The limit, page, and date parameters are all optional. By default
it will use a limit of 100, page 0, and a [* TO *] Solr date range.
2020-09-25 12:21:11 +03:00
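A client for the request format in 73c71fa8a0 could be sketched with the standard library alone. The helper name and the base URL are assumptions; only the JSON field names and the defaults of limit 100 and page 0 come from the commit message:

```python
import json
import urllib.request


def build_items_request(base_url, items, limit=100, page=0,
                        date_from=None, date_to=None):
    """Build (but do not send) a POST request for the /items endpoint."""
    payload = {"limit": limit, "page": page, "items": list(items)}
    # The date parameters are optional, so only include them when given.
    if date_from is not None:
        payload["dateFrom"] = date_from
    if date_to is not None:
        payload["dateTo"] = date_to
    return urllib.request.Request(
        f"{base_url}/items",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` would send it, assuming an instance of the API is reachable at the given base URL.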
7a5e14716d
CHANGELOG.md: Add note about indexer refactoring 2020-09-24 12:08:21 +03:00
21b500b4f7
dspace_statistics_api/util.py: Use docstring for get_statistics_shards
It seems better to use a docstring instead of a comment because it
can potentially be used by IDEs or documentation generators.
2020-09-24 12:07:31 +03:00
495386856b
Refactor indexer
Move the get_statistics_shards() method to a utility module so it
can be used by other things.
2020-09-24 12:03:12 +03:00
8e87f80e9a
dspace_statistics_api/indexer.py: Remove duplicate solr_url variable
This is declared twice and it never changes.
2020-09-24 11:54:31 +03:00
c4bf8bf698
README.md: Add TODO note about sorting by views or downloads 2020-09-24 11:53:23 +03:00
6ff95bb5f2
dspace_statistics_api/indexer.py: Remove SolrClient reference
We stopped using SolrClient in favor of vanilla requests.
2020-09-24 11:30:31 +03:00
0c8fb21f80
README.md: Update DSpace wiki URLs 2020-04-13 15:25:17 +03:00