Minor change to bot filtering. We should use a negated match for
documents that have `isBot:true` rather than looking for documents
that are tagged with `isBot:false` (the distinction is subtle, but
important).
The basic logic is similar to items, where you can request single
item statistics with a UUID, all item statistics, and item statis-
tics for a list of items (optionally with a date range). Most of
the item code was re-purposed to work on "elements", which can be
items, communities, or collections depending on the request, with
the use of Falcon's `before` hooks to set the statistics scope so
we know how to behave for the current request.
Other than the minor difference in facet fields, another issue I
had with communities and collections is that the owningComm and
owningColl fields are multi-valued (unlike items' id field). This
means that, when you facet the results of your query, Solr returns
ids that seem unrelated, but are actually present in the field, so
I had to make sure I checked all returned ids to see if they were
in the user's POSTed elements list.
TODO:
- Add tests
- Revise docstrings
- Refactor items.py as it is now generic
The logic to get views and downloads is very similar to that used
for items, but we facet by different fields. This uses a generic
function for indexing that takes an "indexType" and a "facetField"
parameter. The indexType parameter controls which database table
to insert into, and the facetField parameter indicates which field
to facet by in Solr.
A few months ago I had an issue setting up mocking because I was
trying to be clever importing these libraries only when I needed
them rather than at the global scope. Someone pointed out to me
that if the imports are at the top of the file Falcon will load
them once when the WSGI server starts, whereas if they are in the
on_get() or on_post() they will load for every request! Also, it
seems that PEP8 recommends keeping imports at the top of the file
anyways, so I will just do that.
Imports sorted with isort.
See: https://www.python.org/dev/peps/pep-0008/#imports
When indexing item views and downloads the only field we need is the
the id. The `fl` parameter tells Solr which fields to return in the
search results. This should theoretically be more efficient, though
I don't have any time to figure out how to measure it right now.
It appears to be needed to compile typed-ast:
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Iast27/Include -I/usr/local/include/python3.9 -c ast27/Custom/typed_ast.c -o build/temp.linux-x86_64-3.9/ast27/Custom/typed_ast.o
error: command 'gcc' failed: No such file or directory
----------------------------------------
ERROR: Failed building wheel for typed-ast
It is missing a six dependency which causes the build to fail. I
could simply add six to the virtualenv but it feels dirty. I don't
actually *need* pytest-clarity for anything so I'll just remove it.
See: https://github.com/darrenburns/pytest-clarity/issues/14
Uses multiple pipelines to test several versions of Python. A few
things to note:
- I use the -slim Python packages, which are smaller and yet still
have no problem installing psycopg2-binary with pip
- I have to start a PostgreSQL database service for each pipeline
separately
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
The `--without-hashes` is required to work around an issue with
gunicorn pulling in a dependency on setuptools that poetry ignores.
See: https://github.com/python-poetry/poetry/issues/1584
Minor change to bot filtering. We should use a negated match for
documents that have `isBot:true` rather than looking for documents
that are tagged with `isBot:false` (the distinction is subtle, but
important).
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
The `--without-hashes` is required to work around an issue with
gunicorn pulling in a dependency on setuptools that poetry ignores.
See: https://github.com/python-poetry/poetry/issues/1584