Apparently we were stuck on an older version of requests-cache because
we were using the caret, which will never update the left-most (major)
version. Upstream requests-cache is currently version 0.6.4, and there
seem to have been some changes to its API.
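As a rough illustration of the caret semantics (the ^0.5.1 pin below is
an assumption, since I don't have the old constraint in front of me),
Poetry expands ^0.5.1 to >=0.5.1,<0.6.0, so a 0.6.x release is never
picked up:

from packaging.specifiers import SpecifierSet

# Poetry expands a 0.x caret constraint like ^0.5.1 to >=0.5.1,<0.6.0,
# so the upper bound never crosses the left-most non-zero component
caret = SpecifierSet(">=0.5.1,<0.6.0")

print("0.5.2" in caret)  # True: patch releases are still picked up
print("0.6.4" in caret)  # False: the current upstream release is excluded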
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
This detects whether text has likely been encoded in one encoding
and decoded in another, perhaps multiple times. This often results
in display of "mojibake" characters.
For example, a file encoded in UTF-8 is opened as CP-1252 (Windows
Latin codepage) in Microsoft Excel, and saved again as UTF-8. You
will see strings like this in the resulting file:
- CIAT PublicaÃ§ao
- CIAT PublicaciÃ³n
The correct version of these in UTF-8 would be:
- CIAT Publicaçao
- CIAT Publicación
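To make that round trip concrete, here is a minimal Python sketch using
the sample string above (ftfy's fix_text reverses the damage):

import ftfy

s = "CIAT Publicación"
mangled = s.encode("utf-8").decode("cp1252")
print(mangled)  # CIAT PublicaciÃ³n
print(ftfy.fix_text(mangled))  # CIAT Publicación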
I use a code snippet from Martijn Pieters on StackOverflow to detect
whether a string is "weird" as determined by the excellent "fixes text
for you" (ftfy) Python library, then check if a weird string encodes
as CP-1252 or not. If so, I can try to fix it.
See: https://stackoverflow.com/questions/29071995/identify-garbage-unicode-string-using-python
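The detection looks roughly like this sketch, based on that answer and
ftfy 5.x's sequence_weirdness (which was removed in ftfy 6.0); the
sample strings are the ones above:

import ftfy.bad_codecs  # importing this registers the "sloppy-" codecs
from ftfy.badness import sequence_weirdness


def is_mojibake(field):
    if not sequence_weirdness(field):
        # Nothing weird, should be okay
        return False
    try:
        field.encode("sloppy-windows-1252")
    except UnicodeEncodeError:
        # Not CP-1252 encodable, so not the UTF-8/CP-1252 mix-up
        return False
    # Weird and CP-1252 encodable: likely mojibake
    return True


print(is_mojibake("CIAT PublicaciÃ³n"))  # True
print(is_mojibake("CIAT Publicación"))  # False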
PEP8 recommends keeping imports at the top of the file. Also, I had to
re-work the issn/isbn imports so they didn't conflict with the
same-named functions in check.py (flake8 warned about them being
redefined).
Imports sorted with isort.
See: https://www.python.org/dev/peps/pep-0008/#imports
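Assuming the checks validate identifiers with python-stdnum (which is
my guess at what caused the conflict), the rename might look something
like this:

from stdnum import isbn as stdnum_isbn  # avoid shadowing isbn() below
from stdnum import issn as stdnum_issn  # avoid shadowing issn() below


def isbn(field):
    if field and not stdnum_isbn.is_valid(field):
        print(f"Invalid ISBN: {field}")


def issn(field):
    if field and not stdnum_issn.is_valid(field):
        print(f"Invalid ISSN: {field}")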
By using df[column] = df[column].apply(check...) we were re-writing
the DataFrame every time we returned from a check. We don't actually
need to return a value at all, as the point of checks is to print a
warning to the screen. In Python a "return" statement without a value
returns None.
I haven't measured the impact of this, but I assume it will mean we
are faster and use less memory.
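For example (check_issn and the sample values here are made up for
illustration):

import pandas as pd


def check_issn(field):
    # Hypothetical check: print a warning and return nothing (None)
    if field and "-" not in str(field):
        print(f"Invalid ISSN: {field}")


df = pd.DataFrame({"issn": ["2076-0752", "bogus"]})

# Before: the returned column was assigned back, re-writing the DataFrame
# df["issn"] = df["issn"].apply(check_issn)

# After: run the check only for its side effect (the warning)
df["issn"].apply(check_issn)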
Allow overriding the directory for the requests cache. In the case
of csv-metadata-quality-web, which currently runs on Google's App
Engine, we can only write to /tmp.
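Something like this sketch, where REQUESTS_CACHE_DIR is a hypothetical
name for the environment variable:

import os

import requests_cache

# Default to the current directory; on App Engine set REQUESTS_CACHE_DIR=/tmp
cache_dir = os.environ.get("REQUESTS_CACHE_DIR", ".")
requests_cache.install_cache(os.path.join(cache_dir, "requests-cache"))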
This is no longer classified as "unsafe" as I have yet to see a case
where this was intentional, and it always causes issues when you
import the data in a DSpace repository.
I now use this version in my development environment. Eventually I
should add a matrix of versions to use, but I don't know the GitHub
Actions syntax well enough yet.