Fix the incorrect type field regex, and improve the title regex to
consider dcterms.title and dc.title (along with the DSpace language
variants like dc.title[en_US]), but ignore dc.title.alternative.
See: https://regex101.com/r/I4m06F/1
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
SQLAlchemy gets pulled in by csvkit's agate-sql dependency and there
is currently an issue with Poetry's parsing of the SQLAlchemy 1.4.23
constraints. Temporarily explicitly install a version of SQLAlchemy
that works (can remove later once Poetry fixes this). Anyways, I am
not using any SQLAlchemy features that I know of.
See: https://github.com/python-poetry/poetry/issues/4402
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
These seem to have much newer versions that didn't get updated in
this project due to the version pinning selector I was using with
poetry.
In the case of pytest-clarity the previous version was 0.3.1 and
the version selector was a caret (^), which will never update the
left-most (major) number. Now they seem to be on 1.x.x so it will
be OK in the future.
In the case of black, they use weird numbering so it's anyone's
guess how this will work! Luckily it's only used for linting and
formatting.
Apparently we were stuck on an older version of requests-cache due
to the fact that we were using the caret, which will never update
the left-most (major) version. Upstream requests-cache is currently
version 0.6.4, and there seems to have been some changes to the API.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have
their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have
their versions pinned with ==.
This detects whether text has likely been encoded in one encoding
and decoded in another, perhaps multiple times. This often results
in display of "mojibake" characters.
For example, a file encoded in UTF-8 is opened as CP-1252 (Windows
Latin codepage) in Microsoft Excel, and saved again as UTF-8. You
will see strings like this in the resulting file:
- CIAT Publicaçao
- CIAT Publicación
The correct version of these in UTF-8 would be:
- CIAT Publicaçao
- CIAT Publicación
I use a code snippet from Martijn Pieters on StackOverflow to de-
tect whether a string is "weird" as determined by the excellent
"fixes text for you" (ftfy) Python library, then check if a weird
string encodes as CP-1252 or not. If so, I can try to fix it.
See: https://stackoverflow.com/questions/29071995/identify-garbage-unicode-string-using-python
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have
their versions pinned with ==.
PEP8 recommends keeping imports at the top of the file. Also, I had
to re-work the issn/isbn so they didn't conflict with the functions
in check.py (flake8 warned about them being redefined).
Imports sorted with isort.
See: https://www.python.org/dev/peps/pep-0008/#imports
By using df[column] = df[column].apply(check...) we were re-writing
the DataFrame every time we returned from a check. We don't actuall
y need to return a value at all, as the point of checks is to print
a warning to the screen. In Python a "return" statement without a v
ariable returns None.
I haven't measured the impact of this, but I assume it will mean we
are faster and use less memory.