When a user explicitly requests that a field be excluded with -x we
skip that field in most checks. Up until now that did not include
the item-based checks using a transposed dataframe because we don't
know the metadata field names (labels) until we iterate over them.
Now the excludes are respected for item-based checks.
We need to make sure we're only manipulating the regions if we have
any missing. The previous code was always manipulating the existing
row, even when there were no missing regions, which resulted in new
values like "Eastern Africa||".
By default country_converter prints "not found in regex" if a coun-
try is not found. We can silence this by switching the logging lev-
el to something above WARNING.
The country_converter documentation says we should instantiate the
CountryConverter() class once instead of calling coco.convert() in
each iteration of the loop so we don't end up loading the data file
more than once.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
It seems there was another logic error raised by the test in pytest.
With my real data, it was enough to check if the region column was
None, but with my test I was explicitly setting the region to "" (an
empty string). So to be really sure we should check if the string
is not None *and* if its length is greater than 0.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
That's what I use for testing locally. Note that we need to quote
the version here because otherwise GitHub Actions will interpret it
as 3.1 due to how YAML works.
Now that we can drop invalid AGROVOC values we should have a valid
value and an invalid value here. Depending on how the checker is
invoked we will either print a warning or drop the invalid value.
Generated with poetry export:
$ poetry export --without-hashes -f requirements.txt > requirements.txt
$ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
I am trying `--without-hashes` to work around an error on pip install
when running in CI:
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
We actually want to do this after we try to fix mojibake with ftfy.
These "unnecessary" Unicode characters could actually help ftfy in
some cases because often times they indicate that some character
from another encoding was there before (like an accent, dash, or
smart quote).