Commit Graph

619 Commits

Author SHA1 Message Date
Alan Orth fdccdf7318
Version 0.6.1
continuous-integration/drone/push Build is failing Details
2023-02-23 13:46:56 +03:00
Alan Orth ff2c986eec
setup.py: minimum python 3.9 2023-02-23 11:47:40 +03:00
Alan Orth 547574866e
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2023-02-23 11:46:24 +03:00
Alan Orth 8aa7b93d87
poetry.lock: run poetry update 2023-02-23 11:45:53 +03:00
Alan Orth 53fdb50906
csv_metadata_quality/check.py: run black
continuous-integration/drone/push Build is failing Details
2023-02-18 22:10:04 +03:00
Alan Orth 3e0e9a7f8b
poetry.lock: run poetry update 2023-02-18 22:09:33 +03:00
Alan Orth 03d824b78e
pyproject.toml: update some dependencies 2023-02-18 22:09:05 +03:00
Alan Orth 8bc4cd419c
Strip filename descriptions before checking
continuous-integration/drone/push Build is failing Details
When checking for uncommon file extensions in the filename field
we should strip descriptions that are meant for SAF Bundler, for
example: Annual_Report_2020.pdf__description:Report. This ends up
as a false positive that spams the output with warnings.
2023-02-13 11:00:57 +03:00
Alan Orth bde38e9ed4
CHANGELOG.md: add notes about abstracts 2023-02-13 10:39:03 +03:00
Alan Orth 8db1e36a6d
csv_metadata_quality/app.py: skip abstract in separator check
Also skip abstract in the separator check, since it's rare to have
any "|" here, but more likely that if one is present then it's for
a reason.
2023-02-13 10:37:33 +03:00
Alan Orth fbb625be5c
Ignore common non-SPDX licenses
This is meant to catch licenses that are supposed to be SPDX but
aren't, not licenses that *aren't* supposed to be SPDX. We have so
many free-text license descriptions like "Copyrighted" and "Other"
that I'm sick of seeing warnings for them!
2023-02-07 17:01:56 +03:00
Alan Orth 084b970798
CHANGELOG.md: add note about abstract field 2023-02-07 16:52:34 +03:00
Alan Orth 171b35b015
Add data/abstract-check.csv
A test file with several whitespace and newline scenarios in the
abstract. I am currently disabling whitespace/newline fixes in the
abstract because they are too agressive.
2023-02-07 16:50:47 +03:00
Alan Orth 545bb8cd0c
csv_metadata_quality/app.py: disable whitespace on abstracts
It's too aggressive on abstracts. If people paste in text from a
PDF there are often newlines, and most of the time this is what
they want.
2023-02-07 16:48:40 +03:00
Alan Orth d5afbad788
Update requirements
continuous-integration/drone/push Build is failing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2023-01-24 14:18:19 +03:00
Alan Orth d40c9ed97a
poetry.lock: run poetry update 2023-01-24 14:17:44 +03:00
Alan Orth c4a2ee8563
CHANGELOG.md: add note about fix.separators() 2023-01-24 14:16:23 +03:00
Alan Orth 3596381d03
csv_metadata_quality/app.py: separators fix
Don't run the invalid separators fix on title fields because some
items use "|" in the title to indicate something like a subtitle.

For example:

    Progress Review and Work Planning Meeting | Day 1
2023-01-24 14:13:55 +03:00
Alan Orth 5abd32a41f
CHANGELOG.md: run poetry update 2022-12-20 15:09:58 +02:00
Alan Orth 0ed0fabe21
tests/test_check.py: remove local variables
This was raised by ruff.

> F841 Local variable `result` is assigned to but never used

We don't actually need the output of the function since these tests
capture the stdout.
2022-12-20 15:09:20 +02:00
Alan Orth d5cfec65bd
tests/test_check.py: fix logic in assert
This was raised by ruff.

> E711 Comparison to `None` should be `cond is None`
2022-12-20 15:07:41 +02:00
Alan Orth 66893753ba
Move isort config to pyproject.toml
See: https://pycqa.github.io/isort/docs/configuration/black_compatibility.html
2022-12-20 15:03:10 +02:00
Alan Orth 57be05ebb6
poetry.lock: run poetry update 2022-12-20 14:59:35 +02:00
Alan Orth 8c23382b22
Update requirements
continuous-integration/drone/push Build is failing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-12-13 10:47:16 +03:00
Alan Orth f640161d87
CHANGELOG.md: add notes about SPDX and Python 2022-12-13 10:45:36 +03:00
Alan Orth e81ae93bf0
poetry.lock: run poetry update 2022-12-13 10:44:06 +03:00
Alan Orth 50ea5863dd
.drone.yml: only test on Python 3.9+ 2022-12-13 10:43:18 +03:00
Alan Orth 2dfb073b6b
Update minimum Python version to 3.9
Due to importlib.resources.files. It's a very minor thing and there
are ways to use back-ported third-party modules with this function-
ality, but I'm the only one use this so...

See: https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files
2022-12-13 10:41:32 +03:00
Alan Orth 7cc49b500d
Use licenses.json from SPDX instead of spdx-license-list
spdx-license-list has been deprecated[1] and already has outdated
information compared to recent SPDX data releases. Now I use the
JSON license data directly from SPDX[2] (currently version 3.19).

The JSON file is loaded from the package's data directory using
Python 3's stdlib functions from importlib[3], though we now need
Python 3.9 as a minimum for importlib.resources.files[4].

Also note that the data directory is not properly packaged via
setuptools, so this only works for local installs, and not via
versions published to pypi, for example (I'm currently not doing
this anyways). If I want to publish this in the future I will
need to modify setup.py/pyproject.toml to include the data files.

[1] https://gitlab.com/uniqx/spdx-license-list
[2] https://github.com/spdx/license-list-data/blob/main/json/licenses.json
[3] https://copdips.com/2022/09/adding-data-files-to-python-package-with-setup-py.html
[4] https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files
2022-12-13 10:39:17 +03:00
Alan Orth 051777bcec
Ignore subregion field for missing region checks
continuous-integration/drone/push Build is passing Details
Due to a sloppy regex I was sometimes matching the subregion field
when checking for missing UN M.49 regions in the region field.
2022-12-07 23:18:47 +01:00
Alan Orth 58e956360a
Add tests/test_check.py: fix test
continuous-integration/drone/push Build is passing Details
2022-11-28 22:12:17 +03:00
Alan Orth 3532175748
.drone.yml: install git
continuous-integration/drone/push Build is failing Details
Apparently the slim images don't come with git, which we need for
cloning some dependencies.
2022-11-28 22:05:34 +03:00
Alan Orth a7bc929af8
Update requirements
continuous-integration/drone/push Build is failing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-28 17:42:26 +03:00
Alan Orth 141b2e1da3
csv_metadata_quality/check.py: update region output
Add the country to the message about missing regions. This makes it
easier to see which country is triggering the missing region error,
and helps in case of debugging possible mistakes in the data coming
from the country_converter library.
2022-11-28 17:40:27 +03:00
Alan Orth 7097136b7e
Use my fork of country_converter again
There is an issue with the UN M.49 region for Myanmar.
2022-11-28 17:38:45 +03:00
Alan Orth d134c93663
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-28 17:16:09 +03:00
Alan Orth 9858406894
poetry.lock: run poetry update 2022-11-28 17:15:19 +03:00
Alan Orth b02f1f65ee
pyproject.toml: use upstream country_converter
Version 0.8.0 has the country and UN M.49 region fixes.

See: https://github.com/konstantinstadler/country_converter/releases/tag/v0.8.0
2022-11-28 17:14:16 +03:00
Alan Orth 4d5ef38dde
pyproject.toml: add ipython to dev dependencies 2022-11-28 17:11:18 +03:00
Alan Orth eaa8f31faf
Update requirements
continuous-integration/drone/push Build is failing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-08 10:22:39 +03:00
Alan Orth df57988e5a
Use my fork of pycountry
Until they update to iso-codes 4.12.0.

See: https://github.com/flyingcircusio/pycountry/pull/149
2022-11-08 10:21:28 +03:00
Alan Orth bddf4da559
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-08 10:06:26 +03:00
Alan Orth 15f52f8be8
Switch to my fork of country-converter
Until a few issues are resolved regarding new countries and regions.

See: https://github.com/konstantinstadler/country_converter/pull/122
See: https://github.com/konstantinstadler/country_converter/pull/123
2022-11-08 10:04:31 +03:00
Alan Orth bc909464c7
Update requirements
continuous-integration/drone/push Build is passing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-07 12:14:46 +03:00
Alan Orth 2d46259dfe
poetry.lock: run poetry update 2022-11-07 12:13:44 +03:00
Alan Orth ca82820a8e
pyproject.toml: update dependencies to latest 2022-11-07 12:13:28 +03:00
Alan Orth 86b4e5e182
Update requirements
continuous-integration/drone/push Build is passing Details
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==
2022-11-01 12:21:41 +03:00
Alan Orth e5d5ae7e5d
poetry.lock: run poetry update 2022-11-01 12:20:43 +03:00
Alan Orth 8f3db86a36
CHANGELOG.md: fix header
continuous-integration/drone/push Build is passing Details
2022-10-31 11:43:14 +03:00
Alan Orth b0721b0a7a
.github: use ubuntu-22.04 for actions
continuous-integration/drone/push Build is passing Details
Apparently 'ubuntu-latest' is still 20.04 and today is 2022-10-03,
which seems a bit old!

See: https://github.com/actions/runner-images
2022-10-03 19:49:24 +03:00