csv-metadata-quality

mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-09-05 03:22:36 +02:00

Author	SHA1	Message	Date
Alan Orth	d40c9ed97a	poetry.lock: run poetry update	2023-01-24 14:17:44 +03:00
Alan Orth	c4a2ee8563	CHANGELOG.md: add note about fix.separators()	2023-01-24 14:16:23 +03:00
Alan Orth	3596381d03	csv_metadata_quality/app.py: separators fix Don't run the invalid separators fix on title fields because some items use "|" in the title to indicate something like a subtitle. For example: Progress Review and Work Planning Meeting | Day 1	2023-01-24 14:13:55 +03:00
Alan Orth	5abd32a41f	CHANGELOG.md: run poetry update	2022-12-20 15:09:58 +02:00
Alan Orth	0ed0fabe21	tests/test_check.py: remove local variables This was raised by ruff. > F841 Local variable `result` is assigned to but never used We don't actually need the output of the function since these tests capture the stdout.	2022-12-20 15:09:20 +02:00
Alan Orth	d5cfec65bd	tests/test_check.py: fix logic in assert This was raised by ruff. > E711 Comparison to `None` should be `cond is None`	2022-12-20 15:07:41 +02:00
Alan Orth	66893753ba	Move isort config to pyproject.toml See: https://pycqa.github.io/isort/docs/configuration/black_compatibility.html	2022-12-20 15:03:10 +02:00
Alan Orth	57be05ebb6	poetry.lock: run poetry update	2022-12-20 14:59:35 +02:00
Alan Orth	8c23382b22	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-12-13 10:47:16 +03:00
Alan Orth	f640161d87	CHANGELOG.md: add notes about SPDX and Python	2022-12-13 10:45:36 +03:00
Alan Orth	e81ae93bf0	poetry.lock: run poetry update	2022-12-13 10:44:06 +03:00
Alan Orth	50ea5863dd	.drone.yml: only test on Python 3.9+	2022-12-13 10:43:18 +03:00
Alan Orth	2dfb073b6b	Update minimum Python version to 3.9 Due to importlib.resources.files. It's a very minor thing and there are ways to use back-ported third-party modules with this functionality, but I'm the only one using this so... See: https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files	2022-12-13 10:41:32 +03:00
Alan Orth	7cc49b500d	Use licenses.json from SPDX instead of spdx-license-list spdx-license-list has been deprecated[1] and already has outdated information compared to recent SPDX data releases. Now I use the JSON license data directly from SPDX[2] (currently version 3.19). The JSON file is loaded from the package's data directory using Python 3's stdlib functions from importlib[3], though we now need Python 3.9 as a minimum for importlib.resources.files[4]. Also note that the data directory is not properly packaged via setuptools, so this only works for local installs, and not via versions published to pypi, for example (I'm currently not doing this anyways). If I want to publish this in the future I will need to modify setup.py/pyproject.toml to include the data files. [1] https://gitlab.com/uniqx/spdx-license-list [2] https://github.com/spdx/license-list-data/blob/main/json/licenses.json [3] https://copdips.com/2022/09/adding-data-files-to-python-package-with-setup-py.html [4] https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files	2022-12-13 10:39:17 +03:00
Alan Orth	051777bcec	Ignore subregion field for missing region checks Due to a sloppy regex I was sometimes matching the subregion field when checking for missing UN M.49 regions in the region field.	2022-12-07 23:18:47 +01:00
Alan Orth	58e956360a	Add tests/test_check.py: fix test	2022-11-28 22:12:17 +03:00
Alan Orth	3532175748	.drone.yml: install git Apparently the slim images don't come with git, which we need for cloning some dependencies.	2022-11-28 22:05:34 +03:00
Alan Orth	a7bc929af8	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-28 17:42:26 +03:00
Alan Orth	141b2e1da3	csv_metadata_quality/check.py: update region output Add the country to the message about missing regions. This makes it easier to see which country is triggering the missing region error, and helps in case of debugging possible mistakes in the data coming from the country_converter library.	2022-11-28 17:40:27 +03:00
Alan Orth	7097136b7e	Use my fork of country_converter again There is an issue with the UN M.49 region for Myanmar.	2022-11-28 17:38:45 +03:00
Alan Orth	d134c93663	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-28 17:16:09 +03:00
Alan Orth	9858406894	poetry.lock: run poetry update	2022-11-28 17:15:19 +03:00
Alan Orth	b02f1f65ee	pyproject.toml: use upstream country_converter Version 0.8.0 has the country and UN M.49 region fixes. See: https://github.com/konstantinstadler/country_converter/releases/tag/v0.8.0	2022-11-28 17:14:16 +03:00
Alan Orth	4d5ef38dde	pyproject.toml: add ipython to dev dependencies	2022-11-28 17:11:18 +03:00
Alan Orth	eaa8f31faf	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-08 10:22:39 +03:00
Alan Orth	df57988e5a	Use my fork of pycountry Until they update to iso-codes 4.12.0. See: https://github.com/flyingcircusio/pycountry/pull/149	2022-11-08 10:21:28 +03:00
Alan Orth	bddf4da559	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-08 10:06:26 +03:00
Alan Orth	15f52f8be8	Switch to my fork of country-converter Until a few issues are resolved regarding new countries and regions. See: https://github.com/konstantinstadler/country_converter/pull/122 See: https://github.com/konstantinstadler/country_converter/pull/123	2022-11-08 10:04:31 +03:00
Alan Orth	bc909464c7	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-07 12:14:46 +03:00
Alan Orth	2d46259dfe	poetry.lock: run poetry update	2022-11-07 12:13:44 +03:00
Alan Orth	ca82820a8e	pyproject.toml: update dependencies to latest	2022-11-07 12:13:28 +03:00
Alan Orth	86b4e5e182	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-11-01 12:21:41 +03:00
Alan Orth	e5d5ae7e5d	poetry.lock: run poetry update	2022-11-01 12:20:43 +03:00
Alan Orth	8f3db86a36	CHANGELOG.md: fix header	2022-10-31 11:43:14 +03:00
Alan Orth	b0721b0a7a	.github: use ubuntu-22.04 for actions Apparently 'ubuntu-latest' is still 20.04 and today is 2022-10-03, which seems a bit old! See: https://github.com/actions/runner-images	2022-10-03 19:49:24 +03:00
Alan Orth	4e5faf51bd	.github/workflows: use pip caching See: https://github.com/actions/setup-python/blob/main/docs/advanced-usage.md#caching-packages	2022-10-03 19:39:52 +03:00
Alan Orth	5ea38d65bd	.github/workflows: update actions Update actions to latest versions: - actions/checkout@v3 - actions/setup-python@v4	2022-10-03 19:39:52 +03:00
Alan Orth	58b7b6e9d8	Version 0.6.0 v0.6.0	2022-09-02 16:35:58 +03:00
Alan Orth	ffdf1eca7b	setup.py: remove Python 3.7 support I had already set the minimum to Python 3.8 elsewhere, but forgot to do it here. I am not sure if Python 3.7 will still work here or not so let's just keep it in sync with the other docs.	2022-09-02 16:34:16 +03:00
Alan Orth	59742e47f1	Update requirements Generated with poetry export: $ poetry export --without-hashes -f requirements.txt > requirements.txt $ poetry export --without-hashes --with dev -f requirements.txt > requirements-dev.txt I am trying `--without-hashes` to work around an error on pip install when running in CI: ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==	2022-09-02 16:32:04 +03:00
Alan Orth	9c741b1d49	poetry.lock: sync latest deps	2022-09-02 16:31:19 +03:00
Alan Orth	21e9948a75	pyproject.toml: manually updated all deps Update all deps to their latest versions on pypi.org and remove the explicit dependency on SQLAlchemy.	2022-09-02 16:30:40 +03:00
Alan Orth	f64435fc9d	tests/test_check.py: add missing excludes	2022-09-02 16:24:33 +03:00
Alan Orth	566c2b45cf	Remove Excel support I never used this and it seems xlrd doesn't even support .xlsx anymore anyways. If this was needed I could theoretically use openpyxl but I'd rather just stick to CSV.	2022-09-02 16:14:24 +03:00
Alan Orth	41b813be6e	CHANGELOG.md: add note about exclude logic	2022-09-02 16:03:51 +03:00
Alan Orth	040e56fc76	Improve exclude function When a user explicitly requests that a field be excluded with -x we skip that field in most checks. Up until now that did not include the item-based checks using a transposed dataframe because we don't know the metadata field names (labels) until we iterate over them. Now the excludes are respected for item-based checks.	2022-09-02 15:59:22 +03:00
Alan Orth	1f76247353	csv_metadata_quality/app.py: rework exclude/skip Instead of processing the excludes inside the for column loop we do it once before and then only need to check if the current column is in the list.	2022-09-02 10:35:04 +03:00
Alan Orth	2e489fc921	Add new data/test-geography.csv test file This file has metadata to test different scenarios related to checking and fixing missing regions.	2022-09-01 16:57:29 +03:00
Alan Orth	117c6ca85d	csv_metadata_quality/check.py: missing region fixes Port over the recent fixes and logic improvements to regions from fix.py.	2022-09-01 16:38:35 +03:00
Alan Orth	f49214fa2e	csv_metadata_quality/fix.py: fix bug in regions We need to make sure we're only manipulating the regions if we have any missing. The previous code was always manipulating the existing row, even when there were no missing regions, which resulted in new values like "Eastern Africa||".	2022-09-01 16:15:32 +03:00
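
The regions bug described in f49214fa2e (the last entry above) can be illustrated with a minimal sketch. The function name `add_missing_regions`, its signature, and the default separator argument are hypothetical and not taken from fix.py. The only details that come from the commit message are the "||" separator seen in the bad output and the rule that a row should be left alone when no regions are missing.

```python
def add_missing_regions(existing: str, missing: list[str], sep: str = "||") -> str:
    """Append missing regions to a multi-value region string.

    Hypothetical helper for illustration, not the project's actual code.
    """
    # Guard first: when nothing is missing, return the value unchanged.
    # The commit message notes that manipulating the row unconditionally
    # produced values like "Eastern Africa||".
    if not missing:
        return existing

    # Drop empty fragments so an empty field does not yield a leading separator.
    current = [value for value in existing.split(sep) if value]
    return sep.join(current + missing)
```

With this guard, a row that already has all of its regions passes through untouched, and new regions are appended only when there is something to add.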

504 Commits
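
The exclude rework described in 1f76247353 (process the excludes once before the column loop, then only check membership inside it) might look roughly like the sketch below. The helper names `parse_excludes` and `columns_to_check` are hypothetical, and treating the `-x` value as a comma-separated string is an assumption rather than something stated in the log.

```python
def parse_excludes(exclude_fields):
    """Turn the raw -x value into a list of field names (assumed comma-separated)."""
    if not exclude_fields:
        return []
    return [field.strip() for field in exclude_fields.split(",") if field.strip()]


def columns_to_check(columns, exclude_fields=None):
    """Return the columns that remain after applying the excludes."""
    # Parse the excludes a single time, before iterating over the columns.
    excludes = parse_excludes(exclude_fields)

    checked = []
    for column in columns:
        # Inside the loop we only need a membership test.
        if column in excludes:
            continue
        checked.append(column)
    return checked
```

For example, `columns_to_check(["dc.title", "dc.contributor.author"], "dc.contributor.author")` returns `["dc.title"]`.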