mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-14 01:57:03 +01:00
Commit Graph

69 Commits

Author SHA1 Message Date
c1f630c298
Bump dependencies
All tests passing...
2024-06-18 22:17:38 +03:00
82b056f0ea
Use py3langid v0.3.0 2024-06-18 21:51:32 +03:00
f6c6c94a1e
pyproject.toml: use ~= for dependencies
The ~= (compatible release) specifiers are the closest thing to
semantic versioning in Python that I can find with PEP 621 syntax.
For example:

> ~=3.1: version 3.1 or later, but not version 4.0 or later.
> ~=3.1.2: version 3.1.2 or later, but not version 3.2.0 or later.

For most cases I want to bump the minor and micro / patch.

See: https://packaging.python.org/en/latest/specifications/version-specifiers/#examples
2024-05-23 10:01:46 +03:00
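
The two quoted `~=` examples can be checked with a small sketch. This is a simplified illustration, not the resolver that pip or any packaging tool uses: it assumes purely numeric versions (no pre-releases, epochs, or local versions), and the helper names `parse` and `compatible` are hypothetical.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a purely numeric version such as "3.1.2" into (3, 1, 2)."""
    return tuple(int(part) for part in version.split("."))


def compatible(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec`.

    Simplified assumption: numeric release segments only.
    """
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments")
    cand = parse(candidate)
    # Pad with zeros so that (3, 1) and (3, 1, 0) compare as equal.
    width = max(len(base), len(cand))
    padded_base = base + (0,) * (width - len(base))
    padded_cand = cand + (0,) * (width - len(cand))
    # Lower bound: the candidate must not be older than the spec.
    if padded_cand < padded_base:
        return False
    # Upper bound: every segment of the spec except the last must match.
    prefix = base[:-1]
    return padded_cand[: len(prefix)] == prefix


# ~=3.1 accepts any 3.x from 3.1 upward; ~=3.1.2 accepts only 3.1.x from 3.1.2 upward.
print(compatible("3.9.4", "3.1"), compatible("4.0", "3.1"))
print(compatible("3.1.9", "3.1.2"), compatible("3.2.0", "3.1.2"))
```

Dropping the last segment of the spec to form the required prefix is why `~=3.1` allows minor and micro bumps while `~=3.1.2` allows only micro bumps.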
f500fac64b
pyproject.toml: remove scalene from dev deps 2024-05-23 10:00:01 +03:00
8143a7d978
pyproject.toml: align better with PEP 621
This PEP was approved years ago and has become a standard for the
way pyproject.toml file is laid out. We need to make some changes
to the license, URLs, add classifiers, etc.

See: https://peps.python.org/pep-0621/
2024-05-23 09:44:16 +03:00
94cec080d6
pyproject.toml: remove Hatch direct-references
Apparently I copied this from somewhere but it's not needed in this
project because we are not using direct dependency references (which
seem to be local packages).
2024-05-23 09:43:08 +03:00
9402af1e30
pyproject.toml: add comment about packages
Important for Hatch.
2024-05-23 09:42:11 +03:00
d71ff9082b
pyproject.toml: add comment about backend 2024-05-23 09:41:08 +03:00
4d879f6d13
pyproject.toml: remove black
rye bundles ruff so we can use that instead via `rye fmt`.
2024-05-22 23:19:20 +03:00
a30fefcd52
pyproject.toml: update formatting
rye requires a slightly different formatting.
2024-05-22 23:19:14 +03:00
renovate[bot]
df040b70c7
chore(deps): update dependency flake8 to v7
All checks were successful
continuous-integration/drone/push Build is passing
2024-01-05 00:58:28 +00:00
a21ffb0fa8
Use py3langid instead of langid
Faster and more modern code for Python 3 as a drop-in replacement.

See: https://adrien.barbaresi.eu/blog/language-detection-langid-py-faster.html
2023-12-28 14:11:21 +03:00
59b3b307c9
pyproject.toml: use official pycountry
The project is moving again and has all the latest data from the
iso-codes project.
2023-12-09 12:04:14 +03:00
80c3f5b45a
Add fixit to dev dependencies 2023-11-22 21:54:09 +03:00
renovate[bot]
58d4de973e
Update dependency country-converter to ~1.1.0
Some checks failed
continuous-integration/drone/push Build is failing
2023-11-20 18:37:44 +00:00
4ed2786703
pyproject.toml: update pycountry
Use the latest branch in my fork that has iso-codes 4.15.0.
2023-10-15 21:53:09 +03:00
renovate[bot]
3632ae0fc9
Update dependency requests-cache to v1
All checks were successful
continuous-integration/drone/push Build is passing
2023-05-29 19:25:58 +00:00
bc470a4343
pyproject.toml: rework pandas and pyarrow
We don't explicitly depend on PyArrow. It should come as a pandas
extra. I installed it like this:

    $ poetry add pandas=="^2.0.2[feather,performance]"

See: https://pandas.pydata.org/docs/getting_started/install.html#other-data-sources
2023-05-29 22:24:04 +03:00
2e55b4d6e3
pyproject.toml: add pyarrow explicitly
CI was failing because pyarrow is not an extra provided by pandas.
Indeed, according to the docs the named extras installing pyarrow
are actually feather and parquet, so we need to install pyarrow
explicitly.

See: https://pandas.pydata.org/pandas-docs/version/2.0/getting_started/install.html#install-dependencies
2023-04-05 12:49:40 +03:00
c90aad29f0
Use poetry dev group
This is the new syntax since Poetry 1.2.0.

See: https://python-poetry.org/docs/managing-dependencies/#installing-group-dependencies
2023-04-05 12:37:03 +03:00
6fd1e1377f
Add pyarrow extra to Python Pandas deps 2023-04-05 11:40:22 +03:00
b5106de9df
pyproject.toml: Pandas 2.0.0 2023-04-05 11:15:40 +03:00
d4aed378cf
Switch to pandas 2.0.0rc1
Seems to work fine with the new PyArrow datatypes.
2023-03-22 12:16:56 +03:00
ede37569f1
pyproject.toml: use pycountry with iso-codes 4.13.0 2023-03-04 07:33:48 +03:00
4776154d6c
pyproject.toml: switch back to upstream country_converter
Version 1.0.0 incorporates my change to Myanmar.

See: https://github.com/IndEcol/country_converter/releases/tag/v1.0.0
2023-03-04 06:52:56 +03:00
fdccdf7318
Version 0.6.1
Some checks failed
continuous-integration/drone/push Build is failing
2023-02-23 13:46:56 +03:00
03d824b78e
pyproject.toml: update some dependencies 2023-02-18 22:09:05 +03:00
66893753ba
Move isort config to pyproject.toml
See: https://pycqa.github.io/isort/docs/configuration/black_compatibility.html
2022-12-20 15:03:10 +02:00
2dfb073b6b
Update minimum Python version to 3.9
Due to importlib.resources.files. It's a very minor thing and there
are ways to use back-ported third-party modules with this
functionality, but I'm the only one using this so...

See: https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files
2022-12-13 10:41:32 +03:00
7cc49b500d
Use licenses.json from SPDX instead of spdx-license-list
spdx-license-list has been deprecated[1] and already has outdated
information compared to recent SPDX data releases. Now I use the
JSON license data directly from SPDX[2] (currently version 3.19).

The JSON file is loaded from the package's data directory using
Python 3's stdlib functions from importlib[3], though we now need
Python 3.9 as a minimum for importlib.resources.files[4].

Also note that the data directory is not properly packaged via
setuptools, so this only works for local installs, and not via
versions published to pypi, for example (I'm currently not doing
this anyways). If I want to publish this in the future I will
need to modify setup.py/pyproject.toml to include the data files.

[1] https://gitlab.com/uniqx/spdx-license-list
[2] https://github.com/spdx/license-list-data/blob/main/json/licenses.json
[3] https://copdips.com/2022/09/adding-data-files-to-python-package-with-setup-py.html
[4] https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files
2022-12-13 10:39:17 +03:00
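
A minimal sketch of loading a JSON data file through `importlib.resources.files` (Python 3.9+), as the commit above describes. The package name `demo_pkg`, the `data/licenses.json` path, and the two-entry sample are assumptions made for illustration; the sketch builds a throwaway package in a temporary directory so it can run on its own, and it is not the project's actual loading code.

```python
import importlib
import json
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

# Hypothetical two-entry excerpt in the shape of SPDX's licenses.json.
SAMPLE = {
    "licenseListVersion": "3.19",
    "licenses": [
        {"licenseId": "MIT", "name": "MIT License"},
        {"licenseId": "GPL-3.0-only", "name": "GNU General Public License v3.0 only"},
    ],
}


def build_demo_package(root: Path, name: str = "demo_pkg") -> None:
    """Create a throwaway package containing data/licenses.json."""
    data_dir = root / name / "data"
    data_dir.mkdir(parents=True)
    (root / name / "__init__.py").write_text("", encoding="utf-8")
    (data_dir / "licenses.json").write_text(json.dumps(SAMPLE), encoding="utf-8")


def load_license_ids(package: str) -> set[str]:
    """Read data/licenses.json from an importable package and return its IDs."""
    resource = files(package) / "data" / "licenses.json"
    licenses = json.loads(resource.read_text(encoding="utf-8"))
    return {entry["licenseId"] for entry in licenses["licenses"]}


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the throwaway package importable
    importlib.invalidate_caches()
    try:
        license_ids = load_license_ids("demo_pkg")
    finally:
        sys.path.remove(tmp)

print(sorted(license_ids))
```

Resolving the file relative to the package rather than the current working directory is what lets the lookup work regardless of where the tool is invoked, provided the data directory is actually shipped with the package.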
7097136b7e
Use my fork of country_converter again
There is an issue with the UN M.49 region for Myanmar.
2022-11-28 17:38:45 +03:00
b02f1f65ee
pyproject.toml: use upstream country_converter
Version 0.8.0 has the country and UN M.49 region fixes.

See: https://github.com/konstantinstadler/country_converter/releases/tag/v0.8.0
2022-11-28 17:14:16 +03:00
4d5ef38dde
pyproject.toml: add ipython to dev dependencies 2022-11-28 17:11:18 +03:00
df57988e5a
Use my fork of pycountry
Until they update to iso-codes 4.12.0.

See: https://github.com/flyingcircusio/pycountry/pull/149
2022-11-08 10:21:28 +03:00
15f52f8be8
Switch to my fork of country-converter
Until a few issues are resolved regarding new countries and regions.

See: https://github.com/konstantinstadler/country_converter/pull/122
See: https://github.com/konstantinstadler/country_converter/pull/123
2022-11-08 10:04:31 +03:00
ca82820a8e
pyproject.toml: update dependencies to latest 2022-11-07 12:13:28 +03:00
58b7b6e9d8
Version 0.6.0
All checks were successful
continuous-integration/drone/push Build is passing
2022-09-02 16:35:58 +03:00
21e9948a75
pyproject.toml: manually updated all deps
Update all deps to their latest versions on pypi.org and remove the
explicit dependency on SQLAlchemy.
2022-09-02 16:30:40 +03:00
566c2b45cf
Remove Excel support
I never used this and it seems xlrd doesn't even support .xlsx
anymore anyways. If this was needed I could theoretically use openpyxl
but I'd rather just stick to CSV.
2022-09-02 16:14:24 +03:00
b0d46cd864
pyproject.toml: update black
It's no longer in beta!
2022-01-30 13:22:47 +03:00
3ee9319d84
pyproject.toml: bump flake8 2022-01-30 13:21:09 +03:00
4d5f4b5abb
pyproject.toml: update pycountry
This jumps a few major versions, from 19.x.x to 21.x.x. All tests
are passing in pytest so it's probably fine.
2022-01-30 13:15:38 +03:00
98d38801fa
pyproject.toml: update requests and requests-cache 2022-01-30 13:11:01 +03:00
e94a4539bf
pyproject.toml: bump Pandas to v1.4.0
As of Pandas v1.4.0 the minimum Python version is 3.8.

See: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html
2022-01-30 13:03:56 +03:00
d9e427a80e
pyproject.toml: don't install ipython
It always complains about running in a virtual environment anyways,
and I can use the one from the OS instead.
2022-01-29 16:25:58 +03:00
8b15154285
pyproject.toml: use ftfy 6.0
Lots of improvements here! Improvements to heuristics and a new way
to configure which fixes get applied.

See: https://github.com/rspeer/python-ftfy/blob/master/CHANGELOG.md#version-60-april-2-2021
2021-12-15 21:48:56 +02:00
9905e183ea
Bump version to 0.6.0-dev 2021-12-09 23:21:30 +02:00
cc34db7ff8
Version 0.5.0
All checks were successful
continuous-integration/drone/push Build is passing
2021-12-08 15:29:46 +02:00
ccc2a73456
Add check for countries without matching regions
If we have country "Kenya" we should have region "Eastern Africa"
according to the UN M.49 geolocation scheme.
2021-12-08 15:02:20 +02:00
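
The country and region check described above could look roughly like the following sketch. The `M49_REGION` table is a hypothetical three-entry excerpt and `missing_regions` is an illustrative name. The commit log shows the project depends on country_converter, which may be what the real check uses, but that is an assumption here.

```python
# Hypothetical excerpt of UN M.49 country-to-region pairs; a real check
# would need the full table rather than this hard-coded subset.
M49_REGION = {
    "Kenya": "Eastern Africa",
    "Peru": "South America",
    "Viet Nam": "South-eastern Asia",
}


def missing_regions(countries: list[str], regions: list[str]) -> list[str]:
    """Return regions implied by `countries` that are absent from `regions`."""
    present = set(regions)
    missing: list[str] = []
    for country in countries:
        expected = M49_REGION.get(country)
        # Unknown countries are skipped; validating names is a separate check.
        if expected is None:
            continue
        if expected not in present and expected not in missing:
            missing.append(expected)
    return missing


# A record listing Kenya and Peru but only the South America region.
print(missing_regions(["Kenya", "Peru"], ["South America"]))
```

Returning the list of missing regions, rather than a boolean, leaves it to the caller to decide whether to warn about them or add them to the record.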
215d61c188
pyproject.toml: limit SQLAlchemy to < 1.4.23
SQLAlchemy gets pulled in by csvkit's agate-sql dependency and there
is currently an issue with Poetry's parsing of the SQLAlchemy 1.4.23
constraints. Temporarily explicitly install a version of SQLAlchemy
that works (can remove later once Poetry fixes this). Anyways, I am
not using any SQLAlchemy features that I know of.

See: https://github.com/python-poetry/poetry/issues/4402
2021-09-06 21:01:09 +03:00