Commit Graph

400 Commits

Author SHA1 Message Date
Alan Orth ad33195ba3
README.md: adjust intro
Makes the badges not wrap and looks better in my opinion.
2021-12-08 11:36:34 +02:00
Alan Orth 72fe38972e
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-12-05 16:29:37 +02:00
Alan Orth 04232d0ede
poetry.lock: run poetry update 2021-12-05 16:29:09 +02:00
Alan Orth f5fa33bbc6
CHANGELOG.md: add title in citation note 2021-12-05 16:23:39 +02:00
Alan Orth 1b978159c1
data/text.csv: Add data for title in citation test 2021-12-05 16:23:06 +02:00
Alan Orth 4d5696c4cb
csv_metadata_quality/check.py: update title in citation check
Initialize the titles and citations before the for loop so we can
access them later. This makes it easier to check if the item
actually has a citation.
2021-12-05 16:21:44 +02:00
Alan Orth e02678cd7c
tests/test_check.py: add tests for title in citation 2021-12-05 16:01:11 +02:00
Alan Orth 01b4354a14
tests/test_check.py: fix comment 2021-12-05 15:58:25 +02:00
Alan Orth 3b40a68279
Add check for title in citation
This checks if the item title exists in the citation. If it is not
present it could just be missing, or could have minor differences
in the whitespace, accents, etc.
2021-12-05 15:52:42 +02:00
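A minimal sketch of the kind of comparison this commit describes, not the project's actual code. The function names and the normalization steps (dropping accents, collapsing whitespace, lowercasing) are assumptions chosen to illustrate the "minor differences" mentioned in the message.

```python
import unicodedata


def normalize(text: str) -> str:
    """Hypothetical helper: drop accents, collapse whitespace, lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        char for char in decomposed if not unicodedata.combining(char)
    )
    return " ".join(without_accents.split()).lower()


def title_in_citation(title: str, citation: str) -> bool:
    """Strict check: the title appears verbatim in the citation."""
    return title in citation


def title_nearly_in_citation(title: str, citation: str) -> bool:
    """Lenient check: the title appears once both strings are normalized."""
    return normalize(title) in normalize(citation)
```

When the strict check fails but the lenient one passes, the title is probably present with only whitespace or accent differences rather than missing outright.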
Alan Orth 999cc65097
csv_metadata_quality/app.py: adjust mojibake check
If unsafe fixes (-u) are enabled then we don't need to do the check
first before actually fixing them. Doing the check first creates
extra output that needs to be reviewed by the user.
2021-12-05 15:18:35 +02:00
Alan Orth a7c3be280d
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-11-27 12:26:21 +02:00
Alan Orth 69f68e0a72
poetry.lock: Run poetry update 2021-11-27 12:25:40 +02:00
Alan Orth c941a90944
.drone.yml: Test on Python 3.10
2021-10-11 20:09:32 +03:00
Alan Orth c95261f522
CHANGELOG.md: Add note about fix.newlines
2021-10-08 14:37:12 +03:00
Alan Orth 787fa9e8d9
Add field name to fix.newlines output 2021-10-08 14:36:43 +03:00
Alan Orth 82261f7fe0
tests/test_check.py: Run black
2021-10-06 22:10:26 +03:00
Alan Orth 8a27fb2589
Add check for missing DOIs
Sometimes an editor includes a DOI in the citation field, but does
not add a standalone DOI field.
2021-10-06 21:25:39 +03:00
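The situation described in this commit could be detected roughly as follows. This is a sketch under stated assumptions: the field names used as defaults, the function name, and the DOI pattern are all illustrative and are not taken from the project.

```python
import re

# Assumed, simplified DOI pattern: "10.", a registrant code, a slash, a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def citation_has_unlisted_doi(
    row: dict,
    citation_field: str = "dcterms.bibliographicCitation",  # assumed name
    doi_field: str = "cg.identifier.doi",  # hypothetical name
) -> bool:
    """Return True if the citation contains a DOI but the DOI field is empty."""
    citation = row.get(citation_field) or ""
    doi = row.get(doi_field) or ""
    return bool(DOI_PATTERN.search(citation)) and not doi.strip()
```

Using `row.get(...)` with a fallback means a row that lacks either column is simply reported as having no problem instead of raising an error.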
Alan Orth 831ce979c3
CHANGELOG.md: Clarify regex fixes 2021-10-06 21:23:35 +03:00
Alan Orth 58ef62fbcd
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-10-06 21:20:35 +03:00
Alan Orth 8c59f57e76
poetry.lock: Run poetry update 2021-10-06 21:19:54 +03:00
Alan Orth 72dd3e7272
CHANGELOG.md: Add notes about regexes 2021-10-06 19:35:59 +03:00
Alan Orth 6ba16d5d4c
csv_metadata_quality/check.py: Fix duplicate checker
Fix the incorrect type field regex, and improve the title regex to
consider dcterms.title and dc.title (along with the DSpace language
variants like dc.title[en_US]), but ignore dc.title.alternative.

See: https://regex101.com/r/I4m06F/1
2021-10-06 19:32:40 +03:00
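One way to write a title-field pattern with the behaviour described in this commit is sketched below. The pattern is an assumption for illustration and may differ from the one behind the regex101 link; the function name is hypothetical.

```python
import re

# Assumed pattern: dc.title or dcterms.title, optionally followed by a
# bracketed language code such as [en_US], and nothing else after that.
TITLE_FIELD = re.compile(r"^(dc|dcterms)\.title(\[\w*\])?$")


def is_title_field(field_name: str) -> bool:
    """Return True for title columns, False for qualified ones."""
    return TITLE_FIELD.match(field_name) is not None


for name in ["dc.title", "dcterms.title", "dc.title[en_US]", "dc.title.alternative"]:
    print(name, is_title_field(name))
```

Anchoring the pattern at both ends is what excludes dc.title.alternative: the extra qualifier after "title" prevents a match.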
Alan Orth 81069259ba
CHANGELOG.md: Add note about bibliographicCitation
2021-10-06 16:16:51 +03:00
Alan Orth 54ab869297
csv_metadata_quality/experimental.py: Adjust citation match
We need to match both of these citation fields:

- dc.identifier.citation
- dcterms.bibliographicCitation
2021-10-06 16:13:10 +03:00
Alan Orth 22b359c8a8
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-09-27 14:15:01 +03:00
Alan Orth 3e06788d88
poetry.lock: Run poetry update 2021-09-27 14:11:21 +03:00
Alan Orth 3c41cc283f
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-09-06 21:04:05 +03:00
Alan Orth 5741e94571
poetry.lock: Run poetry update 2021-09-06 21:03:30 +03:00
Alan Orth 215d61c188
pyproject.toml: limit SQLAlchemy to < 1.4.23
SQLAlchemy gets pulled in by csvkit's agate-sql dependency and there
is currently an issue with Poetry's parsing of the SQLAlchemy 1.4.23
constraints. Temporarily explicitly install a version of SQLAlchemy
that works (can remove later once Poetry fixes this). Anyways, I am
not using any SQLAlchemy features that I know of.

See: https://github.com/python-poetry/poetry/issues/4402
2021-09-06 21:01:09 +03:00
Alan Orth 11ddde3327
data/test.csv: Update mojibake example
I was trying to find where I got this one and it seems to have been
the other way around. It doesn't matter here; I was only curious.
2021-08-19 15:48:41 +03:00
Alan Orth a347878d43
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-08-12 21:49:36 +03:00
Alan Orth a89bc331f0
poetry.lock: Run poetry update
Lots of minor dependency updates. All tests still passing with
pytest.
2021-08-12 21:47:46 +03:00
Alan Orth af3493c724
CITATION.cff: Remove YAML formatting
GitHub says it can't parse my CITATION.cff file. The example in the
docs shows version 1.2.0 also, I wonder if that's relevant.

See: https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-on-github/about-citation-files
2021-07-28 21:23:30 +03:00
Alan Orth 52644bf83e
Add CITATION.cff
Created with the cffinit tool:

https://citation-file-format.github.io/cff-initializer-javascript/
2021-07-28 21:11:11 +03:00
Alan Orth c8f5539d21
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-07-06 15:47:44 +03:00
Alan Orth 382d0d6aed
Run poetry update 2021-07-06 15:37:57 +03:00
Alan Orth b8f4be9ebb
pyproject.toml: Update pytest-clarity and black
These seem to have much newer versions that didn't get updated in
this project due to the version pinning selector I was using with
poetry.

In the case of pytest-clarity the previous version was 0.3.1 and
the version selector was a caret (^), which will never update the
left-most (major) number. Now they seem to be on 1.x.x so it will
be OK in the future.

In the case of black, they use weird numbering so it's anyone's
guess how this will work! Luckily it's only used for linting and
formatting.
2021-07-06 15:30:41 +03:00
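The caret behaviour mentioned in this commit can be sketched as below. This follows Poetry's documented caret rule as I understand it (the left-most non-zero component is held fixed) and only handles plain three-part versions; the function name is hypothetical and this is not Poetry's own implementation.

```python
from typing import Tuple


def caret_bounds(version: str) -> Tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for ^version."""
    major, minor, patch = (int(part) for part in version.split("."))
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return version, ".".join(str(number) for number in upper)


print(caret_bounds("0.3.1"))  # ('0.3.1', '0.4.0')
print(caret_bounds("1.0.0"))  # ('1.0.0', '2.0.0')
```

Under this rule a constraint of ^0.3.1 can never resolve to a 1.x release, which fits the stuck pytest-clarity version described above.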
Alan Orth 4e2eab68b0
Update requests-cache
Apparently we were stuck on an older version of requests-cache due
to the fact that we were using the caret, which will never update
the left-most (major) version. Upstream requests-cache is currently
version 0.6.4, and there seem to have been some changes to the API.
2021-07-06 15:24:39 +03:00
Alan Orth 55165cb4ce
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-06-14 12:52:47 +03:00
Alan Orth 93d3eabfba
poetry.lock: Run poetry update 2021-06-14 12:52:28 +03:00
Alan Orth a8fe623f4c
csv_metadata_quality/check.py: Remove unnecessary pass
LGTM warned that these pass statements are not necessary.

See: https://lgtm.com/rules/910088/
2021-04-20 08:20:13 +03:00
Alan Orth dbc0437d59
CHANGELOG.md: Add note about Python deps
2021-04-14 16:16:02 +03:00
Alan Orth 96ce1daa90
Update requirements
Generated with poetry export:

    $ poetry export --without-hashes -f requirements.txt > requirements.txt
    $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt

I am trying `--without-hashes` to work around an error on pip install
when running in CI:

    ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==.
2021-04-14 16:15:28 +03:00
Alan Orth 3adb52d7c0
poetry.lock: Run poetry update 2021-04-14 16:14:37 +03:00
Alan Orth f958d1879f
poetry.lock: Run poetry update
2021-04-02 16:19:16 +03:00
Alan Orth bd8943f36a
csv_metadata_quality/app.py: Don't crash if fields are missing
We don't need to crash if someone feeds us a CSV file that is
missing common DSpace fields like title, type, and subject.
2021-03-21 19:47:29 +02:00
Alan Orth 28f9026286
README.md: Minor edit
2021-03-19 16:26:31 +02:00
Alan Orth cfe09f7126
Add SPDX short license identifier to all Python files
See: https://spdx.github.io/spdx-spec/appendix-V-using-SPDX-short-identifiers-in-source-files/
2021-03-19 16:04:40 +02:00
Alan Orth 8eddb76aab
Bump version to 0.4.8-dev
2021-03-19 11:53:56 +02:00
Alan Orth a04dbc50db
Add notes about checking and fixing mojibake 2021-03-19 11:48:27 +02:00