mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-27 16:18:19 +01:00
Commit Graph

92 Commits

Author SHA1 Message Date
f3fb1ff7fb
Don't crash when title is missing
We shouldn't crash the country/region checker/fixer when the title
field is missing, since we only use it to show status to the user.
2023-06-12 10:42:50 +03:00
8d4295b2b3
CHANGELOG.md: add note about description field 2023-04-22 12:17:44 -07:00
c64b7eb1f1
CHANGELOG.md: add note about Pandas 2.0.0 2023-04-05 11:17:48 +03:00
20a2cce34b
CHANGELOG.md: add fixes
Some checks failed
continuous-integration/drone/push Build is failing
2023-03-10 16:17:20 +03:00
fdccdf7318
Version 0.6.1
Some checks failed
continuous-integration/drone/push Build is failing
2023-02-23 13:46:56 +03:00
8bc4cd419c
Strip filename descriptions before checking
When checking for uncommon file extensions in the filename field
we should strip descriptions that are meant for SAF Bundler, for
example: Annual_Report_2020.pdf__description:Report. Left in place,
the description causes a false positive that spams the output with
warnings.
Some checks failed
continuous-integration/drone/push Build is failing
2023-02-13 11:00:57 +03:00
bde38e9ed4
CHANGELOG.md: add notes about abstracts 2023-02-13 10:39:03 +03:00
fbb625be5c
Ignore common non-SPDX licenses
This is meant to catch licenses that are supposed to be SPDX but
aren't, not licenses that *aren't* supposed to be SPDX. We have so
many free-text license descriptions like "Copyrighted" and "Other"
that I'm sick of seeing warnings for them!
2023-02-07 17:01:56 +03:00
084b970798
CHANGELOG.md: add note about abstract field 2023-02-07 16:52:34 +03:00
c4a2ee8563
CHANGELOG.md: add note about fix.separators() 2023-01-24 14:16:23 +03:00
5abd32a41f
CHANGELOG.md: run poetry update 2022-12-20 15:09:58 +02:00
f640161d87
CHANGELOG.md: add notes about SPDX and Python 2022-12-13 10:45:36 +03:00
051777bcec
Ignore subregion field for missing region checks
Due to a sloppy regex I was sometimes matching the subregion field
when checking for missing UN M.49 regions in the region field.
All checks were successful
continuous-integration/drone/push Build is passing
2022-12-07 23:18:47 +01:00
8f3db86a36
CHANGELOG.md: fix header
All checks were successful
continuous-integration/drone/push Build is passing
2022-10-31 11:43:14 +03:00
58b7b6e9d8
Version 0.6.0
All checks were successful
continuous-integration/drone/push Build is passing
2022-09-02 16:35:58 +03:00
566c2b45cf
Remove Excel support
I never used this and it seems xlrd doesn't even support .xlsx
anymore anyway. If this was needed I could theoretically use
openpyxl, but I'd rather just stick to CSV.
2022-09-02 16:14:24 +03:00
41b813be6e
CHANGELOG.md: add note about exclude logic 2022-09-02 16:03:51 +03:00
da87531779
CHANGELOG.md: Add note about adding missing regions 2022-07-28 16:54:05 +03:00
e1b270cf83
CHANGELOG.md: add note about dropping invalid AGROVOC values
All checks were successful
continuous-integration/drone/push Build is passing
2021-12-23 12:47:42 +02:00
a351ba9706
CHANGELOG.md: add notes about ftfy 2021-12-15 22:09:01 +02:00
5854f8e865
CHANGELOG.md: add note about unnecessary Unicode 2021-12-15 13:56:31 +02:00
cef6c66b30
CHANGELOG.md: start next changes 2021-12-09 23:21:58 +02:00
cc34db7ff8
Version 0.5.0
All checks were successful
continuous-integration/drone/push Build is passing
2021-12-08 15:29:46 +02:00
b79e07b814
CHANGELOG.md: Add note about countries without regions 2021-12-08 15:21:45 +02:00
f5fa33bbc6
CHANGELOG.md: add title in citation note 2021-12-05 16:23:39 +02:00
c95261f522
CHANGELOG.md: Add note about fix.newlines
All checks were successful
continuous-integration/drone/push Build is passing
2021-10-08 14:37:12 +03:00
831ce979c3
CHANGELOG.md: Clarify regex fixes 2021-10-06 21:23:35 +03:00
72dd3e7272
CHANGELOG.md: Add notes about regexes 2021-10-06 19:35:59 +03:00
81069259ba
CHANGELOG.md: Add note about bibliographicCitation
All checks were successful
continuous-integration/drone/push Build is passing
2021-10-06 16:16:51 +03:00
dbc0437d59
CHANGELOG.md: Add note about Python deps
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-14 16:16:02 +03:00
a04dbc50db
Add notes about checking and fixing mojibake 2021-03-19 11:48:27 +02:00
f816e17fe7
Version 0.4.7
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-17 10:00:34 +02:00
652b7ea98c
CHANGELOG.md: Add note about poetry dependencies 2021-03-17 09:58:02 +02:00
a313b7527a
CHANGELOG.md: Add note about duplicate items 2021-03-17 09:55:07 +02:00
1aa2084230
CHANGELOG.md: Add note about checks 2021-03-16 16:11:24 +02:00
ed084da08c
CHANGELOG.md: Add note about multi-value separators
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-14 21:04:19 +02:00
fb35afd937
CHANGELOG.md: Add note about requests cache 2021-03-14 09:13:51 +02:00
1008acf35e
Always fix invalid multi-value separators
This is no longer classified as "unsafe" as I have yet to see a
case where this was intentional, and it always causes issues when
you import the data in a DSpace repository.
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-13 12:59:45 +02:00
1554cfd5c9
Version 0.4.6 2021-03-11 12:14:54 +02:00
00b8faad6d
CHANGELOG.md: Fix headers 2021-03-11 12:13:22 +02:00
7ad821dcad
CHANGELOG.md: Add note about poetry dependencies 2021-03-11 11:10:27 +02:00
e0e3ca6c58
CHANGELOG.md: Add notes about DCTERMS in data/test.csv 2021-03-11 10:50:52 +02:00
d7d4d4efca
CHANGELOG.md: Add note about SPDX license identifiers 2021-03-11 10:37:27 +02:00
202bda862a
Bump version to 0.4.5
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-04 21:38:10 +02:00
fc5bedcc5c
CHANGELOG.md: Add poetry update 2021-03-04 21:32:46 +02:00
27b2d81ca8
CHANGELOG.md: Add note about dcterms.issued
All checks were successful
continuous-integration/drone/push Build is passing
2021-02-28 15:14:39 +02:00
d76e72532a
Move unreleased changes to v0.4.4
All checks were successful
continuous-integration/drone/push Build is passing
2021-02-21 13:25:22 +02:00
13980d2dde
CHANGELOG.md: Add note about colored output 2021-02-21 13:12:26 +02:00
202abf140c
CHANGELOG.md: Add note about poetry
All checks were successful
continuous-integration/drone/push Build is passing
2021-02-04 21:48:12 +02:00
e62ecb0a8f
CHANGELOG.md: Add note about new date format 2021-02-04 21:43:44 +02:00
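
The commit "Strip filename descriptions before checking" (8bc4cd419c) above describes removing a SAF Bundler description from a filename value before checking its extension. The following is a minimal sketch of that idea, not the project's actual code: the function names, the `DESCRIPTION_MARKER` constant, and the `COMMON_EXTENSIONS` set are hypothetical and chosen only for illustration. The sample value is the one given in the commit message.

```python
import os

# Hypothetical allow-list for illustration; the real tool's list of
# common extensions may differ.
COMMON_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".pptx", ".jpg", ".png"}

# Marker separating a filename from its SAF Bundler description, as in
# the example from the commit message.
DESCRIPTION_MARKER = "__description:"


def strip_description(value: str) -> str:
    """Return only the filename part of a filename field value."""
    return value.split(DESCRIPTION_MARKER, 1)[0].strip()


def has_uncommon_extension(value: str) -> bool:
    """Check the extension only after stripping any description."""
    filename = strip_description(value)
    _, extension = os.path.splitext(filename)
    return extension.lower() not in COMMON_EXTENSIONS


value = "Annual_Report_2020.pdf__description:Report"
print(strip_description(value))
print(has_uncommon_extension(value))
```

Without the stripping step, `os.path.splitext` would treat everything after the last dot, including the description, as the extension, which is the kind of false positive the commit message mentions.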