mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-10 16:25:45 +01:00
csv-metadata-quality/data/test.csv
Alan Orth abae8ca4fb
data/test.csv: Move some DC fields to DCTERMS
The original Dublin Core element set was superseded by DCTERMS in
2008 and we have started using them in our DSpace repository so I
think it's good to update them in our test data. Old DC fields are
still checked and fixed in this tool, though.

It's worth noting that currently supported DSpace versions (4, 5,
and 6) all hard-code a few fields like dc.title internally, so
we can't migrate those to their DCTERMS counterparts just yet.
2021-03-11 10:49:05 +02:00

1.5 KiB

dc.title,dcterms.issued,dc.identifier.issn,dc.identifier.isbn,dcterms.language,dcterms.subject,cg.coverage.country,filename,dcterms.license
Leading space,2019-07-29,,,,,,,
Trailing space ,2019-07-29,,,,,,,
Excessive space,2019-07-29,,,,,,,
Miscellaenous ||whitespace | issues ,2019-07-29,,,,,,,
Duplicate||Duplicate,2019-07-29,,,,,,,
Invalid ISSN,2019-07-29,2321-2302,,,,,,
Invalid ISBN,2019-07-29,,978-0-306-40615-6,,,,,
Multiple valid ISSNs,2019-07-29,0378-5955||0024-9319,,,,,,
Multiple valid ISBNs,2019-07-29,,99921-58-10-7||978-0-306-40615-7,,,,,
Invalid date,2019-07-260,,,,,,,
Multiple dates,2019-07-26||2019-01-10,,,,,,,
Invalid multi-value separator,2019-07-29,0378-5955|0024-9319,,,,,,
Unnecessary Unicode,2019-07-29,,,,,,,
Suspicious character||foreˆt,2019-07-29,,,,,,,
Invalid ISO 639-1 (alpha 2) language,2019-07-29,,,jp,,,,
Invalid ISO 639-3 (alpha 3) language,2019-07-29,,,chi,,,,
Invalid language,2019-07-29,,,Span,,,,
Invalid AGROVOC subject,2019-07-29,,,,FOREST,,,
Newline (LF),2019-07-30,,,,TANZA NIA,,,
Missing date,,,,,,,,
Invalid country,2019-08-01,,,,,KENYAA,,
Uncommon filename extension,2019-08-10,,,,,,file.pdf.lck,
Unneccesary unicode (U+002D + U+00AD),2019-08-10,,978-­92-­9043-­823-­6,,,,,
"Missing space,after comma",2019-08-27,,,,,,,
Incorrect ISO 639-1 language,2019-09-26,,,es,,,,
Incorrect ISO 639-3 language,2019-09-26,,,spa,,,,
Composéd Unicode,2020-01-14,,,,,,,
Decomposéd Unicode,2020-01-14,,,,,,,
Unnecessary multi-value separator,2021-01-03,0378-5955||,,,,,,
Invalid SPDX license identifier,2021-03-11,,,,,,,CC-BY