d9fc09f121
Fix references to ISO 639
...
It turns out that ISO 639-1 defines the two-letter codes and ISO 639-2
defines the three-letter codes, aka alpha2 and alpha3.
See: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
2019-09-11 16:36:53 +03:00
2af714fb05
README.md: Add a handful of TODOs
2019-08-27 00:12:41 +03:00
bd984f3db5
README.md: Update TravisCI badge
2019-08-22 15:07:03 +03:00
3f4e84a638
README.md: Use ILRI GitHub remote
2019-08-22 14:54:12 +03:00
a00d3d7ea5
README.md: Simplify installation instructions
...
Pipenv has captured the local dependency with `-e .` so now it gets
installed by the Pipfile or requirements.txt.
2019-08-02 11:02:50 +03:00
0ed390dbd5
README.md: Update AGROVOC information
...
Now details the new `--agrovoc-fields` option.
2019-08-01 23:54:40 +03:00
fd3861e7cd
README.md: Update installation and usage instructions
...
It is much easier now that I have created a proper package.
2019-07-31 17:41:18 +03:00
4c4f4a3ba2
README.md: Update todos
2019-07-31 16:33:49 +03:00
22cc7bc793
README.md: Improve section on unsafe fixes
2019-07-31 16:00:05 +03:00
40d5f7d81b
Add support for removing newlines
...
This was tricky because of the nature of newlines. Strictly speaking,
we are removing Unix line feeds (U+000A) here, because Windows carriage
returns are already removed by the string stripping in the whitespace
fix.
Creating the test case in Vim was difficult because I couldn't
figure out how to manually enter a line feed character. In the end I
used a search and replace on a known pattern like "ALAN", replacing
it with \r. Neither entering the Unicode code point (U+000A)
directly nor typing an "Enter" character after ^V worked. Grrr.
2019-07-30 20:05:12 +03:00
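The newline fix described in the commit above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `strip_whitespace` and `remove_newlines` and the sample value are assumptions made for illustration. It shows why only line feeds need explicit handling once both ends of the value have been stripped.

```python
def strip_whitespace(value: str) -> str:
    """Hypothetical stand-in for the whitespace fix: trim both ends.

    str.strip() removes leading and trailing whitespace, which includes
    a trailing carriage return and line feed pair.
    """
    return value.strip()


def remove_newlines(value: str) -> str:
    """Remove Unix line feeds (U+000A) left inside a metadata value."""
    return value.replace("\n", "")


# Sample value with an embedded line feed and a trailing Windows line ending.
value = "Food\n security\r\n"
cleaned = remove_newlines(strip_whitespace(value))
print(repr(cleaned))  # 'Food security'
```

Running the whitespace fix first means the trailing `\r\n` is already gone, so the second step only has to deal with the embedded line feed.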
346e66ca98
README.md: Add more information to introduction
2019-07-30 17:44:30 +03:00
a85b410ab9
README.md: Improve introduction and functionality
2019-07-30 16:09:15 +03:00
1f65a28307
Add support for validating subjects against AGROVOC
...
Checks values in the dc.subject or dcterms.subject field against the
AGROVOC REST API hosted by FAO. Code borrowed from agrovoc-lookup.py.
See: http://agrovoc.uniroma2.it/agrovoc/agrovoc/en/
See: https://github.com/ilri/DSpace/blob/5_x-prod/agrovoc-lookup.py
2019-07-30 00:30:31 +03:00
a36454a3ac
Add support for validating languages
...
Will validate against ISO 639-2 or ISO 639-3 depending on how long
the language field is. Otherwise will return that the language is
invalid.
Does not currently have any support for generic values like "Other".
2019-07-29 18:59:42 +03:00
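A length-based language check like the one in the commit above might be sketched as follows. The code sets here are tiny illustrative subsets and the name `check_language` is hypothetical; a real check would load the complete code lists. As a later commit in this log clarifies, the two-letter codes belong to ISO 639-1 and the three-letter codes to ISO 639-2.

```python
# Tiny illustrative subsets (assumption); not the complete ISO 639 lists.
TWO_LETTER_CODES = {"en", "es", "fr", "sw"}
THREE_LETTER_CODES = {"eng", "spa", "fra", "swa"}


def check_language(value: str) -> bool:
    """Return True if value is a known two- or three-letter language code."""
    if len(value) == 2:
        return value in TWO_LETTER_CODES
    if len(value) == 3:
        return value in THREE_LETTER_CODES
    # Any other length is reported as invalid, including generic values
    # like "Other", which this sketch does not support either.
    return False


for value in ["en", "fra", "Other"]:
    status = "valid" if check_language(value) else "invalid"
    print(f"{value}: {status}")
```

Dispatching on the length keeps the check simple, at the cost of rejecting anything that is not exactly two or three characters long.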
e49b4e8f22
README.md: Try to simplify list of functionality
2019-07-29 18:25:38 +03:00
0eb852a65b
README.md: Improve note about unsafe options
2019-07-29 18:14:50 +03:00
8c34c2d6e6
README.md: Add note about removing duplicate values
2019-07-29 18:09:48 +03:00
1e444cf040
Add fix for duplicate metadata values
2019-07-29 18:05:03 +03:00
e33551776c
README.md: Update note about unsafe options
2019-07-29 17:25:42 +03:00
7f781d7077
README.md: Finish writing usage section
2019-07-29 17:21:34 +03:00
fa4fa3491b
Add check for "suspicious" characters
...
These standalone characters often indicate issues with encoding or
copy/paste in languages with accents like French and Spanish. For
example: foreˆt should be forêt.
It is not possible to fix these issues automatically, but this will
print a warning so you can notify the owner of the data.
2019-07-29 17:08:49 +03:00
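One way to flag the standalone characters described in the commit above is a small regular expression. This is a sketch under assumptions: the exact character set (acute accent, modifier circumflex, tilde, grave accent) and the name `find_suspicious` are chosen for illustration and may differ from what the project checks. It only warns and does not attempt a fix, in line with the commit message.

```python
import re

# Assumed set of standalone characters that often signal encoding or
# copy/paste problems: U+00B4 (acute accent), U+02C6 (modifier circumflex),
# U+007E (tilde), and U+0060 (grave accent).
SUSPICIOUS = re.compile("[\u00b4\u02c6\u007e\u0060]")


def find_suspicious(value: str) -> list[str]:
    """Return every suspicious standalone character found in value."""
    return SUSPICIOUS.findall(value)


# The first value uses a standalone circumflex after the "e"; the second
# uses the proper precomposed character U+00EA and passes the check.
for value in ["fore\u02c6t", "for\u00eat"]:
    found = find_suspicious(value)
    if found:
        print(f"Suspicious character(s) {found} in: {value}")
```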
ae66382046
README.md: Add note about unnecessary Unicode fixes
2019-07-29 16:34:39 +03:00
9ac9474c69
README.md: Add todo about duplicates
2019-07-29 12:40:53 +03:00
e000bd1f88
README.md: Add Travis CI badge
2019-07-29 12:19:10 +03:00
3554c2991f
README.md: Add note about Python version
2019-07-29 12:15:09 +03:00
ac127e7f8a
README.md: Add usage section
2019-07-29 11:30:06 +03:00
aabb57321c
README.md: Improve
...
Reorganize functionality section and add installation section.
2019-07-29 11:15:51 +03:00
a8a41d60b6
README.md: Add note about Pandas
2019-07-29 10:56:02 +03:00
cf6c01caaf
README.md: Add notes about unsafe fixes
2019-07-28 23:01:00 +03:00
4e6225c0a9
README.md: Add note about Excel files
2019-07-28 18:38:36 +03:00
d293214d3a
README.md: Add note about checking dates
2019-07-28 18:37:46 +03:00
4687e2f5fa
README.md: Remove todo for date validation
2019-07-28 17:26:39 +03:00
7e16968bf2
README.md: Add todos
2019-07-27 19:24:35 +03:00
0a751c1f25
README.md: Add SourceHut build badge
2019-07-26 23:59:31 +03:00
df1087b26f
README.md: Improve introduction, checks, and todo
2019-07-26 23:50:41 +03:00
64e7a73417
README.md: Add information about checks and fixes
2019-07-26 23:20:16 +03:00
b657c51fd2
Add initial README.md with intro, license, and todo
2019-07-26 22:18:38 +03:00