Update notes for 2018-12-02

2018-12-02 17:55:32 +02:00
parent de150e2cf1
commit cad7ceaba1
3 changed files with 60 additions and 8 deletions


@ -56,4 +56,28 @@ $ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAli
DEBUG: FC_WEIGHT didn't match
```
- Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend ([IITA_Dec_1_1997 aka Daniel1807](https://dspacetest.cgiar.org/handle/10568/108298))
- One item missing the authorship type
- Some invalid countries (smart quotes, misspellings)
- Added countries to some items that mentioned research in particular countries in their abstracts
- One item had "MADAGASCAR" for ISI Journal
- Minor corrections in IITA subject (LIVELIHOOD→LIVELIHOODS)
- Trim whitespace in abstract field
- Fix some sponsors (though I'm not sure why some, such as "Governments of Canada", are plural)
- Eighteen items had `en||fr` for the language, but the content was only in French so changed them to just `fr`
- Six items had encoding errors in French text so I will ask Bosede to re-do them carefully
- Correct and normalize a few AGROVOC subjects
- Expand my "encoding error" detection GREL to include `~`, as I saw a lot of those in some copy-pasted French text recently:
```
or(
isNotNull(value.match(/.*\uFFFD.*/)),
isNotNull(value.match(/.*\u00A0.*/)),
isNotNull(value.match(/.*\u200A.*/)),
isNotNull(value.match(/.*\u2019.*/)),
isNotNull(value.match(/.*\u00b4.*/)),
isNotNull(value.match(/.*\u007e.*/))
)
```
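For checking values outside OpenRefine, the same test can be roughly sketched in Python. This is only an illustration that assumes the values are already available as Python strings; the names `SUSPICIOUS` and `has_encoding_error` are hypothetical and not part of any existing script:

```python
import re

# Same code points as the GREL expression above: replacement character,
# no-break space, hair space, right single quotation mark, acute accent, tilde.
SUSPICIOUS = re.compile("[\uFFFD\u00A0\u200A\u2019\u00B4\u007E]")


def has_encoding_error(value):
    """Return True if the value contains any of the suspicious characters."""
    if value is None:
        return False
    return SUSPICIOUS.search(value) is not None
```

Collapsing the six checks into one character class keeps the list of code points in a single place, so adding another character later is a one-line change.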
<!-- vim: set sw=2 ts=2: -->