Add notes for 2022-08-18

2022-08-18 13:45:48 -07:00
parent 6c61d1c102
commit e203ee6dcc
29 changed files with 85 additions and 34 deletions

@ -96,4 +96,26 @@ $ dspace import --add --eperson=aorth@mjanja.ch --source /tmp/SimpleArchiveForma
- Add CONSERVATION to ILRI subjects on CGSpace
- I see that AGROVOC has `conservation agriculture` and I suggested that we use that instead
## 2022-08-17
- Peter and Jose sent more feedback about the CRP Innovation records from MARLO
- We expanded the CRP names in the citation and removed the `cg.identifier.url` URLs because they are ugly and will stop working eventually
- The mappings of MARLO links will be done internally with the `cg.number` IDs like "IN-1119" and the Handle URIs
## 2022-08-18
- I talked to Jose about the CCAFS MARLO records
- He still hasn't finished re-processing the PDFs to update the internal MARLO links
- I started looking at the other records (MELIAs, OICRs, Policies) and found some minor issues in the MELIAs so I sent feedback to Jose
- On second thought, I opened the MELIAs file in OpenRefine and it looks OK, so this must have been a parsing issue in LibreOffice when I was checking the file (or perhaps I didn't use the correct quoting when importing)
- Import the original MELIA v2 CSV file into OpenRefine to fix the encoding before processing with `csvcut`/`csvjoin`
- Then extract the IDs and filenames from the original V2 file and join with the UTF-8 file:
```console
$ csvcut -c 'cg.number (series/report No.)',File ~/Downloads/MELIA-Metadata-v2-csv.csv > MELIA-v2-IDs-Files.csv
$ csvjoin -c 'cg.number (series/report No.)' MELIAs\ metadata\ utf8\ 20220816_JM.csv MELIA-v2-IDs-Files.csv > MELIAs-UTF-8-with-files.csv
```
- Then I imported them into OpenRefine to start metadata cleaning and enrichment
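
For reference, the cut-and-join step above can be sketched in plain Python. This is only a rough equivalent of what `csvcut -c` and `csvjoin -c` do (an inner join on the ID column), not how csvkit implements it; the helper names and the sample rows are made up for illustration, and only the `cg.number (series/report No.)` and `File` column names come from the commands above.

```python
import csv
import io

# Join column used in the csvcut/csvjoin commands above
ID_COLUMN = "cg.number (series/report No.)"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def cut_columns(rows, columns):
    """Keep only the named columns from each row (roughly `csvcut -c`)."""
    return [{name: row[name] for name in columns} for row in rows]


def inner_join(left_rows, right_rows, key):
    """Inner-join two lists of row dicts on `key` (roughly `csvjoin -c`).

    Rows without a match on the other side are dropped. Assumption: apart
    from `key`, the two inputs share no column names; if they did, the
    right-hand value would overwrite the left-hand one here.
    """
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left[key], []):
            merged = dict(left)
            merged.update({k: v for k, v in right.items() if k != key})
            joined.append(merged)
    return joined


if __name__ == "__main__":
    # Made-up sample data standing in for the original v2 file and the UTF-8 file
    original_v2 = ID_COLUMN + ",dc.title,File\nX-1,Garbled title,x1.pdf\n"
    utf8_file = ID_COLUMN + ",dc.title\nX-1,Clean title\n"

    ids_and_files = cut_columns(read_rows(original_v2), [ID_COLUMN, "File"])
    for row in inner_join(read_rows(utf8_file), ids_and_files, ID_COLUMN):
        print(row)
```

Because the join is an inner join, comparing the row count of the joined output with the row count of the UTF-8 file is a quick way to spot IDs that failed to match.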
<!-- vim: set sw=2 ts=2: -->