Add notes for 2019-06-28

Alan Orth 2019-06-29 01:06:53 +03:00
parent 4b9db911ab
commit 389921d80e
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
2 changed files with 27 additions and 6 deletions


@ -132,4 +132,25 @@ UPDATE 2
- Upload 202 IITA records from earlier this month (20194th.xls) to CGSpace
- Communicate with Bioversity contractor in charge of their migration from Typo3 to CGSpace
<!-- vim: set sw=2 ts=2: -->
## 2019-06-28
- Start looking at the fifty-seven AfricaRice records sent by Ibnou earlier this month
- First, I see there are several items with type "Book" and "Book Chapter" that should go in an "AfricaRice books and book chapters" collection, but none exists in the AfricaRice community
- Trim and collapse consecutive whitespace on author, affiliation, authorship types, title, subjects, doi, issn, source, citation, country, sponsors
- Standardize and correct affiliations like "Africa Rice Cente" and "Africa Rice Centre", including syntax errors with multi-value separators
- Lots of variation in affiliations, for example:
  - Université Abomey-Calavi
  - Université d'Abomey
  - Université d'Abomey Calavi
  - Université d'Abomey-Calavi
  - University of Abomey-Calavi
- Validate and normalize affiliations against our 2019-04 list using reconcile-csv and OpenRefine:
  - `$ lein run ~/src/git/DSpace/2019-04-08-affiliations.csv name id`
  - I always forget how to copy the reconciled values in OpenRefine, but you need to make a new column and populate it using this GREL: `if(cell.recon.matched, cell.recon.match.name, value)`
- Replace smart quotes with standard ASCII ones
- Fix typos in authorship types
- Validate and normalize subjects against our 2019-06 list using reconcile-csv and OpenRefine:
  - `$ lein run ~/src/git/DSpace/2019-06-10-subjects-matched.csv name id`
  - Also add about 30 new AGROVOC subjects to our list that I verified manually
- There is one duplicate, both have the same DOI: https://doi.org/10.1016/j.agwat.2018.06.018
- Fix four ISBNs that were in the ISSN field
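
Several of the cleanup steps above (whitespace, smart quotes, ISBNs in the ISSN field, duplicate DOIs) could also be scripted. The following is only a minimal Python sketch of that idea, not the workflow actually used here, which was OpenRefine. The function names `clean_value`, `looks_like_issn`, and `duplicate_dois` are hypothetical, and the ISSN check is purely syntactic (it does not verify the check digit).

```python
import re
from collections import Counter

# Map typographic ("smart") quotes to their ASCII equivalents.
SMART_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})

# Syntactic ISSN shape: four digits, a hyphen, three digits, then a digit or X.
ISSN_RE = re.compile(r"\d{4}-\d{3}[\dX]")


def clean_value(value: str) -> str:
    """Replace smart quotes, collapse whitespace runs, and trim both ends."""
    value = value.translate(SMART_QUOTES)
    return re.sub(r"\s+", " ", value).strip()


def looks_like_issn(value: str) -> bool:
    """Return True if the value has the shape of an ISSN (no checksum test)."""
    return ISSN_RE.fullmatch(value.strip().upper()) is not None


def duplicate_dois(dois):
    """Return DOIs seen more than once, compared case-insensitively."""
    counts = Counter(d.strip().lower() for d in dois if d and d.strip())
    return sorted(d for d, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(clean_value("  Universit\u00e9   d\u2019Abomey-Calavi "))
    # An ISBN-13 in the ISSN column fails the shape check.
    print(looks_like_issn("978-3-16-148410-0"))
```

Values that fail `looks_like_issn` would be candidates for moving to an ISBN field, and anything returned by `duplicate_dois` would need a manual check before deleting a record.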


@ -4,30 +4,30 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2019-06-25T20:10:57+03:00</lastmod> <lastmod>2019-06-25T21:00:27+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/2019-05/</loc> <loc>https://alanorth.github.io/cgspace-notes/2019-05/</loc>
<lastmod>2019-06-25T20:10:57+03:00</lastmod> <lastmod>2019-06-25T21:00:27+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2019-06-25T20:10:57+03:00</lastmod> <lastmod>2019-06-25T21:00:27+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc> <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2019-06-25T20:10:57+03:00</lastmod> <lastmod>2019-06-25T21:00:27+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc> <loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2019-06-25T20:10:57+03:00</lastmod> <lastmod>2019-06-25T21:00:27+03:00</lastmod>
<priority>0</priority> <priority>0</priority>
</url> </url>