From 6f44a3bcdda51d1f219cb08b03a16d7e37c57722 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Mon, 8 Apr 2019 11:26:20 +0300
Subject: [PATCH] Update notes for 2019-04-08

---
 content/posts/2019-04.md | 27 ++++++++++++++++++++++++
 docs/2019-04/index.html  | 44 +++++++++++++++++++++++++++++++++++++---
 docs/sitemap.xml         | 10 ++++-----
 3 files changed, 73 insertions(+), 8 deletions(-)

diff --git a/content/posts/2019-04.md b/content/posts/2019-04.md
index a16cca047..029fe758d 100644
--- a/content/posts/2019-04.md
+++ b/content/posts/2019-04.md
@@ -401,4 +401,31 @@ $ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|ds
 - It seems that the issue with CGSpace being "down" is actually because of CPU steal again!!!
 - I opened a ticket with support and asked them to migrate the VM to a less busy host
+## 2019-04-08
+
+- Start checking IITA's last round of batch uploads from [March on DSpace Test](https://dspacetest.cgiar.org/handle/10568/100333) (20193rd.xls)
+  - Lots of problems with affiliations, I had to correct about sixty of them
+  - I used lein to host the latest CSV of our affiliations for OpenRefine to reconcile against:
+
+```
+$ lein run ~/src/git/DSpace/2019-02-22-affiliations.csv name id
+```
+
+- After matching the values and creating some new matches I had trouble remembering how to copy the reconciled values to a new column
+  - The matched values can be accessed with `cell.recon.match.name`, but some of the new values don't appear, perhaps because I edited the original cell values?
+  - I ended up using this GREL expression to copy all values to a new column:
+
+```
+if(cell.recon.matched, cell.recon.match.name, value)
+```
+
+- See the [OpenRefine variables documentation](https://github.com/OpenRefine/OpenRefine/wiki/Variables#recon) for more notes about the `recon` object
+- I also noticed a handful of errors in our current list of affiliations so I corrected them:
+
+```
+$ ./fix-metadata-values.py -i 2019-04-08-fix-13-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -d
+```
+
+- We should create a new list of affiliations to update our controlled vocabulary again
+
diff --git a/docs/2019-04/index.html b/docs/2019-04/index.html
index 815adef0a..e0d720935 100644
--- a/docs/2019-04/index.html
+++ b/docs/2019-04/index.html
@@ -38,7 +38,7 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
-
+
@@ -81,9 +81,9 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
   "@type": "BlogPosting",
   "headline": "April, 2019",
   "url": "https://alanorth.github.io/cgspace-notes/2019-04/",
-  "wordCount": "2222",
+  "wordCount": "2397",
   "datePublished": "2019-04-01T09:00:43+03:00",
-  "dateModified": "2019-04-07T21:15:03+03:00",
+  "dateModified": "2019-04-07T21:17:16+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -623,6 +623,44 @@ X-XSS-Protection: 1; mode=block
  • I opened a ticket with support and asked them to migrate the VM to a less busy host
  • +

    2019-04-08

    + + + +
    $ lein run ~/src/git/DSpace/2019-02-22-affiliations.csv name id
    +
    + + + +
    if(cell.recon.matched, cell.recon.match.name, value)
    +
    + + + +
    $ ./fix-metadata-values.py -i 2019-04-08-fix-13-affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211 -t correct -d
    +
    + + + +
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index e9ee80f29..571531f8e 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
   https://alanorth.github.io/cgspace-notes/2019-04/
-  2019-04-07T21:15:03+03:00
+  2019-04-07T21:17:16+03:00
@@ -219,7 +219,7 @@
   https://alanorth.github.io/cgspace-notes/
-  2019-04-07T21:15:03+03:00
+  2019-04-07T21:17:16+03:00
   0
@@ -230,7 +230,7 @@
   https://alanorth.github.io/cgspace-notes/tags/notes/
-  2019-04-07T21:15:03+03:00
+  2019-04-07T21:17:16+03:00
   0
@@ -242,13 +242,13 @@
   https://alanorth.github.io/cgspace-notes/posts/
-  2019-04-07T21:15:03+03:00
+  2019-04-07T21:17:16+03:00
   0
   https://alanorth.github.io/cgspace-notes/tags/
-  2019-04-07T21:15:03+03:00
+  2019-04-07T21:17:16+03:00
   0
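
Reviewer note on the GREL expression recorded in the 2019-04-08 notes: `if(cell.recon.matched, cell.recon.match.name, value)` keeps the reconciled name where a match exists and otherwise falls back to the original cell value. That fallback may be why values without a match still end up in the new column. A minimal Python sketch of the same logic follows. The `Cell`, `Recon`, and `Match` classes and the `reconciled_or_original` helper are hypothetical stand-ins for illustration only, not OpenRefine's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    """Hypothetical stand-in for a reconciliation match."""
    name: str


@dataclass
class Recon:
    """Hypothetical stand-in for a cell's recon object."""
    match: Optional[Match] = None

    @property
    def matched(self) -> bool:
        # In this sketch a cell counts as matched once a match is attached.
        return self.match is not None


@dataclass
class Cell:
    """Hypothetical stand-in for a cell with an optional recon object."""
    value: str
    recon: Optional[Recon] = None


def reconciled_or_original(cell: Cell) -> str:
    """Return the matched name if there is one, else the original value."""
    if cell.recon is not None and cell.recon.matched:
        return cell.recon.match.name
    return cell.value
```

Under these assumptions, a cell with no recon data or no match simply passes its original value through, so the new column never ends up empty.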