diff --git a/content/posts/2019-10.md b/content/posts/2019-10.md
index 82c0b0179..ebd76847f 100644
--- a/content/posts/2019-10.md
+++ b/content/posts/2019-10.md
@@ -125,7 +125,11 @@ International Maize and Wheat Improvement Centre,International Maize and Wheat I
 $ ./fix-metadata-values.py -i /tmp/affiliations.csv -db dspace -u dspace -p 'fuuu' -f from -m 211 -t to
 ```
 
-- I did some manual curation of ~227 authors in preparation for telling Peter and Abenet that the migration is almost ready
+- I did some manual curation of about 300 authors in OpenRefine in preparation for telling Peter and Abenet that the migration is almost ready
 - I would still like to perhaps (re)move institutional authors from `dc.contributor.author` to `cg.contributor.affiliation`, but I will have to run that by Francesca, Carol, and Abenet
+  - I could use a custom text facet like this in OpenRefine to find authors that likely match the "Last, F." pattern: `isNotNull(value.match(/^.*, \p{Lu}\.?.*$/))`
+  - The `\p{Lu}` is a cool [regex character class](https://www.regular-expressions.info/unicode.html) to make sure this works for letters with accents
+  - As cool as that is, it's actually more effective to just search for authors that have "." in them!
+  - I've decided to add a `cg.contributor.affiliation` column to 1,025 items based on the logic above where the author name is not an actual person
diff --git a/docs/2019-10/index.html b/docs/2019-10/index.html
index 0cc244de7..f40202051 100644
--- a/docs/2019-10/index.html
+++ b/docs/2019-10/index.html
@@ -11,7 +11,7 @@
-
+
@@ -27,9 +27,9 @@
 "@type": "BlogPosting",
 "headline": "October, 2019",
 "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-10\/",
- "wordCount": "965",
+ "wordCount": "1051",
 "datePublished": "2019-10-01T13:20:51+03:00",
- "dateModified": "2019-10-12T14:28:43+03:00",
+ "dateModified": "2019-10-12T19:21:30+03:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -267,10 +267,14 @@ International Maize and Wheat I
$ ./fix-metadata-values.py -i /tmp/affiliations.csv -db dspace -u dspace -p 'fuuu' -f from -m 211 -t to
-I did some manual curation of ~227 authors in preparation for telling Peter and Abenet that the migration is almost ready
+I did some manual curation of about 300 authors in OpenRefine in preparation for telling Peter and Abenet that the migration is almost ready
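 I would still like to perhaps (re)move institutional authors from dc.contributor.author to cg.contributor.affiliation, but I will have to run that by Francesca, Carol, and Abenet
+I could use a custom text facet like this in OpenRefine to find authors that likely match the "Last, F." pattern: isNotNull(value.match(/^.*, \p{Lu}\.?.*$/))
+The \p{Lu} is a cool regex character class to make sure this works for letters with accents
+As cool as that is, it's actually more effective to just search for authors that have "." in them!
+I've decided to add a cg.contributor.affiliation column to 1,025 items based on the logic above where the author name is not an actual person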
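
The two heuristics in the notes (the `\p{Lu}` text facet and the plain search for ".") can be compared outside OpenRefine with a rough Python sketch. This is only an approximation under stated assumptions: Python's built-in `re` module does not support `\p{Lu}`, so the sketch checks the Unicode category with `unicodedata` instead, and the function names and sample author strings are hypothetical, not taken from the real data.

```python
import unicodedata


def looks_like_initial(name: str) -> bool:
    # Approximates the facet regex  ^.*, \p{Lu}\.?.*$  from the notes:
    # because the period is optional and anything may follow, the pattern
    # boils down to "a comma and a space followed by an uppercase letter".
    for i in range(len(name) - 2):
        if name[i:i + 2] == ", " and unicodedata.category(name[i + 2]) == "Lu":
            return True
    return False


def has_period(name: str) -> bool:
    # The simpler heuristic from the notes: the name contains a "." anywhere.
    return "." in name


# Hypothetical sample values, for illustration only.
authors = [
    "Doe, J.",
    "Ólafsson, Þ.",
    "Doe, Jane",
    "Example Research Institute",
]

for author in authors:
    print(f"{author!r}: regex-like={looks_like_initial(author)}, period={has_period(author)}")
```

In this sketch a full given name such as "Doe, Jane" still passes the regex-like check, because the period in the pattern is optional, while the "." search picks out only the abbreviated forms. That may be one reason the simpler search turned out to be more effective, though the notes do not say so explicitly.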