diff --git a/content/posts/2018-08.md b/content/posts/2018-08.md
index 559c3c37b..213404083 100644
--- a/content/posts/2018-08.md
+++ b/content/posts/2018-08.md
@@ -52,11 +52,22 @@ tags: ["Notes"]
 - Run through Peter's list of author affiliations from earlier this month
 - I did some quick sanity checks and small cleanups in Open Refine, checking for spaces, weird accents, and encoding errors
-- Finally I ran the [`fix-metadata-value.py`](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script:
+- Finally I did a test run with the [`fix-metadata-value.py`](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script:

 ```
 $ ./fix-metadata-values.py -i 2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
 $ ./delete-metadata-values.py -i 2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
 ```

+## 2018-08-16
+
+- Generate a list of the top 1,500 authors on CGSpace for Sisay so he can create the controlled vocabulary:
+
+```
+dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc limit 1500) to /tmp/2018-08-16-top-1500-authors.csv with csv;
+```
+
+- Start working on adding the ORCID metadata to a handful of CIAT authors as requested by Elizabeth earlier this month
+- I might need to overhaul the [add-orcid-identifiers-csv.py](https://gist.github.com/alanorth/a49d85cd9c5dea89cddbe809813a7050) script to be a little more robust about author order and ORCID metadata that might have been altered manually by editors after submission, as this script was written without that consideration
+
diff --git a/docs/2018-08/index.html b/docs/2018-08/index.html
index b6675f305..aa1715128 100644
--- a/docs/2018-08/index.html
+++ b/docs/2018-08/index.html
@@ -34,7 +34,7 @@ I ran all system updates on DSpace Test and rebooted it
-
+
@@ -79,9 +79,9 @@ I ran all system updates on DSpace Test and rebooted it
   "@type": "BlogPosting",
   "headline": "August, 2018",
   "url": "https://alanorth.github.io/cgspace-notes/2018-08/",
-  "wordCount": "527",
+  "wordCount": "649",
   "datePublished": "2018-08-01T11:52:54+03:00",
-  "dateModified": "2018-08-02T14:29:59+03:00",
+  "dateModified": "2018-08-15T10:56:38+01:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -206,13 +206,27 @@ I ran all system updates on DSpace Test and rebooted it
fix-metadata-value.py script:
fix-metadata-value.py script:
$ ./fix-metadata-values.py -i 2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
$ ./delete-metadata-values.py -i 2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
+dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc limit 1500) to /tmp/2018-08-16-top-1500-authors.csv with csv;
+
+
+
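
Note on the `\copy` export added in this diff: because the query uses `with csv` and no header option, the resulting file should be a header-less CSV with two columns, `text_value` and the count. The following is only a minimal sketch of how that file could be read back when preparing the controlled vocabulary. The function name `read_top_authors` and the sample names are hypothetical and are not part of the scripts referenced in the notes.

```python
import csv
import io


def read_top_authors(csv_file, limit=1500):
    """Return (name, count) pairs from a header-less two-column CSV.

    Hypothetical helper: assumes each row is `text_value,count`, which is
    the shape the \\copy query in the notes would be expected to produce.
    """
    authors = []
    for row in csv.reader(csv_file):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        name, count = row[0].strip(), int(row[1])
        authors.append((name, count))
    # The query already orders by count, but re-sort defensively
    # (the sort is stable, so ties keep their original order).
    authors.sort(key=lambda pair: pair[1], reverse=True)
    return authors[:limit]


# Made-up sample rows standing in for the exported file
sample = io.StringIO('"Doe, Jane",12\n"Roe, Richard",30\n')
for name, count in read_top_authors(sample):
    print(f"{count}\t{name}")
```

With the real export, the same function could be called on `open("/tmp/2018-08-16-top-1500-authors.csv", newline="")`. Author names contain commas, so parsing with the `csv` module rather than splitting on `,` avoids breaking quoted values.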