Update notes for 2018-08-16

Alan Orth 2018-08-16 15:40:38 +03:00
parent 8a08555e8f
commit 23c81efa55
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
4 changed files with 41 additions and 16 deletions


@ -52,11 +52,22 @@ tags: ["Notes"]
- Run through Peter's list of author affiliations from earlier this month
- I did some quick sanity checks and small cleanups in OpenRefine, checking for spaces, weird accents, and encoding errors
- Finally I ran the [`fix-metadata-value.py`](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script:
- Finally I did a test run with the [`fix-metadata-values.py`](https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897) script:
```
$ ./fix-metadata-values.py -i 2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
$ ./delete-metadata-values.py -i 2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
```
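- The sanity checks above (spaces, weird accents, encoding errors) could also be scripted. A minimal Python sketch, assuming the corrections CSV has a header row with the `cg.contributor.affiliation` and `correct` columns implied by the `-f` and `-t` options; the function name and the sample rows are hypothetical, not taken from the real file:
```python
import csv
import io
import unicodedata


def find_suspect_values(rows, column):
    """Return (line_number, value, reasons) for values that look dirty."""
    suspects = []
    # Line 1 of the CSV is the header, so data rows start at line 2
    for line_number, row in enumerate(rows, start=2):
        value = row[column]
        reasons = []
        if value != value.strip():
            reasons.append("leading/trailing whitespace")
        if "  " in value:
            reasons.append("double space")
        if "\ufffd" in value:
            reasons.append("replacement character (encoding error)")
        if value != unicodedata.normalize("NFC", value):
            reasons.append("not NFC-normalized (decomposed accent)")
        if reasons:
            suspects.append((line_number, value, reasons))
    return suspects


# Hypothetical sample data standing in for the real affiliations CSV
sample = io.StringIO(
    "cg.contributor.affiliation,correct\n"
    "Example Research Institute ,Example Research Institute\n"
    "Universit\ufffd Example,Universite Example\n"
    "Example University,Example University\n"
)
reader = csv.DictReader(sample)
for line_number, value, reasons in find_suspect_values(reader, "cg.contributor.affiliation"):
    print(line_number, repr(value), reasons)
```
- This only flags values for manual review; it does not change anything in the CSV or the database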
## 2018-08-16
- Generate a list of the top 1,500 authors on CGSpace for Sisay so he can create the controlled vocabulary:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc limit 1500) to /tmp/2018-08-16-top-1500-authors.csv with csv;
```
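- The shape of that query can be tried out without touching the real database. A minimal sketch using Python's built-in `sqlite3` module, assuming a cut-down `metadatavalue` table with only the three columns the query touches; the field ID `3` and the author names are made up for illustration, and the explicit `AS n` alias is there because SQLite does not name the aggregate column `count` the way PostgreSQL does:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the DSpace table, not the real schema
conn.execute(
    "CREATE TABLE metadatavalue ("
    "text_value TEXT, metadata_field_id INTEGER, resource_type_id INTEGER)"
)
rows = (
    [("Author A", 3, 2)] * 3
    + [("Author B", 3, 2)] * 2
    + [("Author C", 3, 2)]
    + [("Author A", 3, 3)]   # not an item (resource_type_id != 2), ignored
    + [("Author C", 64, 2)]  # a different metadata field, ignored
)
conn.executemany("INSERT INTO metadatavalue VALUES (?, ?, ?)", rows)


def top_authors(conn, field_id, limit):
    """Count item-level values of one metadata field, most frequent first."""
    query = """
        SELECT text_value, COUNT(*) AS n
        FROM metadatavalue
        WHERE metadata_field_id = ? AND resource_type_id = 2
        GROUP BY text_value
        ORDER BY n DESC, text_value
        LIMIT ?
    """
    return conn.execute(query, (field_id, limit)).fetchall()


for name, n in top_authors(conn, 3, 2):
    print(name, n)
```
- Because of the `GROUP BY text_value`, each name already appears only once in the result, so the `distinct` in the original query does not change the output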
- Start working on adding the ORCID metadata to a handful of CIAT authors as requested by Elizabeth earlier this month
- I might need to overhaul the [add-orcid-identifiers-csv.py](https://gist.github.com/alanorth/a49d85cd9c5dea89cddbe809813a7050) script to make it more robust about author order and about ORCID metadata that editors may have altered manually after submission, since it was written without those cases in mind
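- One possible direction for that overhaul, sketched in Python. This is not how the existing script works; the function name, the name-to-ORCID mapping, and the placeholder identifiers are all hypothetical. The idea is to match ORCID identifiers to authors by name and to check what is already on the item, rather than relying on the position of each author:
```python
def missing_orcid_values(item_authors, existing_orcids, name_to_orcid):
    """Return the ORCID identifiers to add to one item.

    item_authors:    author names on the item, in their current order
    existing_orcids: identifiers already on the item (possibly edited by hand)
    name_to_orcid:   mapping of author name to identifier (hypothetical input)
    """
    present = set(existing_orcids)
    to_add = []
    for author in item_authors:
        orcid = name_to_orcid.get(author)
        # Skip authors we have no identifier for, and identifiers that are
        # already on the item regardless of where they sit in the list
        if orcid is not None and orcid not in present:
            to_add.append(orcid)
            present.add(orcid)
    return to_add


# Placeholder names and identifiers, for illustration only
mapping = {
    "Author A": "0000-0000-0000-0001",
    "Author B": "0000-0000-0000-0002",
}
print(missing_orcid_values(
    ["Author B", "Author X", "Author A"],
    ["0000-0000-0000-0001"],
    mapping,
))
```
- With this approach, running the function twice on the same item adds nothing the second time, which matters if editors have already fixed some records by hand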
<!-- vim: set sw=2 ts=2: -->


@ -34,7 +34,7 @@ I ran all system updates on DSpace Test and rebooted it
<meta property="article:published_time" content="2018-08-01T11:52:54&#43;03:00"/>
<meta property="article:modified_time" content="2018-08-02T14:29:59&#43;03:00"/>
<meta property="article:modified_time" content="2018-08-15T10:56:38&#43;01:00"/>
@ -79,9 +79,9 @@ I ran all system updates on DSpace Test and rebooted it
"@type": "BlogPosting",
"headline": "August, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-08/",
"wordCount": "527",
"wordCount": "649",
"datePublished": "2018-08-01T11:52:54&#43;03:00",
"dateModified": "2018-08-02T14:29:59&#43;03:00",
"dateModified": "2018-08-15T10:56:38&#43;01:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -206,13 +206,27 @@ I ran all system updates on DSpace Test and rebooted it
<ul>
<li>Run through Peter&rsquo;s list of author affiliations from earlier this month</li>
<li>I did some quick sanity checks and small cleanups in OpenRefine, checking for spaces, weird accents, and encoding errors</li>
<li>Finally I ran the <a href="https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897"><code>fix-metadata-value.py</code></a> script:</li>
<li>Finally I did a test run with the <a href="https://gist.github.com/alanorth/df92cbfb54d762ba21b28f7cd83b6897"><code>fix-metadata-values.py</code></a> script:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i 2018-08-15-Correct-1083-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -t correct -m 211
$ ./delete-metadata-values.py -i 2018-08-15-Remove-11-Affiliations.csv -db dspace -u dspace -p 'fuuu' -f cg.contributor.affiliation -m 211
</code></pre>
<h2 id="2018-08-16">2018-08-16</h2>
<ul>
<li>Generate a list of the top 1,500 authors on CGSpace for Sisay so he can create the controlled vocabulary:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc limit 1500) to /tmp/2018-08-16-top-1500-authors.csv with csv;
</code></pre>
<ul>
<li>Start working on adding the ORCID metadata to a handful of CIAT authors as requested by Elizabeth earlier this month</li>
<li>I might need to overhaul the <a href="https://gist.github.com/alanorth/a49d85cd9c5dea89cddbe809813a7050">add-orcid-identifiers-csv.py</a> script to make it more robust about author order and about ORCID metadata that editors may have altered manually after submission, since it was written without those cases in mind</li>
</ul>
<!-- vim: set sw=2 ts=2: -->


@ -38,7 +38,7 @@ Disallow: /cgspace-notes/2015-12/
Disallow: /cgspace-notes/2015-11/
Disallow: /cgspace-notes/
Disallow: /cgspace-notes/categories/
Disallow: /cgspace-notes/tags/notes/
Disallow: /cgspace-notes/categories/notes/
Disallow: /cgspace-notes/tags/notes/
Disallow: /cgspace-notes/posts/
Disallow: /cgspace-notes/tags/


@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-08/</loc>
<lastmod>2018-08-02T14:29:59+03:00</lastmod>
<lastmod>2018-08-15T10:56:38+01:00</lastmod>
</url>
<url>
@ -179,7 +179,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-08-02T14:29:59+03:00</lastmod>
<lastmod>2018-08-15T10:56:38+01:00</lastmod>
<priority>0</priority>
</url>
@ -188,27 +188,27 @@
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-08-02T14:29:59+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2018-03-09T22:10:33+02:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-08-15T10:56:38+01:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-08-02T14:29:59+03:00</lastmod>
<lastmod>2018-08-15T10:56:38+01:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-08-02T14:29:59+03:00</lastmod>
<lastmod>2018-08-15T10:56:38+01:00</lastmod>
<priority>0</priority>
</url>