Add notes for 2020-11-30

2020-11-30 20:12:55 +02:00
parent f89a296d67
commit 8cbcdd15e9
28 changed files with 104 additions and 40 deletions


@@ -17,7 +17,7 @@ So far we’ve spent at least fifty hours to process the statistics and stat
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-11/" />
<meta property="article:published_time" content="2020-11-01T13:11:54+02:00" />
<meta property="article:modified_time" content="2020-11-29T08:40:58+02:00" />
<meta property="article:modified_time" content="2020-11-29T14:50:02+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="November, 2020"/>
@@ -39,9 +39,9 @@ So far we&rsquo;ve spent at least fifty hours to process the statistics and stat
"@type": "BlogPosting",
"headline": "November, 2020",
"url": "https://alanorth.github.io/cgspace-notes/2020-11/",
"wordCount": "3438",
"wordCount": "3655",
"datePublished": "2020-11-01T13:11:54+02:00",
"dateModified": "2020-11-29T08:40:58+02:00",
"dateModified": "2020-11-29T14:50:02+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -677,7 +677,37 @@ COPY 87411
</ul>
</li>
</ul>
<!-- raw HTML omitted -->
<h2 id="2020-11-30">2020-11-30</h2>
<ul>
<li>Ben Hack asked for the list of ILRI subjects we are using on CGSpace
<ul>
<li>I linked him to the <code>input-forms.xml</code> file and also sent him a list of 112 terms extracted with the <code>xml</code> tool from the xmlstarlet package:</li>
</ul>
</li>
</ul>
<pre><code>$ xml sel -t -m '//value-pairs[@value-pairs-name=&quot;ilrisubject&quot;]/pair/displayed-value/text()' -c '.' -n dspace/config/input-forms.xml
</code></pre><ul>
<li>IWMI sent me a few new ORCID identifiers, so I combined them with our existing ones and with another ILRI one that Tezira asked me to update, filtered the unique ones, and then resolved their names using my <code>resolve-orcids.py</code> script:</li>
</ul>
<pre><code>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/iwmi-orcids.txt /tmp/hung.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq &gt; /tmp/2020-11-30-combined-orcids.txt
$ ./resolve-orcids.py -i /tmp/2020-11-30-combined-orcids.txt -o /tmp/2020-11-30-combined-orcids-names.txt -d
# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
</code></pre><ul>
<li>I used my <code>fix-metadata-values.py</code> script to update the old occurrences of Hung&rsquo;s ORCID and some others that I see have changed:</li>
</ul>
<pre><code>$ cat 2020-11-30-fix-hung-orcid.csv
cg.creator.id,correct
&quot;Hung Nguyen-Viet: 0000-0001-9877-0596&quot;,&quot;Hung Nguyen-Viet: 0000-0003-1549-2733&quot;
&quot;Adriana Tofiño: 0000-0001-7115-7169&quot;,&quot;Adriana Tofiño Rivera: 0000-0001-7115-7169&quot;
&quot;Cristhian Puerta Rodriguez: 0000-0001-5992-1697&quot;,&quot;David Puerta: 0000-0001-5992-1697&quot;
&quot;Ermias Betemariam: 0000-0002-1955-6995&quot;,&quot;Ermias Aynekulu: 0000-0002-1955-6995&quot;
&quot;Hirut Betaw: 0000-0002-1205-3711&quot;,&quot;Betaw Hirut: 0000-0002-1205-3711&quot;
&quot;Megan Zandstra: 0000-0002-3326-6492&quot;,&quot;Megan McNeil Zandstra: 0000-0002-3326-6492&quot;
&quot;Tolu Eyinla: 0000-0003-1442-4392&quot;,&quot;Toluwalope Emmanuel: 0000-0003-1442-4392&quot;
&quot;VInay Nangia: 0000-0001-5148-8614&quot;,&quot;Vinay Nangia: 0000-0001-5148-8614&quot;
$ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspacetest -p 'dom@in34sniper' -f cg.creator.id -t 'correct' -m 240
</code></pre><!-- raw HTML omitted -->
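<p>The extract-and-deduplicate step of the ORCID pipeline above can be tried on its own with a minimal sketch like the following. The <code>extract_orcids</code> function name and the sample input lines are made up for illustration, and the pattern is a slightly stricter variant of the <code>[A-Z0-9]{4}</code> groups used in the notes: it assumes an identifier is digits only, apart from an optional <code>X</code> checksum in the last position.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original notes): print the unique
# ORCID identifiers found on stdin, one per line, in sorted order.
extract_orcids() {
    # -o prints only the matched text, -E enables extended regular expressions.
    # Assumption: four groups of four digits, where the final character
    # may be the checksum letter X.
    grep -oE '[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]' | sort -u
}

# Illustrative input only: the same identifier appears in two different
# formats, so it should be reported once.
printf '%s\n' \
    'Vinay Nangia: 0000-0001-5148-8614' \
    'https://orcid.org/0000-0001-5148-8614' \
    'Hung Nguyen-Viet: 0000-0003-1549-2733' \
  | extract_orcids
```

<p>Using <code>sort -u</code> is equivalent to <code>sort | uniq</code> here. The sample should print two identifiers, because the duplicate is collapsed.</p>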