mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2019-09-27
@ -40,7 +40,7 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-09/" />
<meta property="article:published_time" content="2019-09-01T10:17:51+03:00" />
<meta property="article:modified_time" content="2019-09-26T14:21:41+03:00" />
<meta property="article:modified_time" content="2019-09-27T01:20:09+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="September, 2019"/>
@ -85,9 +85,9 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
"@type": "BlogPosting",
"headline": "September, 2019",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-09\/",
"wordCount": "2497",
"wordCount": "2870",
"datePublished": "2019-09-01T10:17:51\x2b03:00",
"dateModified": "2019-09-26T14:21:41\x2b03:00",
"dateModified": "2019-09-27T01:20:09\x2b03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -561,6 +561,56 @@ $ dspace import -a me@cgiar.org -m 2019-09-20-bioversity2.map -s /home/aorth/Bio
<li>I told her to delete one item that appears to be a duplicate, or to fix its citation to be correct if she thinks it is not a duplicate</li>
<li>I deleted another item that I had previously identified as a duplicate that she had fixed by incorrectly deleting the original (ugh)</li>
</ul></li>
<li><p>Get a list of institutions from CCAFS’s Clarisa API and try to parse it with <code>jq</code>, do some small cleanups and add a header in <code>sed</code>, and then pass it through <code>csvcut</code> to add line numbers:</p>
<pre><code>$ cat ~/Downloads/institutions.json| jq '.[] | {name: .name}' | grep name | awk -F: '{print $2}' | sed -e 's/"//g' -e 's/^ //' -e '1iname' | csvcut -l | sed '1s/line_number/id/' > /tmp/clarisa-institutions.csv
$ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institutions-cleaned.csv -u
</code></pre></li>
<li><p>The csv-metadata-quality tool caught a few records with excessive spacing and unnecessary Unicode</p></li>
<li><p>I could potentially use this with reconcile-csv and OpenRefine as a source to validate our institutional authors against…</p></li>
</ul>
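<p>As a rough illustration of the <code>jq</code>, <code>sed</code>, and <code>csvcut</code> pipeline above, the same transformation could be sketched in Python. This is a minimal sketch, not the command that was actually run: it assumes the JSON is a top-level array of objects with a <code>name</code> key (as the <code>jq</code> filter implies), and <code>institutions_to_csv</code> is a hypothetical helper name.</p>

```python
import csv
import io
import json


def institutions_to_csv(json_text: str) -> str:
    """Return CSV text with an "id" column (starting at 1) and a "name" column.

    Hypothetical helper: assumes json_text is a JSON array of objects
    that each have a "name" key.
    """
    institutions = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row, equivalent to adding "name" with sed and renaming
    # csvcut's line_number column to "id".
    writer.writerow(["id", "name"])
    for line_number, institution in enumerate(institutions, start=1):
        # Names are written unchanged so that a later cleanup pass can
        # still detect spacing or Unicode problems.
        writer.writerow([line_number, institution["name"]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = '[{"name": "Example Institute"}, {"name": "Sample University, Testland"}]'
    print(institutions_to_csv(sample), end="")
```

<p>Because this parses the JSON directly instead of splitting lines on colons with <code>awk</code>, a name that contains a colon would not be truncated. Names are left unmodified so that csv-metadata-quality can still flag them afterwards.</p>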
<h2 id="2019-09-27">2019-09-27</h2>
<ul>
<li>Skype with Peter and Abenet about CGSpace actions
<ul>
<li>Peter will respond to ICARDA’s request to deposit items into CGSpace, with the caveat that we agree on some vocabulary standards for institutions, countries, regions, etc.</li>
<li>We discussed using ISO 3166 for countries, though Peter doesn’t like the formal names like “Moldova, Republic of” and “Tanzania, United Republic of”</li>
<li>The Debian <code>iso-codes</code> package has ISO 3166-1 with “common name”, “name”, and “official name” representations, for example:
<ul>
<li>common_name: Tanzania</li>
<li>name: Tanzania, United Republic of</li>
<li>official_name: United Republic of Tanzania</li>
</ul></li>
<li>There are still some unfortunate ones there, though:
<ul>
<li>name: Korea, Democratic People’s Republic of</li>
<li>official_name: Democratic People’s Republic of Korea</li>
</ul></li>
<li>And this, which isn’t even in English…
<ul>
<li>name: Côte d’Ivoire</li>
<li>official_name: Republic of Côte d’Ivoire</li>
</ul></li>
<li>The other alternative is to just keep using the names we have, which are mostly compliant with AGROVOC</li>
<li>Peter said that a new server for DSpace Test is fine, so I can proceed with the normal process of getting approval from Michael Victor and ICT when I have time (recommend moving from $40 to $80/month Linode, with 16GB RAM)</li>
<li>I need to ask Atmire for a quote to upgrade CGSpace to DSpace 6 with all current modules so we can see how many more credits we need</li>
</ul></li>
<li>A little bit more work on the Sept 6 IITA batch records
<ul>
<li>Bosede deleted the one item that I told her was a duplicate</li>
<li>I checked the AGROVOC subjects and fixed one incorrect one</li>
<li>Then I told her that I think the items are ready to go to CGSpace and asked Abenet for a final comment</li>
</ul></li>
</ul>
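<p>The country naming trade-off discussed in the call can be sketched as a simple fallback: prefer <code>common_name</code> when an entry has one, otherwise use <code>name</code>. This is only an illustration, with <code>display_name</code> as a hypothetical helper and the sample entries copied from the notes above rather than read from the <code>iso-codes</code> package.</p>

```python
def display_name(entry: dict) -> str:
    """Pick a label for a country entry: common_name if present, else name.

    Hypothetical helper; the keys mirror the three representations
    listed in the notes (common_name, name, official_name).
    """
    return entry.get("common_name") or entry["name"]


# Sample entries transcribed from the notes, not loaded from iso-codes.
ENTRIES = [
    {
        "common_name": "Tanzania",
        "name": "Tanzania, United Republic of",
        "official_name": "United Republic of Tanzania",
    },
    {
        "name": "Korea, Democratic People’s Republic of",
        "official_name": "Democratic People’s Republic of Korea",
    },
    {
        "name": "Côte d’Ivoire",
        "official_name": "Republic of Côte d’Ivoire",
    },
]

if __name__ == "__main__":
    for entry in ENTRIES:
        print(display_name(entry))
```

<p>Entries without a <code>common_name</code> still fall back to the inverted formal name, which is the remaining problem noted above.</p>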
<!-- vim: set sw=2 ts=2: -->