Add notes for 2020-01-27

2020-01-27 16:20:44 +02:00
parent 207ace0883
commit 8feb93be39
112 changed files with 11466 additions and 5158 deletions

@@ -31,7 +31,7 @@ Run system updates on CGSpace (linode18) and reboot it
Skype with Marie-Angélique and Abenet about CG Core v2
"/>
-<meta name="generator" content="Hugo 0.62.2" />
+<meta name="generator" content="Hugo 0.63.1" />
@@ -61,7 +61,7 @@ Skype with Marie-Angélique and Abenet about CG Core v2
<!-- combined, minified CSS -->
-<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy&#43;piAwENoVPTw=" crossorigin="anonymous">
+<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I&#43;LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@@ -108,7 +108,7 @@ Skype with Marie-Angélique and Abenet about CG Core v2
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51&#43;03:00">Sun Jun 02, 2019</time> by Alan Orth in
-<i class="fa fa-folder" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
+<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
</p>
@@ -172,16 +172,16 @@ Skype with Marie-Angélique and Abenet about CG Core v2
<li>Create a new AReS repository: <a href="https://github.com/ilri/AReS">https://github.com/ilri/AReS</a></li>
<li>Start looking at the 203 IITA records on DSpace Test from last month (<a href="https://dspacetest.cgiar.org/handle/10568/102032">IITA_May_16</a> aka &ldquo;20194th.xls&rdquo;) using OpenRefine
<ul>
-<li>Trim leading, trailing, and consecutive whitespace on all columns, but I didn't notice very many issues</li>
+<li>Trim leading, trailing, and consecutive whitespace on all columns, but I didn&rsquo;t notice very many issues</li>
<li>Validate affiliations against latest list of top 1500 terms using reconcile-csv, correcting and standardizing about twenty-seven</li>
<li>Validate countries against latest list of countries using reconcile-csv, correcting three</li>
-<li>Convert all DOIs to &ldquo;<a href="https://dx.doi.org%22">https://dx.doi.org&quot;</a> format</li>
+<li>Convert all DOIs to &ldquo;<a href="https://dx.doi.org">https://dx.doi.org</a>&rdquo; format</li>
<li>Normalize all <code>cg.identifier.url</code> Google book fields to &ldquo;books.google.com&rdquo;</li>
<li>Correct some inconsistencies in IITA subjects</li>
<li>Correct two incorrect &ldquo;Peer Review&rdquo; in <code>dc.description.version</code></li>
<li>About fifteen items have incorrect ISBNs (looks like an Excel error because the values look like scientific numbers)</li>
<li>Delete one blank item</li>
-<li>I managed to get to subjects, so I'll continue from there when I start working next</li>
+<li>I managed to get to subjects, so I&rsquo;ll continue from there when I start working next</li>
</ul>
</li>
<li>Generate a new list of countries from the database for use with reconcile-csv
@@ -194,7 +194,7 @@ Skype with Marie-Angélique and Abenet about CG Core v2
COPY 192
$ csvcut -l -c 0 /tmp/countries.csv &gt; 2019-06-10-countries.csv
</code></pre><ul>
-<li>Get a list of all the unique AGROVOC subject terms in IITA's data and export it to a text file so I can validate them with my <code>agrovoc-lookup.py</code> script:</li>
+<li>Get a list of all the unique AGROVOC subject terms in IITA&rsquo;s data and export it to a text file so I can validate them with my <code>agrovoc-lookup.py</code> script:</li>
</ul>
<pre><code>$ csvcut -c dc.subject ~/Downloads/2019-06-10-IITA-20194th-Round-2.csv| sed 's/||/\n/g' | grep -v dc.subject | sort -u &gt; iita-agrovoc.txt
$ ./agrovoc-lookup.py -i iita-agrovoc.txt -om iita-agrovoc-matches.txt -or iita-agrovoc-rejects.txt
@@ -251,9 +251,9 @@ UPDATE 2
<li>Lots of variation in affiliations, for example:
<ul>
<li>Université Abomey-Calavi</li>
-<li>Université d'Abomey</li>
-<li>Université d'Abomey Calavi</li>
-<li>Université d'Abomey-Calavi</li>
+<li>Université d&rsquo;Abomey</li>
+<li>Université d&rsquo;Abomey Calavi</li>
+<li>Université d&rsquo;Abomey-Calavi</li>
<li>University of Abomey-Calavi</li>
</ul>
</li>