Add notes for 2020-01-27

2020-01-27 16:20:44 +02:00
parent 207ace0883
commit 8feb93be39
112 changed files with 11466 additions and 5158 deletions


@ -8,7 +8,7 @@
<meta property="og:title" content="August, 2019" />
<meta property="og:description" content="2019-08-03
Look at Bioversity&#39;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;
Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;
2019-08-04
@ -16,7 +16,7 @@ Deploy ORCID identifier updates requested by Bioversity to CGSpace
Run system updates on CGSpace (linode18) and reboot it
Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;
After rebooting, all statistics cores were loaded&hellip; wow, that&#39;s lucky.
After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.
Run system updates on DSpace Test (linode19) and reboot it
@ -30,7 +30,7 @@ Run system updates on DSpace Test (linode19) and reboot it
<meta name="twitter:title" content="August, 2019"/>
<meta name="twitter:description" content="2019-08-03
Look at Bioversity&#39;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;
Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;
2019-08-04
@ -38,12 +38,12 @@ Deploy ORCID identifier updates requested by Bioversity to CGSpace
Run system updates on CGSpace (linode18) and reboot it
Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;
After rebooting, all statistics cores were loaded&hellip; wow, that&#39;s lucky.
After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.
Run system updates on DSpace Test (linode19) and reboot it
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@ -73,7 +73,7 @@ Run system updates on DSpace Test (linode19) and reboot it
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy&#43;piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I&#43;LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@ -120,14 +120,14 @@ Run system updates on DSpace Test (linode19) and reboot it
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-08/">August, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-08-03T12:39:51&#43;03:00">Sat Aug 03, 2019</time> by Alan Orth in
<i class="fa fa-folder" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-08-03">2019-08-03</h2>
<ul>
<li>Look at Bioversity's latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;</li>
<li>Look at Bioversity&rsquo;s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name&hellip;</li>
</ul>
<h2 id="2019-08-04">2019-08-04</h2>
<ul>
@ -135,7 +135,7 @@ Run system updates on DSpace Test (linode19) and reboot it
<li>Run system updates on CGSpace (linode18) and reboot it
<ul>
<li>Before updating it I checked Solr and verified that all statistics cores were loaded properly&hellip;</li>
<li>After rebooting, all statistics cores were loaded&hellip; wow, that's lucky.</li>
<li>After rebooting, all statistics cores were loaded&hellip; wow, that&rsquo;s lucky.</li>
</ul>
</li>
<li>Run system updates on DSpace Test (linode19) and reboot it</li>
@ -199,7 +199,7 @@ Run system updates on DSpace Test (linode19) and reboot it
isNotNull(value.match(/^.*û.*$/))
).toString()
</code></pre><ul>
<li>I tried to extract the filenames and construct a URL to download the PDFs with my <code>generate-thumbnails.py</code> script, but there seem to be several paths for PDFs so I can't guess it properly</li>
<li>I tried to extract the filenames and construct a URL to download the PDFs with my <code>generate-thumbnails.py</code> script, but there seem to be several paths for PDFs so I can&rsquo;t guess it properly</li>
<li>I will have to wait for Francesco to respond about the PDFs, or perhaps proceed with a metadata-only upload so we can do other checks on DSpace Test</li>
</ul>
<h2 id="2019-08-06">2019-08-06</h2>
@ -231,7 +231,7 @@ Run system updates on DSpace Test (linode19) and reboot it
<pre><code># /opt/certbot-auto renew --standalone --pre-hook &quot;/usr/bin/docker stop angular_nginx; /bin/systemctl stop firewalld&quot; --post-hook &quot;/bin/systemctl start firewalld; /usr/bin/docker start angular_nginx&quot;
</code></pre><ul>
<li>It is important that the firewall starts back up before the Docker container or else Docker will complain about missing iptables chains</li>
<li>Also, I updated to the latest TLS Intermediate settings as appropriate for Ubuntu 18.04's <a href="https://ssl-config.mozilla.org/#server=nginx&amp;server-version=1.16.0&amp;config=intermediate&amp;openssl-version=1.1.0g&amp;hsts=false&amp;ocsp=false">OpenSSL 1.1.0g with nginx 1.16.0</a></li>
<li>Also, I updated to the latest TLS Intermediate settings as appropriate for Ubuntu 18.04&rsquo;s <a href="https://ssl-config.mozilla.org/#server=nginx&amp;server-version=1.16.0&amp;config=intermediate&amp;openssl-version=1.1.0g&amp;hsts=false&amp;ocsp=false">OpenSSL 1.1.0g with nginx 1.16.0</a></li>
<li>Run all system updates on AReS dev server (linode20) and reboot it</li>
<li>Get a list of all PDFs from the Bioversity migration that fail to download and save them so I can try again with a different path in the URL:</li>
</ul>
@ -253,7 +253,7 @@ $ ./generate-thumbnails.py -i /tmp/user-upload2.csv -w --url-field-name url -d |
</ul>
</li>
<li>
<p>Even so, there are still 52 items with incorrect filenames, so I can't derive their PDF URLs&hellip;</p>
<p>Even so, there are still 52 items with incorrect filenames, so I can&rsquo;t derive their PDF URLs&hellip;</p>
<ul>
<li>For example, <code>Wild_cherry_Prunus_avium_859.pdf</code> is here (with double underscore): <a href="https://www.bioversityinternational.org/fileadmin/_migrated/uploads/tx_news/Wild_cherry__Prunus_avium__859.pdf">https://www.bioversityinternational.org/fileadmin/_migrated/uploads/tx_news/Wild_cherry__Prunus_avium__859.pdf</a></li>
</ul>
@ -348,7 +348,7 @@ $ ~/dspace/bin/dspace metadata-import -f /tmp/bioversity.csv -e blah@blah.com
<ul>
<li>I imported the 1,427 Bioversity records into DSpace Test
<ul>
<li>To make sure we didn't have memory issues I reduced Tomcat's JVM heap by 512m, increased the import process's heap to 512m, and split the input file into two parts with about 700 records each</li>
<li>To make sure we didn&rsquo;t have memory issues I reduced Tomcat&rsquo;s JVM heap by 512m, increased the import process&rsquo;s heap to 512m, and split the input file into two parts with about 700 records each</li>
<li>Then I had to create a few new temporary collections on DSpace Test that had been created on CGSpace after our last sync</li>
<li>After that the import succeeded:</li>
</ul>
@ -395,8 +395,8 @@ return os.path.basename(value)
</ul>
<h2 id="2019-08-21">2019-08-21</h2>
<ul>
<li>Upload <a href="https://github.com/ilri/csv-metadata-quality">csv-metadata-quality repository to ILRI's GitHub organization</a></li>
<li>Fix a few invalid countries in IITA's <a href="https://dspacetest.cgiar.org/handle/10568/102361">July 29</a> records (aka &ldquo;20195TH.xls&rdquo;)
<li>Upload <a href="https://github.com/ilri/csv-metadata-quality">csv-metadata-quality repository to ILRI&rsquo;s GitHub organization</a></li>
<li>Fix a few invalid countries in IITA&rsquo;s <a href="https://dspacetest.cgiar.org/handle/10568/102361">July 29</a> records (aka &ldquo;20195TH.xls&rdquo;)
<ul>
<li>These were not caught by my csv-metadata-quality check script because of a logic error</li>
<li>Remove <code>dc.identifier.uri</code> fields from test data, set <code>id</code> values to &ldquo;-1&rdquo;, add collection mappings according to <code>dc.type</code>, and upload 126 IITA records to CGSpace</li>
@ -439,13 +439,13 @@ sys 2m24.715s
<li>
<p>Peter asked me to add related citation aka <code>cg.link.citation</code> to the item view</p>
<ul>
<li>I created a <a href="https://github.com/ilri/DSpace/pull/430">pull request</a> with a draft implementation and asked for Peter's feedback</li>
<li>I created a <a href="https://github.com/ilri/DSpace/pull/430">pull request</a> with a draft implementation and asked for Peter&rsquo;s feedback</li>
</ul>
</li>
<li>
<p>Add the ability to skip certain fields from the csv-metadata-quality script using <code>--exclude-fields</code></p>
<ul>
<li>For example, when I'm working on the author corrections I want to do the basic checks on the corrected fields, but not on the original fields, so I would use <code>--exclude-fields dc.contributor.author</code></li>
<li>For example, when I&rsquo;m working on the author corrections I want to do the basic checks on the corrected fields, but not on the original fields, so I would use <code>--exclude-fields dc.contributor.author</code></li>
</ul>
</li>
</ul>
@ -493,7 +493,7 @@ COPY 65597
<ul>
<li>Resume working on the CG Core v2 changes in the <code>5_x-cgcorev2</code> branch again
<ul>
<li>I notice that CG Core doesn't currently have a field for CGSpace's &ldquo;alternative title&rdquo; (<code>dc.title.alternative</code>), but DCTERMS has <code>dcterms.alternative</code> so I <a href="https://github.com/AgriculturalSemantics/cg-core/issues/9">raised an issue about adding it</a></li>
<li>I notice that CG Core doesn&rsquo;t currently have a field for CGSpace&rsquo;s &ldquo;alternative title&rdquo; (<code>dc.title.alternative</code>), but DCTERMS has <code>dcterms.alternative</code> so I <a href="https://github.com/AgriculturalSemantics/cg-core/issues/9">raised an issue about adding it</a></li>
<li>Marie responded and said she would add <code>dcterms.alternative</code></li>
<li>I created a sed script file to perform some replacements of metadata on the XMLUI XSL files:</li>
</ul>
@ -521,7 +521,7 @@ COPY 65597
</ul>
<pre><code>&quot;handles&quot;:[&quot;10986/30568&quot;,&quot;10568/97825&quot;],&quot;handle&quot;:&quot;10986/30568&quot;
</code></pre><ul>
<li>So this is the same issue we had before, where Altmetric <em>knows</em> this Handle is associated with a DOI that has a score, but the client-side JavaScript code doesn't show it because it seems to be a secondary handle or something</li>
<li>So this is the same issue we had before, where Altmetric <em>knows</em> this Handle is associated with a DOI that has a score, but the client-side JavaScript code doesn&rsquo;t show it because it seems to be a secondary handle or something</li>
</ul>
<h2 id="2019-08-31">2019-08-31</h2>
<ul>