Add notes for 2020-11-02

This commit is contained in:
2020-11-02 19:34:10 +02:00
parent 286fd3fd3c
commit 2cc9c1f1e3
93 changed files with 926 additions and 797 deletions

View File

@ -9,12 +9,12 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-10-30T15:24:23+02:00" />
<meta property="og:updated_time" content="2020-11-01T13:11:54+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.76.5" />
<meta name="generator" content="Hugo 0.77.0" />
@ -80,6 +80,43 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-01-06">2020-01-06</h2>
<ul>
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than than its DOI
<ul>
<li>The score is now linked to the DOI</li>
<li>Another <a href="https://handle.hdl.net/10568/91278">item</a> that had the same problem in 2019 has now also linked to the score for its DOI</li>
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
</ul>
</li>
</ul>
<h2 id="2020-01-07">2020-01-07</h2>
<ul>
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
<ul>
<li>The DOI has a score of 259, but the Handle has no score at all</li>
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
@ -352,47 +389,6 @@ DELETE 1
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-04/">April, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-04-01T09:00:43+03:00">Mon Apr 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-04-01">2019-04-01</h2>
<ul>
<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
<ul>
<li>They asked if we had plans to enable RDF support in CGSpace</li>
</ul>
</li>
<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today
<ul>
<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li>
</ul>
</li>
</ul>
<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
4432 200
</code></pre><ul>
<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li>
<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2019-04/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/" rel="prev" role="button">Previous page</a>
@ -417,6 +413,8 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2020-11/">November, 2020</a></li>
<li><a href="/cgspace-notes/2020-10/">October, 2020</a></li>
<li><a href="/cgspace-notes/2020-09/">September, 2020</a></li>
@ -425,8 +423,6 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
<li><a href="/cgspace-notes/2020-07/">July, 2020</a></li>
<li><a href="/cgspace-notes/2020-06/">June, 2020</a></li>
</ol>
</section>

View File

@ -9,12 +9,12 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-10-30T15:24:23+02:00" />
<meta property="og:updated_time" content="2020-11-01T13:11:54+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.76.5" />
<meta name="generator" content="Hugo 0.77.0" />
@ -80,6 +80,47 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-04/">April, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-04-01T09:00:43+03:00">Mon Apr 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-04-01">2019-04-01</h2>
<ul>
<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
<ul>
<li>They asked if we had plans to enable RDF support in CGSpace</li>
</ul>
</li>
<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today
<ul>
<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li>
</ul>
</li>
</ul>
<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
4432 200
</code></pre><ul>
<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li>
<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2019-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-03/">March, 2019</a></h2>
@ -349,44 +390,6 @@ sys 0m1.979s
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-06/">June, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-06-04T19:49:54-07:00">Mon Jun 04, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-06-04">2018-06-04</h2>
<ul>
<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>)
<ul>
<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li>
</ul>
</li>
<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li>
<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
</code></pre><ul>
<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="/cgspace-notes/2018-03/">March, 2018</a></li>
<li>Time to index ~70,000 items on CGSpace:</li>
</ul>
<pre><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
real 74m42.646s
user 8m5.056s
sys 2m7.289s
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2018-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/2/" rel="prev" role="button">Previous page</a>
@ -411,6 +414,8 @@ sys 2m7.289s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2020-11/">November, 2020</a></li>
<li><a href="/cgspace-notes/2020-10/">October, 2020</a></li>
<li><a href="/cgspace-notes/2020-09/">September, 2020</a></li>
@ -419,8 +424,6 @@ sys 2m7.289s
<li><a href="/cgspace-notes/2020-07/">July, 2020</a></li>
<li><a href="/cgspace-notes/2020-06/">June, 2020</a></li>
</ol>
</section>

View File

@ -9,12 +9,12 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-10-30T15:24:23+02:00" />
<meta property="og:updated_time" content="2020-11-01T13:11:54+02:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/>
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
<meta name="generator" content="Hugo 0.76.5" />
<meta name="generator" content="Hugo 0.77.0" />
@ -80,6 +80,44 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-06/">June, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-06-04T19:49:54-07:00">Mon Jun 04, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-06-04">2018-06-04</h2>
<ul>
<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>)
<ul>
<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li>
</ul>
</li>
<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li>
<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
</code></pre><ul>
<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="/cgspace-notes/2018-03/">March, 2018</a></li>
<li>Time to index ~70,000 items on CGSpace:</li>
</ul>
<pre><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
real 74m42.646s
user 8m5.056s
sys 2m7.289s
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2018-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-05/">May, 2018</a></h2>
@ -385,6 +423,8 @@ COPY 54701
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2020-11/">November, 2020</a></li>
<li><a href="/cgspace-notes/2020-10/">October, 2020</a></li>
<li><a href="/cgspace-notes/2020-09/">September, 2020</a></li>
@ -393,8 +433,6 @@ COPY 54701
<li><a href="/cgspace-notes/2020-07/">July, 2020</a></li>
<li><a href="/cgspace-notes/2020-06/">June, 2020</a></li>
</ol>
</section>