cgspace-notes/docs/2021-04/index.html

1097 lines
89 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="April, 2021" />
<meta property="og:description" content="2021-04-01
I wrote a script to query Sherpa&rsquo;s API for our ISSNs: sherpa-issn-lookup.py
I&rsquo;m curious to see how the results compare with the results from Crossref yesterday
AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal
I simply took everything down with docker-compose and then back up, and then it was OK
Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-04/" />
<meta property="article:published_time" content="2021-04-01T09:50:54+03:00" />
<meta property="article:modified_time" content="2021-04-28T18:57:48+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="April, 2021"/>
<meta name="twitter:description" content="2021-04-01
I wrote a script to query Sherpa&rsquo;s API for our ISSNs: sherpa-issn-lookup.py
I&rsquo;m curious to see how the results compare with the results from Crossref yesterday
AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal
I simply took everything down with docker-compose and then back up, and then it was OK
Perhaps one of the containers crashed, I should have looked closer but I was in a hurry
"/>
<meta name="generator" content="Hugo 0.99.1" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "April, 2021",
"url": "https://alanorth.github.io/cgspace-notes/2021-04/",
"wordCount": "4668",
"datePublished": "2021-04-01T09:50:54+03:00",
"dateModified": "2021-04-28T18:57:48+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2021-04/">
<title>April, 2021 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.beb8012edc08ba10be012f079d618dc243812267efe62e11f22fe49618f976a4.css" rel="stylesheet" integrity="sha256-vrgBLtwIuhC&#43;AS8HnWGNwkOBImfv5i4R8i/klhj5dqQ=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz&#43;lcnA=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-04/">April, 2021</a></h2>
<p class="blog-post-meta">
<time datetime="2021-04-01T09:50:54+03:00">Thu Apr 01, 2021</time>
in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-04-01">2021-04-01</h2>
<ul>
<li>I wrote a script to query Sherpa&rsquo;s API for our ISSNs: <code>sherpa-issn-lookup.py</code>
<ul>
<li>I&rsquo;m curious to see how the results compare with the results from Crossref yesterday</li>
</ul>
</li>
<li>AReS Explorer was down since this morning, I didn&rsquo;t see anything in the systemd journal
<ul>
<li>I simply took everything down with docker-compose and then back up, and then it was OK</li>
<li>Perhaps one of the containers crashed, I should have looked closer but I was in a hurry</li>
</ul>
</li>
</ul>
<h2 id="2021-04-03">2021-04-03</h2>
<ul>
<li>Biruk from ICT contacted me to say that some CGSpace users still can&rsquo;t log in
<ul>
<li>I guess the CGSpace LDAP bind account is really still locked after last week&rsquo;s reset</li>
<li>He fixed the account and then I was finally able to bind and query:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b <span style="color:#e6db74">&#34;dc=cgiarad,dc=org&#34;</span> -D <span style="color:#e6db74">&#34;cgspace-account&#34;</span> -W <span style="color:#e6db74">&#34;(sAMAccountName=otheraccounttoquery)&#34;</span>
</span></span></code></pre></div><h2 id="2021-04-04">2021-04-04</h2>
<ul>
<li>Check the index aliases on AReS Explorer to make sure they are sane before starting a new harvest:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/_alias/&#39;</span> | python -m json.tool | less
</span></span></code></pre></div><ul>
<li>Then set the <code>openrxv-items-final</code> index to read-only so we can make a backup:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items-final/_settings&#34;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;settings&#34;: {&#34;index.blocks.write&#34;: true}}&#39;</span>
</span></span><span style="display:flex;"><span>{&#34;acknowledged&#34;:true}%
</span></span><span style="display:flex;"><span>$ curl -s -X POST http://localhost:9200/openrxv-items-final/_clone/openrxv-items-final-backup
</span></span><span style="display:flex;"><span>{&#34;acknowledged&#34;:true,&#34;shards_acknowledged&#34;:true,&#34;index&#34;:&#34;openrxv-items-final-backup&#34;}%
</span></span><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items-final/_settings&#34;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;settings&#34;: {&#34;index.blocks.write&#34;: false}}&#39;</span>
</span></span></code></pre></div><ul>
<li>Then start a harvesting on AReS Explorer</li>
<li>Help Enrico get some 2020 statistics for the Roots, Tubers and Bananas (RTB) community on CGSpace
<ul>
<li>He was hitting <a href="https://github.com/ilri/OpenRXV/issues/66">a bug on AReS</a> and also he only needed stats for 2020, and AReS currently only gives all-time stats</li>
</ul>
</li>
<li>I cleaned up about 230 ISSNs on CGSpace in OpenRefine
<ul>
<li>I had exported them last week, then filtered for anything not looking like an ISSN with this GREL: <code>isNotNull(value.match(/^\p{Alnum}{4}-\p{Alnum}{4}$/))</code></li>
<li>Then I applied them on CGSpace with the <code>fix-metadata-values.py</code> script:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/fix-metadata-values.py -i /tmp/2021-04-01-ISSNs.csv -db dspace -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -f cg.issn -t <span style="color:#e6db74">&#39;correct&#39;</span> -m <span style="color:#ae81ff">253</span>
</span></span></code></pre></div><ul>
<li>For now I only fixed obvious errors like &ldquo;1234-5678.&rdquo; and &ldquo;e-ISSN: 1234-5678&rdquo; etc, but there are still lots of invalid ones which need more manual work:
<ul>
<li>Too few characters</li>
<li>Too many characters</li>
<li>ISBNs</li>
</ul>
</li>
<li>Create the CGSpace community and collection structure for the new Accelerating Impacts of CGIAR Climate Research for Africa (AICCRA) and assign all workflow steps</li>
</ul>
<h2 id="2021-04-05">2021-04-05</h2>
<ul>
<li>The AReS Explorer harvesting from yesterday finished, and the results look OK, but actually the Elasticsearch indexes are messed up again:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/_alias/&#39;</span> | python -m json.tool
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> &#34;openrxv-items-final&#34;: {
</span></span><span style="display:flex;"><span> &#34;aliases&#34;: {}
</span></span><span style="display:flex;"><span> },
</span></span><span style="display:flex;"><span> &#34;openrxv-items-temp&#34;: {
</span></span><span style="display:flex;"><span> &#34;aliases&#34;: {
</span></span><span style="display:flex;"><span> &#34;openrxv-items&#34;: {}
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> },
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><ul>
<li><code>openrxv-items</code> should be an alias of <code>openrxv-items-final</code>, not <code>openrxv-temp</code>&hellip; I will have to fix that manually</li>
<li>Enrico asked for more information on the RTB stats I gave him yesterday
<ul>
<li>I remembered (again) that we can&rsquo;t filter Atmire&rsquo;s CUA stats by date issued</li>
<li>To show, for example, views/downloads in the year 2020 for RTB issued in 2020, we would need to use the DSpace statistics API and post a list of IDs and a custom date range</li>
<li>I tried to do that here by exporting the RTB community and extracting the IDs for items issued in 2020:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ~/dspace63/bin/dspace metadata-export -i 10568/80100 -f /tmp/rtb.csv
</span></span><span style="display:flex;"><span>$ csvcut -c <span style="color:#e6db74">&#39;id,dcterms.issued,dcterms.issued[],dcterms.issued[en_US]&#39;</span> /tmp/rtb.csv | <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> sed &#39;1d&#39; | \
</span></span><span style="display:flex;"><span> csvsql --no-header --no-inference --query &#39;SELECT a AS id,COALESCE(b, &#34;&#34;)||COALESCE(c, &#34;&#34;)||COALESCE(d, &#34;&#34;) AS issued FROM stdin&#39; | \
</span></span><span style="display:flex;"><span> csvgrep -c issued -m 2020 | \
</span></span><span style="display:flex;"><span> csvcut -c id | \
</span></span><span style="display:flex;"><span> sed &#39;1d&#39; | \
</span></span><span style="display:flex;"><span> sort | \
</span></span><span style="display:flex;"><span> uniq
</span></span></code></pre></div><ul>
<li>So I remember in the future, this basically does the following:
<ul>
<li>Use csvcut to extract the id and all date issued columns from the CSV</li>
<li>Use sed to remove the header so we can refer to the columns using default a, b, c instead of their real names (which are tricky to match due to special characters)</li>
<li>Use csvsql to concatenate the various date issued columns (coalescing where null)</li>
<li>Use csvgrep to filter items by date issued in 2020</li>
<li>Use csvcut to extract the id column</li>
<li>Use sed to delete the header row</li>
<li>Use sort and uniq to filter out any duplicate IDs (there were three)</li>
</ul>
</li>
<li>Then I have a list of 296 IDs for RTB items issued in 2020</li>
<li>I constructed a JSON file to post to the DSpace Statistics API:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&#34;limit&#34;</span>: <span style="color:#ae81ff">100</span>,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&#34;page&#34;</span>: <span style="color:#ae81ff">0</span>,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&#34;dateFrom&#34;</span>: <span style="color:#e6db74">&#34;2020-01-01T00:00:00Z&#34;</span>,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&#34;dateTo&#34;</span>: <span style="color:#e6db74">&#34;2020-12-31T00:00:00Z&#34;</span>,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&#34;items&#34;</span>: [
</span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;00358715-b70c-4fdd-aa55-730e05ba739e&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;004b54bb-f16f-4cec-9fbc-ab6c6345c43d&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;02fb7630-d71a-449e-b65d-32b4ea7d6904&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">...</span>
</span></span><span style="display:flex;"><span> ]
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><ul>
<li>Then I submitted the file three times (changing the page parameter):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp &gt; /tmp/page1.json
</span></span><span style="display:flex;"><span>$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp &gt; /tmp/page2.json
</span></span><span style="display:flex;"><span>$ curl -s -d @/tmp/2020-items.txt https://cgspace.cgiar.org/rest/statistics/items | json_pp &gt; /tmp/page3.json
</span></span></code></pre></div><ul>
<li>Then I extracted the views and downloads in the most ridiculous way:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ grep views /tmp/page*.json | grep -o -E <span style="color:#e6db74">&#39;[0-9]+$&#39;</span> | sed <span style="color:#e6db74">&#39;s/,//&#39;</span> | xargs | sed -e <span style="color:#e6db74">&#39;s/ /+/g&#39;</span> | bc
</span></span><span style="display:flex;"><span>30364
</span></span><span style="display:flex;"><span>$ grep downloads /tmp/page*.json | grep -o -E <span style="color:#e6db74">&#39;[0-9]+,&#39;</span> | sed <span style="color:#e6db74">&#39;s/,//&#39;</span> | xargs | sed -e <span style="color:#e6db74">&#39;s/ /+/g&#39;</span> | bc
</span></span><span style="display:flex;"><span>9100
</span></span></code></pre></div><ul>
<li>For curiousity I did the same exercise for items issued in 2019 and got the following:
<ul>
<li>Views: 30721</li>
<li>Downloads: 10205</li>
</ul>
</li>
</ul>
<h2 id="2021-04-06">2021-04-06</h2>
<ul>
<li>Margarita from CCAFS was having problems deleting an item from CGSpace again
<ul>
<li>The error was &ldquo;Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:bd157345-448e &hellip;&rdquo;</li>
<li>This is the same issue as last month</li>
</ul>
</li>
<li>Create a new collection on CGSpace for a new CIP project at Mishel Portilla&rsquo;s request</li>
<li>I got a notice that CGSpace was down
<ul>
<li>I didn&rsquo;t see anything strange at first, but there are an insane amount of database connections:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>12413
</span></span></code></pre></div><ul>
<li>The system journal shows thousands of these messages in the system journal, this is the first one:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>Apr 06 07:52:13 linode18 tomcat7[556]: Apr 06, 2021 7:52:13 AM org.apache.tomcat.jdbc.pool.ConnectionPool abandon
</span></span></code></pre></div><ul>
<li>Around that time in the dspace log I see nothing unusual, but maybe these?</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>2021-04-06 07:52:29,409 INFO com.atmire.dspace.cua.CUASolrLoggerServiceImpl @ Updating : 200/127 docs in http://localhost:8081/solr/statistics
</span></span></code></pre></div><ul>
<li>(BTW what is the deal with the &ldquo;200/127&rdquo;? I should send a comment to Atmire)
<ul>
<li>I file a ticket with Atmire: <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-tickets">https://tracker.atmire.com/tickets-cgiar-ilri/view-tickets</a></li>
</ul>
</li>
<li>I restarted the PostgreSQL and Tomcat services and now I see less connections, but still WAY high:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>3640
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>2968
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>13
</span></span></code></pre></div><ul>
<li>After ten minutes or so it went back down&hellip;</li>
<li>And now it&rsquo;s back up in the thousands&hellip; I am seeing a lot of stuff in dspace log like this:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>2021-04-06 11:59:34,364 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717951
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717952
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717953
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717954
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717955
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717956
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717957
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717958
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717959
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717960
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717961
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717962
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717963
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717964
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717965
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717966
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717967
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717968
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717969
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717970
</span></span><span style="display:flex;"><span>2021-04-06 11:59:34,365 INFO org.dspace.content.MetadataValueServiceImpl @ user.hidden@cgiar.org:session_id=65F32E67CE8E347F64EFB5EB4E349B9B:delete_metadata_value: metadata_value_id=5717971
</span></span></code></pre></div><ul>
<li>I sent some notes and a log to Atmire on our existing issue about the database stuff
<ul>
<li>Also I asked them about the possibility of doing a formal review of Hibernate</li>
</ul>
</li>
<li>Falcon 3.0.0 was released so I updated the 3.0.0 branch for dspace-statistics-api and merged it to <code>v6_x</code>
<ul>
<li>I also fixed one minor (unrelated) bug in the tests</li>
<li>Then I deployed the new version on DSpace Test</li>
</ul>
</li>
<li>I had a meeting with Peter and Abenet about CGSpace TODOs</li>
<li>CGSpace went down again and the PostgreSQL locks are through the roof:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>12154
</span></span></code></pre></div><ul>
<li>I don&rsquo;t see any activity on REST API, but in the last four hours there have been 3,500 DSpace sessions:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># grep -a -E <span style="color:#e6db74">&#39;2021-04-06 (13|14|15|16|17):&#39;</span> /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -o -E <span style="color:#e6db74">&#39;session_id=[A-Z0-9]{32}&#39;</span> | sort | uniq | wc -l
</span></span><span style="display:flex;"><span>3547
</span></span></code></pre></div><ul>
<li>I looked at the same time of day for the past few weeks and it seems to be a normal number of sessions:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># <span style="color:#66d9ef">for</span> file in /home/cgspace.cgiar.org/log/dspace.log.2021-0<span style="color:#f92672">{</span>3,4<span style="color:#f92672">}</span>-*; <span style="color:#66d9ef">do</span> grep -a -E <span style="color:#e6db74">&#34;2021-0(3|4)-[0-9]{2} (13|14|15|16|17):&#34;</span> <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">&#34;</span> | grep -o -E <span style="color:#e6db74">&#39;session_id=[A-Z0-9]{32}&#39;</span> | sort | uniq | wc -l; <span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>3572
</span></span><span style="display:flex;"><span>4085
</span></span><span style="display:flex;"><span>3476
</span></span><span style="display:flex;"><span>3128
</span></span><span style="display:flex;"><span>2949
</span></span><span style="display:flex;"><span>2016
</span></span><span style="display:flex;"><span>1839
</span></span><span style="display:flex;"><span>4513
</span></span><span style="display:flex;"><span>3463
</span></span><span style="display:flex;"><span>4425
</span></span><span style="display:flex;"><span>3328
</span></span><span style="display:flex;"><span>2783
</span></span><span style="display:flex;"><span>3898
</span></span><span style="display:flex;"><span>3848
</span></span><span style="display:flex;"><span>7799
</span></span><span style="display:flex;"><span>255
</span></span><span style="display:flex;"><span>534
</span></span><span style="display:flex;"><span>2755
</span></span><span style="display:flex;"><span>599
</span></span><span style="display:flex;"><span>4463
</span></span><span style="display:flex;"><span>3547
</span></span></code></pre></div><ul>
<li>What about total number of sessions per day?</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># <span style="color:#66d9ef">for</span> file in /home/cgspace.cgiar.org/log/dspace.log.2021-0<span style="color:#f92672">{</span>3,4<span style="color:#f92672">}</span>-*; <span style="color:#66d9ef">do</span> echo <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">:&#34;</span>; grep -a -o -E <span style="color:#e6db74">&#39;session_id=[A-Z0-9]{32}&#39;</span> <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">&#34;</span> | sort | uniq | wc -l; <span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-03-28:
</span></span><span style="display:flex;"><span>11784
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-03-29:
</span></span><span style="display:flex;"><span>15104
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-03-30:
</span></span><span style="display:flex;"><span>19396
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-03-31:
</span></span><span style="display:flex;"><span>32612
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-01:
</span></span><span style="display:flex;"><span>26037
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-02:
</span></span><span style="display:flex;"><span>14315
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-03:
</span></span><span style="display:flex;"><span>12530
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-04:
</span></span><span style="display:flex;"><span>13138
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-05:
</span></span><span style="display:flex;"><span>16756
</span></span><span style="display:flex;"><span>/home/cgspace.cgiar.org/log/dspace.log.2021-04-06:
</span></span><span style="display:flex;"><span>12343
</span></span></code></pre></div><ul>
<li>So it&rsquo;s not the number of sessions&hellip; it&rsquo;s something with the workload&hellip;</li>
<li>I had to step away for an hour or so and when I came back the site was still down and there were still 12,000 locks
<ul>
<li>I restarted postgresql and tomcat7&hellip;</li>
</ul>
</li>
<li>The locks in PostgreSQL shot up again&hellip;</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>3447
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>3527
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>4582
</span></span></code></pre></div><ul>
<li>I don&rsquo;t know what the hell is going on, but the PostgreSQL connections and locks are way higher than ever before:</li>
</ul>
<p><img src="/cgspace-notes/2021/04/postgres_connections_cgspace-week.png" alt="PostgreSQL connections week">
<img src="/cgspace-notes/2021/04/postgres_locks_cgspace-week.png" alt="PostgreSQL locks week">
<img src="/cgspace-notes/2021/04/jmx_tomcat_dbpools-week.png" alt="Tomcat database pool"></p>
<ul>
<li>Otherwise, the number of DSpace sessions is completely normal:</li>
</ul>
<p><img src="/cgspace-notes/2021/04/jmx_dspace_sessions-week.png" alt="DSpace sessions"></p>
<ul>
<li>While looking at the nginx logs I see that MEL is trying to log into CGSpace&rsquo;s REST API and delete items:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>34.209.213.122 - - [06/Apr/2021:03:50:46 +0200] &#34;POST /rest/login HTTP/1.1&#34; 401 727 &#34;-&#34; &#34;MEL&#34;
</span></span><span style="display:flex;"><span>34.209.213.122 - - [06/Apr/2021:03:50:48 +0200] &#34;DELETE /rest/items/95f52bf1-f082-4e10-ad57-268a76ca18ec/metadata HTTP/1.1&#34; 401 704 &#34;-&#34; &#34;-&#34;
</span></span></code></pre></div><ul>
<li>I see a few of these per day going back several months
<ul>
<li>I sent a message to Salem and Enrico to ask if they know</li>
</ul>
</li>
<li>Also annoying, I see tons of what look like penetration testing requests from Qualys:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>2021-04-04 06:35:17,889 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:no DN found for user &#34;&#39;&gt;&lt;qss a=X158062356Y1_2Z&gt;
</span></span><span style="display:flex;"><span>2021-04-04 06:35:17,889 INFO org.dspace.authenticate.PasswordAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:authenticate:attempting password auth of user=&#34;&#39;&gt;&lt;qss a=X158062356Y1_2Z&gt;
</span></span><span style="display:flex;"><span>2021-04-04 06:35:17,890 INFO org.dspace.app.xmlui.utils.AuthenticationUtil @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:email=&#34;&#39;&gt;&lt;qss a=X158062356Y1_2Z&gt;, realm=null, result=2
</span></span><span style="display:flex;"><span>2021-04-04 06:35:18,145 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:auth:attempting trivial auth of user=was@qualys.com
</span></span><span style="display:flex;"><span>2021-04-04 06:35:18,519 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:failed_login:no DN found for user was@qualys.com
</span></span><span style="display:flex;"><span>2021-04-04 06:35:18,520 INFO org.dspace.authenticate.PasswordAuthentication @ anonymous:session_id=FF1E051BCA7D81CC5A807D85380D81E5:ip_addr=64.39.108.48:authenticate:attempting password auth of user=was@qualys.com
</span></span></code></pre></div><ul>
<li>I deleted the ilri/AReS repository on GitHub since we haven&rsquo;t updated it in two years
<ul>
<li>All development is happening in <a href="https://github.com/ilri/openRXV">https://github.com/ilri/openRXV</a> now</li>
</ul>
</li>
<li>10PM and the server is down again, with locks through the roof:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>12198
</span></span></code></pre></div><ul>
<li>I see that there are tons of PostgreSQL connections getting abandoned today, compared to very few in the past few weeks:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ journalctl -u tomcat7 --since<span style="color:#f92672">=</span>today | grep -c <span style="color:#e6db74">&#39;ConnectionPool abandon&#39;</span>
</span></span><span style="display:flex;"><span>1838
</span></span><span style="display:flex;"><span>$ journalctl -u tomcat7 --since<span style="color:#f92672">=</span>2021-03-20 --until<span style="color:#f92672">=</span>2021-04-05 | grep -c <span style="color:#e6db74">&#39;ConnectionPool abandon&#39;</span>
</span></span><span style="display:flex;"><span>3
</span></span></code></pre></div><ul>
<li>I even restarted the server and connections were low for a few minutes until they shot back up:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>13
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>8651
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>8940
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>10504
</span></span></code></pre></div><ul>
<li>I had to go to bed and I bet it will crash and be down for hours until I wake up&hellip;</li>
<li>What the hell is this user agent?</li>
</ul>
<pre tabindex="0"><code>54.197.119.143 - - [06/Apr/2021:19:18:11 +0200] &#34;GET /handle/10568/16499 HTTP/1.1&#34; 499 0 &#34;-&#34; &#34;GetUrl/1.0 wdestiny@umich.edu (Linux)&#34;
</code></pre><h2 id="2021-04-07">2021-04-07</h2>
<ul>
<li>CGSpace was still down from last night of course, with tons of database locks:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>12168
</span></span></code></pre></div><ul>
<li>I restarted the server again and the locks came back</li>
<li>Atmire responded to the message from yesterday
<ul>
<li>The noticed something in the logs about emails failing to be sent</li>
<li>There appears to be an issue sending mails on workflow tasks when a user in that group has an invalid email address:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>2021-04-01 12:45:11,414 WARN org.dspace.workflowbasic.BasicWorkflowServiceImpl @ a.akwarandu@cgiar.org:session_id=2F20F20D4A8C36DB53D42DE45DFA3CCE:notifyGroupofTask:cannot email user group_id=aecf811b-b7e9-4b6f-8776-3d372e6a048b workflow_item_id=33085\colon; Invalid Addresses (com.sun.mail.smtp.SMTPAddressFailedException\colon; 501 5.1.3 Invalid address
</span></span></code></pre></div><ul>
<li>The issue is not the named user above, but a member of the group&hellip;</li>
<li>And the group does have users with invalid email addresses (probably accounts created automatically after authenticating with LDAP):</li>
</ul>
<p><img src="/cgspace-notes/2021/04/group-invalid-email.png" alt="DSpace group"></p>
<ul>
<li>I extracted all the group IDs from recent logs that had users with invalid email addresses:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ grep -a -E <span style="color:#e6db74">&#39;email user group_id=\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b&#39;</span> /home/cgspace.cgiar.org/log/dspace.log.* | grep -o -E <span style="color:#e6db74">&#39;\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b&#39;</span> | sort | uniq
</span></span><span style="display:flex;"><span>0a30d6ae-74a6-4eee-a8f5-ee5d15192ee6
</span></span><span style="display:flex;"><span>1769137c-36d4-42b2-8fec-60585e110db7
</span></span><span style="display:flex;"><span>203c8614-8a97-4ac8-9686-d9d62cb52acc
</span></span><span style="display:flex;"><span>294603de-3d09-464e-a5b0-09e452c6b5ab
</span></span><span style="display:flex;"><span>35878555-9623-4679-beb8-bb3395fdf26e
</span></span><span style="display:flex;"><span>3d8a5efa-5509-4bf9-9374-2bc714aceb99
</span></span><span style="display:flex;"><span>4238208a-f848-47cb-9dd2-43f9f954a4af
</span></span><span style="display:flex;"><span>44939b84-1894-41e7-b3e6-8c8d1781057b
</span></span><span style="display:flex;"><span>49ba087e-75a3-45ce-805c-69eeda0f786b
</span></span><span style="display:flex;"><span>4a6606ce-0284-421d-bf80-4dafddba2d42
</span></span><span style="display:flex;"><span>527de6aa-9cd0-4988-bf5f-c9c92ba2ac10
</span></span><span style="display:flex;"><span>54cd1b16-65bf-4041-9d84-fb2ea3301d6d
</span></span><span style="display:flex;"><span>58982847-5f7c-4b8b-a7b0-4d4de702136e
</span></span><span style="display:flex;"><span>5f0b85be-bd23-47de-927d-bca368fa1fbc
</span></span><span style="display:flex;"><span>646ada17-e4ef-49f6-9378-af7e58596ce1
</span></span><span style="display:flex;"><span>7e2f4bf8-fbc9-4b2f-97a4-75e5427bef90
</span></span><span style="display:flex;"><span>8029fd53-f9f5-4107-bfc3-8815507265cf
</span></span><span style="display:flex;"><span>81faa934-c602-4608-bf45-de91845dfea7
</span></span><span style="display:flex;"><span>8611a462-210c-4be1-a5bb-f87a065e6113
</span></span><span style="display:flex;"><span>8855c903-ef86-433c-b0be-c12300eb0f84
</span></span><span style="display:flex;"><span>8c7ece98-3598-4de7-a885-d61fd033bea8
</span></span><span style="display:flex;"><span>8c9a0d01-2d12-4a99-84f9-cdc25ac072f9
</span></span><span style="display:flex;"><span>8f9f888a-b501-41f3-a462-4da16150eebf
</span></span><span style="display:flex;"><span>94168f0e-9f45-4112-ac8d-3ba9be917842
</span></span><span style="display:flex;"><span>96998038-f381-47dc-8488-ff7252703627
</span></span><span style="display:flex;"><span>9768f4a8-3018-44e9-bf58-beba4296327c
</span></span><span style="display:flex;"><span>9a99e8d2-558e-4fc1-8011-e4411f658414
</span></span><span style="display:flex;"><span>a34e6400-78ed-45c0-a751-abc039eed2e6
</span></span><span style="display:flex;"><span>a9da5af3-4ec7-4a9b-becb-6e3d028d594d
</span></span><span style="display:flex;"><span>abf5201c-8be5-4dee-b461-132203dd51cb
</span></span><span style="display:flex;"><span>adb5658c-cef3-402f-87b6-b498f580351c
</span></span><span style="display:flex;"><span>aecf811b-b7e9-4b6f-8776-3d372e6a048b
</span></span><span style="display:flex;"><span>ba5aae61-ea34-4ac1-9490-4645acf2382f
</span></span><span style="display:flex;"><span>bf7f3638-c7c6-4a8f-893d-891a6d3dafff
</span></span><span style="display:flex;"><span>c617ada0-09d1-40ed-b479-1c4860a4f724
</span></span><span style="display:flex;"><span>cff91d44-a855-458c-89e5-bd48c17d1a54
</span></span><span style="display:flex;"><span>e65171ae-a2bf-4043-8f54-f8457bc9174b
</span></span><span style="display:flex;"><span>e7098b40-4701-4ca2-b9a9-3a1282f67044
</span></span><span style="display:flex;"><span>e904f122-71dc-439b-b877-313ef62486d7
</span></span><span style="display:flex;"><span>ede59734-adac-4c01-8691-b45f19088d37
</span></span><span style="display:flex;"><span>f88bd6bb-f93f-41cb-872f-ff26f6237068
</span></span><span style="display:flex;"><span>f985f5fb-be5c-430b-a8f1-cf86ae4fc49a
</span></span><span style="display:flex;"><span>fe800006-aaec-4f9e-9ab4-f9475b4cbdc3
</span></span></code></pre></div><h2 id="2021-04-08">2021-04-08</h2>
<ul>
<li>I can&rsquo;t believe it but the server has been down for twelve hours or so
<ul>
<li>The locks have not changed since I went to bed last night:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>12070
</span></span></code></pre></div><ul>
<li>I restarted PostgreSQL and Tomcat and the locks go straight back up!</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>13
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>986
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>1194
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>1212
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>1489
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>2124
</span></span><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | wc -l
</span></span><span style="display:flex;"><span>5934
</span></span></code></pre></div><h2 id="2021-04-09">2021-04-09</h2>
<ul>
<li>Atmire managed to get CGSpace back up by killing all the PostgreSQL connections yesterday
<ul>
<li>I don&rsquo;t know how they did it&hellip;</li>
<li>They also think it&rsquo;s weird that restarting PostgreSQL didn&rsquo;t kill the connections</li>
<li>They asked some more questions, like for example if there were also issues on DSpace Test</li>
<li>Strangely enough, I checked DSpace Test and notice a clear spike in PostgreSQL locks on the morning of April 6th as well!</li>
</ul>
</li>
</ul>
<p><img src="/cgspace-notes/2021/04/postgres_locks_ALL-week-PROD.png" alt="PostgreSQL locks week CGSpace">
<img src="/cgspace-notes/2021/04/postgres_locks_ALL-week-TEST.png" alt="PostgreSQL locks week DSpace Test"></p>
<ul>
<li>I definitely need to look into that!</li>
</ul>
<h2 id="2021-04-11">2021-04-11</h2>
<ul>
<li>I am trying to resolve the AReS Elasticsearch index issues that happened last week
<ul>
<li>I decided to back up the <code>openrxv-items</code> index to <code>openrxv-items-backup</code> and then delete all the others:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items/_settings&#34;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;settings&#34;: {&#34;index.blocks.write&#34;: true}}&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-backup
</span></span><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items/_settings&#34;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;settings&#34;: {&#34;index.blocks.write&#34;: false}}&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-temp&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-final&#39;</span>
</span></span></code></pre></div><ul>
<li>Then I updated all Docker containers and rebooted the server (linode20) so that the correct indexes would be created again:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | xargs -L1 docker pull
</span></span></code></pre></div><ul>
<li>Then I realized I have to clone the backup index directly to <code>openrxv-items-final</code>, and re-create the <code>openrxv-items</code> alias:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-final&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items-backup/_settings&#34;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;settings&#34;: {&#34;index.blocks.write&#34;: true}}&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -s -X POST http://localhost:9200/openrxv-items-backup/_clone/openrxv-items-final
</span></span><span style="display:flex;"><span>$ curl -s -X POST <span style="color:#e6db74">&#39;http://localhost:9200/_aliases&#39;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;actions&#34; : [{&#34;add&#34; : { &#34;index&#34; : &#34;openrxv-items-final&#34;, &#34;alias&#34; : &#34;openrxv-items&#34;}}]}&#39;</span>
</span></span></code></pre></div><ul>
<li>Now I see both <code>openrxv-items-final</code> and <code>openrxv-items</code> have the current number of items:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items/_count?q=*&amp;pretty&#39;</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> &#34;count&#34; : 103373,
</span></span><span style="display:flex;"><span> &#34;_shards&#34; : {
</span></span><span style="display:flex;"><span> &#34;total&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0,
</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-final/_count?q=*&amp;pretty&#39;</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> &#34;count&#34; : 103373,
</span></span><span style="display:flex;"><span> &#34;_shards&#34; : {
</span></span><span style="display:flex;"><span> &#34;total&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0,
</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><ul>
<li>Then I started a fresh harvesting in the AReS Explorer admin dashboard</li>
</ul>
<h2 id="2021-04-12">2021-04-12</h2>
<ul>
<li>The harvesting on AReS finished last night, but the indexes got messed up again
<ul>
<li>I will have to fix them manually next time&hellip;</li>
</ul>
</li>
</ul>
<h2 id="2021-04-13">2021-04-13</h2>
<ul>
<li>Looking into the logs on 2021-04-06 on CGSpace and DSpace Test to see if there is anything specific that stands out about the activty on those days that would cause the PostgreSQL issues
<ul>
<li>Digging into the Munin graphs for the last week I found a few other things happening on that morning:</li>
</ul>
</li>
</ul>
<p><img src="/cgspace-notes/2021/04/sda-week.png" alt="/dev/sda disk latency week">
<img src="/cgspace-notes/2021/04/classes_unloaded-week.png" alt="JVM classes unloaded week">
<img src="/cgspace-notes/2021/04/nginx_status-week.png" alt="Nginx status week"></p>
<ul>
<li>13,000 requests in the last two months from a user with user agent <code>SomeRandomText</code>, for example:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>84.33.2.97 - - [06/Apr/2021:06:25:13 +0200] &#34;GET /bitstream/handle/10568/77776/CROP%20SCIENCE.jpg.jpg HTTP/1.1&#34; 404 10890 &#34;-&#34; &#34;SomeRandomText&#34;
</span></span></code></pre></div><ul>
<li>I purged them:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p
</span></span><span style="display:flex;"><span>Purging 13159 hits from SomeRandomText in statistics
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 13159
</span></span></code></pre></div><ul>
<li>I noticed there were 78 items submitted in the hour before CGSpace crashed:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># grep -a -E <span style="color:#e6db74">&#39;2021-04-06 0(6|7):&#39;</span> /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -c -a add_item
</span></span><span style="display:flex;"><span>78
</span></span></code></pre></div><ul>
<li>Of those 78, 77 of them were from Udana</li>
<li>Compared to other mornings (0 to 9 AM) this month that seems to be pretty high:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># <span style="color:#66d9ef">for</span> num in <span style="color:#f92672">{</span>01..13<span style="color:#f92672">}</span>; <span style="color:#66d9ef">do</span> grep -a -E <span style="color:#e6db74">&#34;2021-04-</span>$num<span style="color:#e6db74"> 0&#34;</span> /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a
</span></span><span style="display:flex;"><span> add_item; done
</span></span><span style="display:flex;"><span>32
</span></span><span style="display:flex;"><span>0
</span></span><span style="display:flex;"><span>0
</span></span><span style="display:flex;"><span>2
</span></span><span style="display:flex;"><span>8
</span></span><span style="display:flex;"><span>108
</span></span><span style="display:flex;"><span>4
</span></span><span style="display:flex;"><span>0
</span></span><span style="display:flex;"><span>29
</span></span><span style="display:flex;"><span>0
</span></span><span style="display:flex;"><span>1
</span></span><span style="display:flex;"><span>1
</span></span><span style="display:flex;"><span>2
</span></span></code></pre></div><h2 id="2021-04-15">2021-04-15</h2>
<ul>
<li>Release v1.4.2 of the DSpace Statistics API on GitHub: <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v1.4.2">https://github.com/ilri/dspace-statistics-api/releases/tag/v1.4.2</a>
<ul>
<li>This has been running on DSpace Test for the last week or so, and mostly contains the Falcon 3.0.0 changes</li>
</ul>
</li>
<li>Re-sync DSpace Test with data from CGSpace
<ul>
<li>Run system updates on DSpace Test (linode26) and reboot the server</li>
</ul>
</li>
<li>Update the PostgreSQL JDBC driver on DSpace Test (linode26) to 42.2.19
<ul>
<li>It has been a few months since we updated this, and there have been a few releases since 42.2.14 that we are currently using</li>
</ul>
</li>
<li>Create a test account for Rafael from Bioversity-CIAT to submit some items to DSpace Test:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ dspace user -a -m tip-submit@cgiar.org -g CIAT -s Submit -p <span style="color:#e6db74">&#39;fuuuuuuuu&#39;</span>
</span></span></code></pre></div><ul>
<li>I added the account to the Alliance Admins account, which is should allow him to submit to any Alliance collection
<ul>
<li>According to my notes from <a href="/cgspace-notes/2020-10/">2020-10</a> the account must be in the admin group in order to submit via the REST API</li>
</ul>
</li>
</ul>
<h2 id="2021-04-18">2021-04-18</h2>
<ul>
<li>Update all containers on AReS (linode20):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | xargs -L1 docker pull
</span></span></code></pre></div><ul>
<li>Then run all system updates and reboot the server</li>
<li>I learned a new command for Elasticsearch:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl http://localhost:9200/_cat/indices
</span></span><span style="display:flex;"><span>yellow open openrxv-values ChyhGwMDQpevJtlNWO1vcw 1 1 1579 0 537.6kb 537.6kb
</span></span><span style="display:flex;"><span>yellow open openrxv-items-temp PhV5ieuxQsyftByvCxzSIw 1 1 103585 104372 482.7mb 482.7mb
</span></span><span style="display:flex;"><span>yellow open openrxv-shared J_8cxIz6QL6XTRZct7UBBQ 1 1 127 0 115.7kb 115.7kb
</span></span><span style="display:flex;"><span>yellow open openrxv-values-00001 jAoXTLR0R9mzivlDVbQaqA 1 1 3903 0 696.2kb 696.2kb
</span></span><span style="display:flex;"><span>green open .kibana_task_manager_1 O1zgJ0YlQhKCFAwJZaNSIA 1 0 2 2 20.6kb 20.6kb
</span></span><span style="display:flex;"><span>yellow open openrxv-users 1hWGXh9kS_S6YPxAaBN8ew 1 1 5 0 28.6kb 28.6kb
</span></span><span style="display:flex;"><span>green open .apm-agent-configuration f3RAkSEBRGaxJZs3ePVxsA 1 0 0 0 283b 283b
</span></span><span style="display:flex;"><span>yellow open openrxv-items-final sgk-s8O-RZKdcLRoWt3G8A 1 1 970 0 2.3mb 2.3mb
</span></span><span style="display:flex;"><span>green open .kibana_1 HHPN7RD_T7qe0zDj4rauQw 1 0 25 7 36.8kb 36.8kb
</span></span><span style="display:flex;"><span>yellow open users M0t2LaZhSm2NrF5xb64dnw 1 1 2 0 11.6kb 11.6kb
</span></span></code></pre></div><ul>
<li>Somehow the <code>openrxv-items-final</code> index only has a few items and the majority are in <code>openrxv-items-temp</code>, via the <code>openrxv-items</code> alias (which is in the temp index):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items/_count?q=*&amp;pretty&#39;</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> &#34;count&#34; : 103585,
</span></span><span style="display:flex;"><span> &#34;_shards&#34; : {
</span></span><span style="display:flex;"><span> &#34;total&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0,
</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><ul>
<li>I found a cool tool to help with exporting and restoring Elasticsearch indexes:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --output<span style="color:#f92672">=</span>/home/aorth/openrxv-items_mapping.json --type<span style="color:#f92672">=</span>mapping
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --output<span style="color:#f92672">=</span>/home/aorth/openrxv-items_data.json --limit<span style="color:#f92672">=</span><span style="color:#ae81ff">1000</span> --type<span style="color:#f92672">=</span>data
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>Sun, 18 Apr 2021 06:27:07 GMT | Total Writes: 103585
</span></span><span style="display:flex;"><span>Sun, 18 Apr 2021 06:27:07 GMT | dump complete
</span></span></code></pre></div><ul>
<li>It took only two or three minutes to export everything&hellip;</li>
<li>I did a test to restore the index:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_mapping.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items-test --type<span style="color:#f92672">=</span>mapping
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_data.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items-test --limit <span style="color:#ae81ff">1000</span> --type<span style="color:#f92672">=</span>data
</span></span></code></pre></div><ul>
<li>So that&rsquo;s pretty cool!</li>
<li>I deleted the <code>openrxv-items-final</code> index and <code>openrxv-items-temp</code> indexes and then restored the mappings to <code>openrxv-items-final</code>, added the <code>openrxv-items</code> alias, and started restoring the data to <code>openrxv-items</code> with elasticdump:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-final&#39;</span>
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_mapping.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items-final --type<span style="color:#f92672">=</span>mapping
</span></span><span style="display:flex;"><span>$ curl -s -X POST <span style="color:#e6db74">&#39;http://localhost:9200/_aliases&#39;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;actions&#34; : [{&#34;add&#34; : { &#34;index&#34; : &#34;openrxv-items-final&#34;, &#34;alias&#34; : &#34;openrxv-items&#34;}}]}&#39;</span>
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_data.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --limit <span style="color:#ae81ff">1000</span> --type<span style="color:#f92672">=</span>data
</span></span></code></pre></div><ul>
<li>AReS seems to be working fine аfter that, so I created the <code>openrxv-items-temp</code> index and then started a fresh harvest on AReS Explorer:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -X PUT <span style="color:#e6db74">&#34;localhost:9200/openrxv-items-temp&#34;</span>
</span></span></code></pre></div><ul>
<li>Run system updates on CGSpace (linode18) and run the latest Ansible infrastructure playbook to update the DSpace Statistics API, PostgreSQL JDBC driver, etc, and then reboot the system</li>
<li>I wasted a bit of time trying to get TSLint and then ESLint running for OpenRXV on GitHub Actions</li>
</ul>
<h2 id="2021-04-19">2021-04-19</h2>
<ul>
<li>The AReS harvesting last night seems to have completed successfully, but the number of results is strange:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
</span></span><span style="display:flex;"><span>yellow open openrxv-items-temp kNUlupUyS_i7vlBGiuVxwg 1 1 103741 105553 483.6mb 483.6mb
</span></span><span style="display:flex;"><span>yellow open openrxv-items-final HFc3uytTRq2GPpn13vkbmg 1 1 970 0 2.3mb 2.3mb
</span></span></code></pre></div><ul>
<li>The indices endpoint doesn&rsquo;t include the <code>openrxv-items</code> alias, but it is currently in the <code>openrxv-items-temp</code> index so the number of items is the same:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items/_count?q=*&amp;pretty&#39;</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> &#34;count&#34; : 103741,
</span></span><span style="display:flex;"><span> &#34;_shards&#34; : {
</span></span><span style="display:flex;"><span> &#34;total&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;successful&#34; : 1,
</span></span><span style="display:flex;"><span> &#34;skipped&#34; : 0,
</span></span><span style="display:flex;"><span> &#34;failed&#34; : 0
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><ul>
<li>A user was having problems resetting their password on CGSpace, with some message about SMTP etc
<ul>
<li>I checked and we are indeed locked out of our mailbox:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ dspace test-email
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>Error sending email:
</span></span><span style="display:flex;"><span> - Error: javax.mail.SendFailedException: Send failure (javax.mail.AuthenticationFailedException: 550 5.2.1 Mailbox cannot be accessed [PR0P264CA0280.FRAP264.PROD.OUTLOOK.COM]
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><ul>
<li>I have to write to ICT&hellip;</li>
<li>I decided to switch back to the G1GC garbage collector on DSpace Test
<ul>
<li>Reading Shawn Heisy&rsquo;s discussion again: <a href="https://cwiki.apache.org/confluence/display/SOLR/ShawnHeisey">https://cwiki.apache.org/confluence/display/SOLR/ShawnHeisey</a></li>
<li>I am curious to check the JVM stats in a few days to see if there is a marked change</li>
</ul>
</li>
<li>Work on minor changes to get DSpace working on Ubuntu 20.04 for our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a></li>
</ul>
<h2 id="2021-04-21">2021-04-21</h2>
<ul>
<li>Send Abdullah feedback on the <a href="https://github.com/ilri/OpenRXV/pull/91">filter on click pull request</a> for OpenRXV
<ul>
<li>I see it adds a new &ldquo;allow filter on click&rdquo; checkbox in the layout settings, but it doesn&rsquo;t modify the filters</li>
<li>Also, it seems to have broken the existing clicking of the countries on the map</li>
</ul>
</li>
<li>Atmire recently sent feedback about the CUA duplicates processor
<ul>
<li>Last month when I ran it it got stuck on the storage reports, apparently, so I will try again (with a fresh Solr statistics core from production) and skip the storage reports (<code>-g</code>):</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code>$ export JAVA_OPTS=&#39;-Dfile.encoding=UTF-8 -Xmx2048m&#39;
$ cp atmire-cua-update.xml-20210124-132112.old /home/dspacetest.cgiar.org/config/spring/api/atmire-cua-update.xml
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -r 100 -c statistics -t 12 -g
</code></pre><ul>
<li>The first run processed 1,439 docs, the second run processed 0 docs
<ul>
<li>I&rsquo;m not sure if that means that it worked? I sent feedback to Atmire</li>
</ul>
</li>
<li>Meeting with Moayad to discuss OpenRXV development progress</li>
</ul>
<h2 id="2021-04-25">2021-04-25</h2>
<ul>
<li>The indexes on AReS are messed up again
<ul>
<li>I made a backup of the indexes, then deleted the <code>openrxv-items-final</code> and <code>openrxv-items-temp</code> indexes, re-created the <code>openrxv-items</code> alias, and restored the data into <code>openrxv-items</code>:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --output<span style="color:#f92672">=</span>/home/aorth/openrxv-items_mapping.json --type<span style="color:#f92672">=</span>mapping
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --output<span style="color:#f92672">=</span>/home/aorth/openrxv-items_data.json --limit<span style="color:#f92672">=</span><span style="color:#ae81ff">1000</span> --type<span style="color:#f92672">=</span>data
</span></span><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-temp&#39;</span>
</span></span><span style="display:flex;"><span>$ curl -XDELETE <span style="color:#e6db74">&#39;http://localhost:9200/openrxv-items-final&#39;</span>
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_mapping.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items-final --type<span style="color:#f92672">=</span>mapping
</span></span><span style="display:flex;"><span>$ curl -s -X POST <span style="color:#e6db74">&#39;http://localhost:9200/_aliases&#39;</span> -H <span style="color:#e6db74">&#39;Content-Type: application/json&#39;</span> -d<span style="color:#e6db74">&#39;{&#34;actions&#34; : [{&#34;add&#34; : { &#34;index&#34; : &#34;openrxv-items-final&#34;, &#34;alias&#34; : &#34;openrxv-items&#34;}}]}&#39;</span>
</span></span><span style="display:flex;"><span>$ elasticdump --input<span style="color:#f92672">=</span>/home/aorth/openrxv-items_data.json --output<span style="color:#f92672">=</span>http://localhost:9200/openrxv-items --limit <span style="color:#ae81ff">1000</span> --type<span style="color:#f92672">=</span>data
</span></span></code></pre></div><ul>
<li>Then I started a fresh AReS harvest</li>
</ul>
<h2 id="2021-04-26">2021-04-26</h2>
<ul>
<li>The AReS harvest last night seems to have finished successfully and the number of items looks good:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s http://localhost:9200/_cat/indices | grep openrxv-items
</span></span><span style="display:flex;"><span>yellow open openrxv-items-temp H-CGsyyLTaqAj6-nKXZ-7w 1 1 0 0 283b 283b
</span></span><span style="display:flex;"><span>yellow open openrxv-items-final ul3SKsa7Q9Cd_K7qokBY_w 1 1 103951 0 254mb 254mb
</span></span></code></pre></div><ul>
<li>And the aliases seem correct for once:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">&#39;http://localhost:9200/_alias/&#39;</span> | python -m json.tool
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span> &#34;openrxv-items-final&#34;: {
</span></span><span style="display:flex;"><span> &#34;aliases&#34;: {
</span></span><span style="display:flex;"><span> &#34;openrxv-items&#34;: {}
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> },
</span></span><span style="display:flex;"><span> &#34;openrxv-items-temp&#34;: {
</span></span><span style="display:flex;"><span> &#34;aliases&#34;: {}
</span></span><span style="display:flex;"><span> },
</span></span><span style="display:flex;"><span>...
</span></span></code></pre></div><ul>
<li>That&rsquo;s 250 new items in the index since the last harvest!</li>
<li>Re-create my local Artifactory container because I&rsquo;m getting errors starting it and it has been a few months since it was updated:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ podman rm artifactory
</span></span><span style="display:flex;"><span>$ podman pull docker.bintray.io/jfrog/artifactory-oss:latest
</span></span><span style="display:flex;"><span>$ podman create --ulimit nofile<span style="color:#f92672">=</span>32000:32000 --name artifactory -v artifactory_data:/var/opt/jfrog/artifactory -p 8081-8082:8081-8082 docker.bintray.io/jfrog/artifactory-oss
</span></span><span style="display:flex;"><span>$ podman start artifactory
</span></span></code></pre></div><ul>
<li>Start testing DSpace 7.0 Beta 5 so I can evaluate if it solves some of the problems we are having on DSpace 6, and if it&rsquo;s missing things like multiple handle resolvers, etc
<ul>
<li>I see it needs Java JDK 11, Tomcat 9, Solr 8, and PostgreSQL 11</li>
<li>Also, according to the <a href="https://wiki.lyrasis.org/display/DSDOC7x/Installing+DSpace">installation notes</a> I see you can install the old DSpace 6 REST API, so that&rsquo;s potentially useful for us</li>
<li>I see that all web applications on the backend are now rolled into just one &ldquo;server&rdquo; application</li>
<li>The build process took 11 minutes the first time (due to downloading the world with Maven) and ~2 minutes the second time</li>
<li>The <code>local.cfg</code> content and syntax is very similar DSpace 6</li>
</ul>
</li>
<li>I got the basic <code>fresh_install</code> up and running
<ul>
<li>Then I tried to import a DSpace 6 database from production</li>
</ul>
</li>
<li>I tried to delete all the Atmire SQL migrations:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace7b5= &gt; DELETE FROM schema_version WHERE description LIKE &#39;%Atmire%&#39; OR description LIKE &#39;%CUA%&#39; OR description LIKE &#39;%cua%&#39;;
</span></span></code></pre></div><ul>
<li>But I got an error when running <code>dspace database migrate</code>:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ~/dspace7b5/bin/dspace database migrate
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Database URL: jdbc:postgresql://localhost:5432/dspace7b5
</span></span><span style="display:flex;"><span>Migrating database to latest version... (Check dspace logs for details)
</span></span><span style="display:flex;"><span>Migration exception:
</span></span><span style="display:flex;"><span>java.sql.SQLException: Flyway migration error occurred
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:738)
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:632)
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:228)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
</span></span><span style="display:flex;"><span> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:273)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:129)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:94)
</span></span><span style="display:flex;"><span>Caused by: org.flywaydb.core.api.FlywayException: Validate failed:
</span></span><span style="display:flex;"><span>Detected applied migration not resolved locally: 5.0.2017.09.25
</span></span><span style="display:flex;"><span>Detected applied migration not resolved locally: 6.0.2017.01.30
</span></span><span style="display:flex;"><span>Detected applied migration not resolved locally: 6.0.2017.09.25
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span> at org.flywaydb.core.Flyway.doValidate(Flyway.java:292)
</span></span><span style="display:flex;"><span> at org.flywaydb.core.Flyway.access$100(Flyway.java:73)
</span></span><span style="display:flex;"><span> at org.flywaydb.core.Flyway$1.execute(Flyway.java:166)
</span></span><span style="display:flex;"><span> at org.flywaydb.core.Flyway$1.execute(Flyway.java:158)
</span></span><span style="display:flex;"><span> at org.flywaydb.core.Flyway.execute(Flyway.java:527)
</span></span><span style="display:flex;"><span> at org.flywaydb.core.Flyway.migrate(Flyway.java:158)
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:729)
</span></span><span style="display:flex;"><span> ... 9 more
</span></span></code></pre></div><ul>
<li>I deleted those migrations:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace7b5= &gt; DELETE FROM schema_version WHERE version IN (&#39;5.0.2017.09.25&#39;, &#39;6.0.2017.01.30&#39;, &#39;6.0.2017.09.25&#39;);
</span></span></code></pre></div><ul>
<li>Then when I ran the migration again it failed for a new reason, related to the configurable workflow:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>Database URL: jdbc:postgresql://localhost:5432/dspace7b5
</span></span><span style="display:flex;"><span>Migrating database to latest version... (Check dspace logs for details)
</span></span><span style="display:flex;"><span>Migration exception:
</span></span><span style="display:flex;"><span>java.sql.SQLException: Flyway migration error occurred
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:738)
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.updateDatabase(DatabaseUtils.java:632)
</span></span><span style="display:flex;"><span> at org.dspace.storage.rdbms.DatabaseUtils.main(DatabaseUtils.java:228)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
</span></span><span style="display:flex;"><span> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
</span></span><span style="display:flex;"><span> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:273)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:129)
</span></span><span style="display:flex;"><span> at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:94)
</span></span><span style="display:flex;"><span>Caused by: org.flywaydb.core.internal.command.DbMigrate$FlywayMigrateException:
</span></span><span style="display:flex;"><span>Migration V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql failed
</span></span><span style="display:flex;"><span>--------------------------------------------------------------------
</span></span><span style="display:flex;"><span>SQL State : 42P01
</span></span><span style="display:flex;"><span>Error Code : 0
</span></span><span style="display:flex;"><span>Message : ERROR: relation &#34;cwf_pooltask&#34; does not exist
</span></span><span style="display:flex;"><span> Position: 8
</span></span><span style="display:flex;"><span>Location : org/dspace/storage/rdbms/sqlmigration/postgres/V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql (/home/aorth/src/apache-tomcat-9.0.45/file:/home/aorth/dspace7b5/lib/dspace-api-7.0-beta5.jar!/org/dspace/storage/rdbms/sqlmigration/postgres/V7.0_2019.05.02__DS-4239-workflow-xml-migration.sql)
</span></span><span style="display:flex;"><span>Line : 16
</span></span><span style="display:flex;"><span>Statement : UPDATE cwf_pooltask SET workflow_id=&#39;defaultWorkflow&#39; WHERE workflow_id=&#39;default&#39;
</span></span><span style="display:flex;"><span>...
</span></span></code></pre></div><ul>
<li>The <a href="https://wiki.lyrasis.org/display/DSDOC7x/Upgrading+DSpace">DSpace 7 upgrade docs</a> say I need to apply these previously optional migrations:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ~/dspace7b5/bin/dspace database migrate ignored
</span></span></code></pre></div><ul>
<li>Now I see all migrations have completed and DSpace actually starts up fine!</li>
<li>I will try to do a full re-index to see how long it takes:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ time ~/dspace7b5/bin/dspace index-discovery -b
</span></span><span style="display:flex;"><span>...
</span></span><span style="display:flex;"><span>~/dspace7b5/bin/dspace index-discovery -b 25156.71s user 64.22s system 97% cpu 7:11:09.94 total
</span></span></code></pre></div><ul>
<li>Not good, that shit took almost seven hours!</li>
</ul>
<h2 id="2021-04-27">2021-04-27</h2>
<ul>
<li>Peter sent me a list of 500+ DOIs from CGSpace with no Altmetric score
<ul>
<li>I used csvgrep (with Windows encoding!) to extract those without our handle and save the DOIs to a text file, then got their handles with my <code>doi-to-handle.py</code> script:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ csvgrep -e <span style="color:#e6db74">&#39;windows-1252&#39;</span> -c <span style="color:#e6db74">&#39;Handle.net IDs&#39;</span> -i -m <span style="color:#e6db74">&#39;10568/&#39;</span> ~/Downloads/Altmetric<span style="color:#ae81ff">\ </span>-<span style="color:#ae81ff">\ </span>Research<span style="color:#ae81ff">\ </span>Outputs<span style="color:#ae81ff">\ </span>-<span style="color:#ae81ff">\ </span>CGSpace<span style="color:#ae81ff">\ </span>-<span style="color:#ae81ff">\ </span>2021-04-26.csv | csvcut -c DOI | sed <span style="color:#e6db74">&#39;1d&#39;</span> &gt; /tmp/dois.txt
</span></span><span style="display:flex;"><span>$ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -d
</span></span></code></pre></div><ul>
<li>He will Tweet them&hellip;</li>
</ul>
<h2 id="2021-04-28">2021-04-28</h2>
<ul>
<li>Grant some IWMI colleagues access to the Atmire Content and Usage stats on CGSpace</li>
</ul>
<!-- raw HTML omitted -->
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2022-06/">June, 2022</a></li>
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
<li><a href="/cgspace-notes/2022-02/">February, 2022</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>